S371: Lab 9

Lab Instructor: Katya Baldina

2023-11-29

Announcement

TEST III OPENED TODAY AT 12:45 PM AND IS DUE TOMORROW AT 11:59 PM!

No office hours during the test. If you find anything weird about the test, feel free to email me or Prof. Schultz.

Test III: Multiple Choice

Canvas does not allow you to go back to a previous question and change your answer.

• You can only re-enter all of your answers in the multiple-choice section from the beginning.

• So jot down your answers to the multiple-choice questions on a piece of paper in case you need to change any of them.

Concepts

You have to know:

To do a proportion test, you need to know the sample size and the count of observed occurrences (refer to Lab 10).

You need to know how to find the standard error for a proportion (Lecture 19).
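As a minimal sketch of what those two inputs look like in R, the block below runs a one-sample proportion test with base R's `prop.test()` and computes the standard error by hand. The counts (`x = 45` occurrences out of `n = 100`) and the null value `p0 = 0.5` are made-up numbers for illustration, not data from the course.

```r
# Hypothetical data: 45 observed occurrences in a sample of 100
x  <- 45
n  <- 100
p0 <- 0.5   # hypothesized proportion under the null (assumed for illustration)

p_hat  <- x / n
se_hat <- sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion

# One-sample proportion test without continuity correction,
# so its chi-square statistic equals the squared z statistic under the null
res <- prop.test(x = x, n = n, p = p0, correct = FALSE)
z   <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

p_hat
se_hat
res$p.value
```

Note that the test statistic uses the null value `p0` in its standard error, while `se_hat` uses the sample proportion, as in a confidence interval.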

The pooled proportion is the average of the two sample proportions, weighted by sample size: \(\hat{p}=\frac{x_1+x_2}{n_1+n_2}\), where \(x_1\) and \(x_2\) are the counts of observed occurrences. It is used in the pooled two-proportion z-test to estimate the common population proportion under the null hypothesis that the two population proportions are equal. (Lecture 20)
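A short sketch of the pooled proportion in R, assuming two hypothetical samples of unequal size (the counts are invented for illustration). It shows that pooling all occurrences gives the same number as the sample-size-weighted average of the two sample proportions.

```r
# Hypothetical counts for two samples (not course data)
x1 <- 30; n1 <- 100   # occurrences and size, sample 1
x2 <- 90; n2 <- 200   # occurrences and size, sample 2

p1 <- x1 / n1
p2 <- x2 / n2

# Pooled proportion: all occurrences over all observations
p_pool <- (x1 + x2) / (n1 + n2)

# Same value written as a sample-size-weighted average of p1 and p2
p_weighted <- (n1 * p1 + n2 * p2) / (n1 + n2)

# Pooled standard error and z statistic for H0: p1 = p2
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z <- (p1 - p2) / se_pool

c(p_pool = p_pool, p_weighted = p_weighted, z = z)
```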

You have to know how to find the approximate IQR from a boxplot and infer the shape of the distribution (Lecture 4).
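One way to practice reading a boxplot is to compare it with the numbers R uses to draw it. The sketch below uses a made-up sample and base R's `boxplot.stats()`; the skewness check at the end is a rough rule of thumb for this illustration, not a formal test.

```r
# Hypothetical sample with a long right tail
x <- c(2, 3, 3, 4, 5, 5, 6, 8, 12, 20)

# boxplot.stats() returns the five numbers a boxplot draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker
b <- boxplot.stats(x)$stats
lower_hinge <- b[2]
med         <- b[3]
upper_hinge <- b[4]

# Approximate IQR = width of the box
iqr_box <- upper_hinge - lower_hinge

# Rough shape check: a median closer to the lower edge of the box
# suggests right skew
right_skew_hint <- (med - lower_hinge) < (upper_hinge - med)

iqr_box
right_skew_hint
```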

Concepts

Standard errors you need to know:

Means one-sample test (Lecture 13): \(\frac{\sigma}{\sqrt{n}}\) (use \(s\) in place of \(\sigma\) when the population standard deviation is unknown)

Proportions one-sample test (Lecture 19): \(\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Mean two-sample test (test of difference between two means, Lecture 15): \(\sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\)

Difference in proportions confidence interval (Lecture 20): \(\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)

Difference in proportions hypothesis test (Lecture 20): \(\sqrt{\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})}\), where \(\hat{p}\) is the pooled proportion
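The five formulas above can be written as small R helpers to check hand calculations. This is only a sketch: the function names are invented for illustration and are not part of the course materials or of any R package.

```r
# Standard error helpers (illustrative names, not from the course)

# One-sample mean: s is the standard deviation, n the sample size
se_mean_one <- function(s, n) s / sqrt(n)

# One-sample proportion
se_prop_one <- function(p_hat, n) sqrt(p_hat * (1 - p_hat) / n)

# Difference between two means (unpooled)
se_mean_two <- function(s1, n1, s2, n2) sqrt(s1^2 / n1 + s2^2 / n2)

# Difference in proportions, confidence interval (unpooled)
se_prop_ci <- function(p1, n1, p2, n2) {
  sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
}

# Difference in proportions, hypothesis test (pooled);
# x1 and x2 are the counts of observed occurrences
se_prop_test <- function(x1, n1, x2, n2) {
  p_pool <- (x1 + x2) / (n1 + n2)
  sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
}

# Example calls with made-up inputs
se_mean_one(s = 10, n = 25)
se_prop_one(p_hat = 0.5, n = 100)
se_mean_two(s1 = 3, n1 = 9, s2 = 4, n2 = 16)
```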

Hypothesis testing

We never accept either the null or the alternative hypothesis. Because there are so many possibilities in the real world, there is no way to show empirically that something is true (Lecture 13).

Instead, we show that some statements are false. Statements can be inconsistent with the data. We reject hypotheses that are inconsistent with observed facts.

In statistics, we reject the null hypothesis when the p-value falls below the chosen significance level.

We support the alternative hypothesis by rejecting the null.

We never confirm or accept the alternative hypothesis.

In a hypothesis test, we always test the null hypothesis, not the alternative hypothesis.
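The wording of the conclusion can be sketched as a tiny R function. `decide()` is a hypothetical helper written for this illustration; the point is that neither branch says "accept".

```r
# Hypothetical helper: turn a p-value into the wording used in this lab
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "Reject the null hypothesis"
  } else {
    "Fail to reject the null hypothesis"
  }
}

decide(0.015)  # p-value below alpha
decide(0.20)   # p-value above alpha
```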

Correlation (Lecture 24)

Refer to 8.3 in Lecture 24 for how to interpret the magnitude of r.

Correlation is not causation!
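For practice, r can be computed in R with `cor()`. The sketch below uses five made-up pairs and also computes r from its definition to show that the two agree. A large r here would still say nothing about whether x causes y.

```r
# Hypothetical paired measurements
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Pearson correlation coefficient from base R
r <- cor(x, y)

# The same quantity from its definition
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

r
r_manual
```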

Word problem example:

A school nurse wants to determine whether age is a factor in whether children choose a healthy snack after school. She conducts a survey of 300 middle school students, with the results below. Test at α=.05 the claim that the proportion who choose a healthy snack differs by grade level. Use the critical value method.

##                 6th 7th 8th Total
## Healthy snack    31  43  51   125
## Unhealthy snack  69  57  49   175
## Total           100 100 100   300

Expected values are calculated as \(\frac{\text{row total} \times \text{column total}}{\text{grand total}}\):

##                    6th    7th    8th  Total
## Healthy snack    41.67  41.67  41.67 125.00
## Unhealthy snack  58.33  58.33  58.33 175.00
## Total           100.00 100.00 100.00 300.00
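The expected counts can be reproduced in R from the observed table. This is a sketch using base R only; the object names `observed` and `expected` are chosen here for illustration.

```r
# Observed counts from the survey table
observed <- matrix(c(31, 43, 51,
                     69, 57, 49),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Healthy snack", "Unhealthy snack"),
                                   c("6th", "7th", "8th")))

# Expected count for each cell = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

round(expected, 2)
```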

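Each cell's contribution to the chi-square statistic is \(\frac{(O-E)^2}{E}\), where \(O\) is the observed count and \(E\) is the expected count: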

##                    6th    7th    8th
## Healthy snack   2.7431 0.0430 2.0910
## Unhealthy snack 1.9500 0.0300 1.4930
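A sketch of how the cell contributions could be computed in R. The name `final` is assumed here to be the matrix of cell contributions, and `observed` and `expected` are illustrative names. Because this version uses unrounded expected counts, the later decimals may not match the table above exactly.

```r
# Observed counts from the survey table
observed <- matrix(c(31, 43, 51,
                     69, 57, 49),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Healthy snack", "Unhealthy snack"),
                                   c("6th", "7th", "8th")))

# Expected counts under independence of snack choice and grade
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Assumed name: `final` holds each cell's (O - E)^2 / E
final <- (observed - expected)^2 / expected

round(final, 4)
```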

To find the chi-square statistic, we sum all the cells:

sum(final)
## [1] 8.3501

From R (or from a chi-square table), the p-value with \(df = (2-1)(3-1) = 2\) is:

pchisq(q = sum(final),df=2, lower.tail = FALSE)
## [1] 0.01537442
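As a cross-check, base R's `chisq.test()` runs the whole test from the observed table in one call. This is a sketch with an illustrative object name (`observed`). It works from unrounded expected counts, so its statistic and p-value can differ slightly in the later decimals from the hand calculation.

```r
# Observed counts from the survey table
observed <- matrix(c(31, 43, 51,
                     69, 57, 49),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Healthy snack", "Unhealthy snack"),
                                   c("6th", "7th", "8th")))

# Chi-square test of independence between snack choice and grade
res <- chisq.test(observed)

res$statistic   # chi-square statistic
res$parameter   # degrees of freedom: (rows - 1) * (columns - 1)
res$p.value     # upper-tail p-value
```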

The p-value is less than \(\alpha\) = 0.05, so we reject the null hypothesis that the proportion who choose a healthy snack is the same in every grade.
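Since the problem asks for the critical value method, the same decision can be sketched by comparing the test statistic with the chi-square critical value from `qchisq()`. The variable names are illustrative; the statistic is the value reported above. With 2 degrees of freedom and \(\alpha\) = 0.05 the critical value is about 5.99.

```r
alpha <- 0.05
df    <- (2 - 1) * (3 - 1)   # (rows - 1) * (columns - 1)

# Critical value: the point with area alpha in the upper tail
critical_value <- qchisq(1 - alpha, df = df)

test_statistic <- 8.3501     # chi-square statistic reported above

critical_value
test_statistic > critical_value   # TRUE means we reject the null hypothesis
```

Both methods always agree: the statistic exceeds the critical value exactly when the p-value is below \(\alpha\).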