AP Statistics: Hypothesis Testing for Proportions

Hypothesis testing is the statistical engine that drives decision-making from clinical trials to quality control. When you want to test a claim about a population proportion—like whether a new drug's success rate exceeds a placebo's, or if a new manufacturing process reduces the defect rate—you use the procedures in this guide. Mastering this topic transforms you from a passive data observer into an active statistical investigator, capable of using sample evidence to draw conclusions about the wider world.

The Foundation: Stating Hypotheses

Every hypothesis test begins with a clear, contradictory pair of statements about a population parameter. Here, that parameter is the population proportion, denoted by $p$ .

The null hypothesis ( $H_{0}$ ) is a statement of "no effect," "no difference," or the status quo. It always contains an equality ( $=$ , $\leq$ , or $\geq$ ). For a proportion, it is stated as $H_{0} : p = p_{0}$ , where $p_{0}$ is the specific numerical value being tested.
The alternative hypothesis ( $H_{a}$ or $H_{1}$ ) is what you seek evidence for. It is a statement of change, difference, or effect.

You must correctly identify whether the test is one-sided or two-sided. A one-sided test (or one-tailed test) looks for evidence of a change in one specific direction ( $p > p_{0}$ or $p < p_{0}$ ). A two-sided test (or two-tailed test) looks for evidence of a change in either direction ( $p \neq p_{0}$ ). The research question dictates the choice. For example, testing if a new teaching method increases pass rates requires a one-sided test ( $H_{a} : p > p_{0}$ ). Testing if the proportion of defective parts is different from the standard requires a two-sided test ( $H_{a} : p \neq p_{0}$ ).

Conditions for a Valid Test

Before any calculations, you must verify that the mathematical model is appropriate. For a one-sample z-test for a proportion, three conditions must be met:

Random: The sample data must come from a well-designed random sample or randomized experiment. This is essential for generalizing to the population.
10% Condition: The sample size $n$ must be no more than 10% of the population size when sampling without replacement. This ensures independence between sample observations.
Large Counts Condition: This checks the normality of the sampling distribution. You must expect at least 10 successes and 10 failures if the null hypothesis is true. That is, verify that both $n p_{0} \geq 10$ and $n (1 - p_{0}) \geq 10$ .

Failing to check these conditions is a critical error. If they are not met, the resulting p-value and conclusion may be invalid.
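The 10% and Large Counts checks are simple arithmetic, so they can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions`, its parameters, and the population size of 50,000 are assumptions made for the example, and the Random condition still has to be judged from how the data were collected.

```python
def check_conditions(n, p0, population_size=None, is_random=True):
    """Check the three conditions for a one-sample z-test for a proportion.

    is_random cannot be computed; it records your own judgment of the study design.
    population_size is optional; without it the 10% condition is left as None.
    """
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return {
        "random": is_random,
        # 10% condition: sample is at most 10% of the population
        "ten_percent": None if population_size is None else n <= 0.10 * population_size,
        # Large Counts: at least 10 expected successes and failures under H0
        "large_counts": expected_successes >= 10 and expected_failures >= 10,
    }


# Hypothetical population of 50,000 voters, sample of 400, null value 0.60
print(check_conditions(n=400, p0=0.60, population_size=50_000))
```

Note that the counts use $p_{0}$ , not the sample proportion, because the test is carried out under the assumption that $H_{0}$ is true.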

The Mechanics: Test Statistic and P-Value

With conditions satisfied, you calculate a test statistic, which measures how far your sample result is from the null hypothesis value, in units of standard error. For a proportion, this is the z-test statistic:

$z = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} (1 - p_{0})}{n}}}$

Here, $\hat{p}$ is the sample proportion (your observed statistic), $p_{0}$ is the hypothesized population proportion from $H_{0}$ , and $n$ is the sample size. The denominator is the standard error of $\hat{p}$ assuming the null hypothesis is true.

This z-score tells you how many standard errors your sample proportion lies from the null value. A large absolute value of $z$ provides evidence against $H_{0}$ .
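As a rough sketch, the test statistic can be computed directly from its definition. The function name `z_statistic` is an assumption for illustration; the numbers are those of the voter poll example used in this guide (220 supporters out of 400, null value 0.60).

```python
import math


def z_statistic(p_hat, p0, n):
    """z-test statistic for one proportion; the standard error uses p0, not p_hat."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = z_statistic(p_hat=0.55, p0=0.60, n=400)
print(round(z, 2))  # -2.04
```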

The p-value is the probability, computed assuming $H_{0}$ is true, of obtaining a sample statistic at least as extreme as the one you actually observed, in the direction specified by $H_{a}$ .

For $H_{a} : p > p_{0}$ , the p-value is $P (Z \geq z)$ .
For $H_{a} : p < p_{0}$ , the p-value is $P (Z \leq z)$ .
For $H_{a} : p \neq p_{0}$ , the p-value is $2 \times P (Z \geq |z|)$ .

You find this probability using the standard Normal distribution (Table A, or technology). A small p-value means your sample result would be very unlikely to occur if the null hypothesis were true, thus casting doubt on $H_{0}$ .
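If technology is allowed, the three cases above can be sketched with Python's standard library, whose `statistics.NormalDist` class provides the standard Normal CDF. The function name `p_value` and the labels `"greater"`, `"less"`, and `"two-sided"` are assumptions chosen for this illustration.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """p-value for a z statistic under the given alternative hypothesis."""
    cdf = NormalDist().cdf  # standard Normal: mean 0, standard deviation 1
    if alternative == "greater":      # Ha: p > p0
        return 1 - cdf(z)
    if alternative == "less":         # Ha: p < p0
        return cdf(z)
    if alternative == "two-sided":    # Ha: p != p0
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(p_value(-2.04, "less"))       # roughly 0.021, one tail
print(p_value(-2.04, "two-sided"))  # roughly 0.041, both tails
```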

Making a Decision and Stating a Conclusion

You compare the p-value to a predetermined significance level, $α$ (commonly 0.05).

If p-value $\leq α$ , you reject the null hypothesis ( $H_{0}$ ).
If p-value $> α$ , you fail to reject the null hypothesis ( $H_{0}$ ). You never "accept" $H_{0}$ ; you simply lack sufficient evidence against it.

Your conclusion must always be stated in context, using non-technical language related to the original research question. It should seamlessly integrate the decision, the alternative hypothesis, and the context.

Example: A campaign manager claims 60% of voters support her candidate. A poll of 400 random voters finds 220 supporters ( $\hat{p} = 0.55$ ). Test the claim at the $α = 0.05$ level.

$H_{0} : p = 0.60$ , $H_{a} : p \neq 0.60$ (A "claim" test is typically two-sided unless specified).
Conditions: Random sample, 400 < 10% of all voters, $n p_{0} = 400 (0.6) = 240$ and $n (1 - p_{0}) = 400 (0.4) = 160$ are both $\geq 10$ . ✔
$z = \frac{0.55 - 0.60}{\sqrt{\frac{0.60 (0.40)}{400}}} = \frac{-0.05}{0.0245} \approx -2.04$
P-value (two-tailed): $2 \times P (Z \leq - 2.04) = 2 (0.0207) = 0.0414$
Decision: Since $0.0414 < 0.05$ , we reject $H_{0}$ .
Conclusion: There is statistically significant evidence at the 0.05 level that the true proportion of voters who support the candidate is different from 0.60.
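The whole example can be reproduced in one short script. This is a minimal sketch, not a standard routine: the function name `one_proportion_z_test` and its return format are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided", alpha=0.05):
    """Run a one-sample z-test for a proportion and return the key quantities."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "less":
        p = cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return {"p_hat": p_hat, "z": z, "p_value": p, "reject_h0": p <= alpha}


result = one_proportion_z_test(successes=220, n=400, p0=0.60)
print(result)
if result["reject_h0"]:
    print("Significant evidence that the true proportion of supporters differs from 0.60.")
else:
    print("Not enough evidence that the true proportion of supporters differs from 0.60.")
```

Because the script does not round $z$ to $-2.04$ before looking up the probability, its p-value is about 0.041 and differs slightly in later decimals from the table-based 0.0414. The decision is the same.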

Connection to Confidence Intervals

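There is a close correspondence between a two-sided hypothesis test at significance level $α$ and a confidence interval with confidence level $C = 1 - α$ . For proportions the match is approximate, because the test's standard error uses $p_{0}$ while the interval's uses $\hat{p}$ , so the two can occasionally disagree in borderline cases. In general: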

If the null value $p_{0}$ is contained within a $C = 1 - α$ confidence interval for $p$ , then you will fail to reject $H_{0} : p = p_{0}$ at level $α$ .
If the null value $p_{0}$ is not contained within the interval, you will reject $H_{0}$ .

In our voter example, a 95% CI for $p$ is $0.55 \pm 1.96 \sqrt{(0.55) (0.45) / 400} \approx (0.501, 0.599)$ . The null value of 0.60 is not inside this interval (0.599 is just below it), which is consistent with our decision to reject $H_{0}$ . The interval provides the additional information of which plausible values for $p$ are supported by the data.
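The interval in the voter example can be checked with a short sketch. The function name `proportion_confidence_interval` is an assumption for illustration; the critical value comes from `NormalDist().inv_cdf` rather than being hard-coded as 1.96.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """One-sample z-interval for a proportion; the standard error uses p_hat."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_confidence_interval(p_hat=0.55, n=400)
print(round(low, 3), round(high, 3))  # 0.501 0.599
print(low <= 0.60 <= high)            # False: 0.60 is outside the interval
```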

Common Pitfalls

Misstating the Alternative Hypothesis: Let the research question, not the sample data, dictate $H_{a}$ . If you want to know if a new method is better, it's one-sided ( $>$ ). If you want to know if it's different, it's two-sided ( $\neq$ ). Seeing that $\hat{p} < p_{0}$ in your sample does not mean you should use $<$ for $H_{a}$ .
Using the Wrong Standard Error: In the formula for the z-test statistic, the standard error in the denominator uses the null hypothesis proportion $p_{0}$ , not the sample proportion $\hat{p}$ . Using $\hat{p}$ here is a common calculation error. (Note: $\hat{p}$ is used when constructing a confidence interval, but not in the test statistic under $H_{0}$ ).
Forgetting the "Double" for Two-Tailed P-values: When performing a two-sided test, the p-value is the probability in both tails. A frequent mistake is to report only the area in one tail, effectively cutting the p-value in half and making results appear more significant than they are.
Conclusion Omissions: A conclusion that states only "reject $H_{0}$ " or gives a p-value without context is incomplete. You must state that there is/is not significant evidence for the alternative hypothesis and relate it back to the topic (e.g., "for the candidate's support level," "for the defect rate").
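To see how much the two calculation pitfalls matter, the sketch below reruns the voter example both ways. The variable names are assumptions for illustration only.

```python
import math
from statistics import NormalDist

p_hat, p0, n = 0.55, 0.60, 400

# Pitfall: wrong standard error. The test statistic should use p0.
z_correct = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
z_wrong = (p_hat - p0) / math.sqrt(p_hat * (1 - p_hat) / n)  # uses p_hat by mistake

# Pitfall: forgetting to double. A two-sided p-value counts both tails.
one_tail = NormalDist().cdf(-abs(z_correct))
two_sided = 2 * one_tail

print(round(z_correct, 2), round(z_wrong, 2))  # -2.04 -2.01
print(one_tail, two_sided)  # the one-tail area is half the correct p-value
```

In this example both versions of $z$ lead to the same decision, but with a sample result closer to the cutoff the wrong standard error or a missing factor of 2 could flip the conclusion.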

Summary

Hypothesis testing for proportions is a formal process to evaluate a claim about a population proportion $p$ using sample data. It begins by stating a null hypothesis ( $H_{0} : p = p_{0}$ ) and an alternative hypothesis ( $H_{a} : p > p_{0}$ , $p < p_{0}$ , or $p \neq p_{0}$ ).
The validity of the test depends on three conditions: Random sampling, the 10% Condition, and the Large Counts Condition ( $n p_{0} \geq 10$ and $n (1 - p_{0}) \geq 10$ ).
The z-test statistic $z = \frac{\hat{p} - p_{0}}{\sqrt{p_{0} (1 - p_{0}) / n}}$ measures the discrepancy in standard errors. The p-value quantifies the strength of the evidence against $H_{0}$ based on this statistic.
A decision is made by comparing the p-value to $α$ . The conclusion must be stated clearly in the context of the original problem.
A two-sided hypothesis test at significance level $α$ will typically reach the same conclusion as checking whether the null value $p_{0}$ falls inside a $C = 1 - α$ confidence interval for $p$ .
