AP Statistics – Hypothesis Testing & Inference
AP Statistics | Class Notes | February 18, 2025
1. The Logic of Hypothesis Testing
• Hypothesis testing answers: 'Could this sample result have happened by chance, or does it reflect a real
population effect?'
• We assume the null hypothesis (H₀) is true and ask: how likely is our sample result under that assumption?
• If the result is very unlikely under H₀ (p ≤ α), we reject H₀ in favour of the alternative (Hₐ).
• This is analogous to a courtroom: H₀ = 'innocent until proven guilty.' We need evidence beyond reasonable
doubt (α level) to convict.
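A minimal simulation sketch of this logic, assuming a hypothetical coin-flip scenario (60 heads in 100 flips, H₀: p = 0.5) that is not from the notes. It assumes H₀ is true, generates many samples under that assumption, and counts how often the result is at least as extreme as the one observed. The function name and all numbers are illustrative.

```python
import random

def simulated_p_value(observed_heads, n_flips, p_null=0.5, n_sims=10_000, seed=0):
    """Estimate P(heads >= observed_heads) assuming H0: p = p_null is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Simulate one sample of n_flips under the null hypothesis.
        heads = sum(rng.random() < p_null for _ in range(n_flips))
        if heads >= observed_heads:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims

# Hypothetical observation: 60 heads in 100 flips; Ha: p > 0.5.
p_value = simulated_p_value(observed_heads=60, n_flips=100)
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"simulated p-value = {p_value:.4f} -> {decision}")
```

The simulated value is only an estimate and changes slightly with the seed; the exact binomial tail probability for this scenario is roughly 0.03.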
2. Setting Up Hypotheses
• H₀ (null): always a statement of equality. Example: H₀: μ = 50, or H₀: p = 0.40.
• Hₐ (alternative): the claim we want to test. Can be one-tailed (< or >) or two-tailed (≠).
• Two-tailed example: H₀: μ = 50 vs. Hₐ: μ ≠ 50 (testing if mean differs in either direction).
• One-tailed example: H₀: μ = 50 vs. Hₐ: μ > 50 (testing if mean is specifically higher). H₀ still uses "=".
• Always set up hypotheses BEFORE collecting data to avoid bias.
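The choice of tail changes the p-value. The sketch below, using a hypothetical z statistic of 1.80, shows how the same test statistic gives different p-values for one-tailed and two-tailed alternatives. The helper name `p_value_from_z` is illustrative, not a standard function.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z statistic; tail is 'less', 'greater', or 'two-sided'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "less":         # Ha: parameter < null value
        return std_normal.cdf(z)
    if tail == "greater":      # Ha: parameter > null value
        return 1 - std_normal.cdf(z)
    if tail == "two-sided":    # Ha: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'less', 'greater', or 'two-sided'")

z = 1.80  # hypothetical test statistic
one_tailed = p_value_from_z(z, "greater")
two_tailed = p_value_from_z(z, "two-sided")
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

With these numbers the one-tailed p-value falls below 0.05 and the two-tailed one does not, which is why the direction of Hₐ has to be fixed before looking at the data.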
3. The Four Steps of a Significance Test (AP Format)
• Step 1 – State: Write H₀ and Hₐ in context. Define the parameter (e.g. μ = true mean score of all students).
• Step 2 – Plan: Identify the correct test (z-test, t-test, chi-square). Check conditions (Random, Normal,
Independent).
• Step 3 – Do: Calculate the test statistic and p-value.
• Step 4 – Conclude: Compare p to α. State conclusion in context — never just say 'reject H₀.'
• Example conclusion: 'Since p = 0.023 < α = 0.05, we reject H₀. We have convincing evidence that the true
mean score is greater than 50.'
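A sketch of State, Plan, Do, Conclude for a one-proportion z-test. The data (230 of 500 students supporting a proposal, p₀ = 0.40) are hypothetical and chosen only for illustration. The Random and Independent conditions depend on how the data were collected, so the code can only check the counts.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: 230 of 500 randomly sampled students support a proposal.
n, successes = 500, 230
p0, alpha = 0.40, 0.05

# State: H0: p = 0.40 vs. Ha: p > 0.40, where p = true proportion who support it.
# Plan: one-proportion z-test; check that expected counts are at least 10.
large_counts_ok = n * p0 >= 10 and n * (1 - p0) >= 10

# Do: test statistic and one-tailed p-value.
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)

# Conclude: compare p to alpha and answer in context.
if p_value < alpha:
    conclusion = "convincing evidence that the true proportion is greater than 0.40"
else:
    conclusion = "not convincing evidence that the true proportion is greater than 0.40"
print(f"z = {z:.2f}, p = {p_value:.4f}: {conclusion}")
```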
4. Test Statistics & Distributions
• z-test for means: z = (x̄ – μ₀) / (σ/√n). Use when σ is known and the sampling distribution of x̄ is approximately Normal (Normal population or n ≥ 30).
• t-test for means: t = (x̄ – μ₀) / (s/√n) with df = n – 1. Use when σ is unknown.
• z-test for proportions: z = (p̂ – p₀) / √(p₀(1–p₀)/n). Condition: np₀ ≥ 10 and n(1–p₀) ≥ 10.
• Chi-square: χ² = Σ (O – E)² / E. Used for categorical data (goodness of fit or independence).
• The t-distribution has heavier tails than the z-distribution and approaches z as df → ∞.
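The formulas above can be sketched as small Python functions. The function names and the example numbers are hypothetical; only the formulas come from the notes.

```python
from math import sqrt

def z_stat_mean(x_bar, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)); sigma is the known population SD."""
    return (x_bar - mu0) / (sigma / sqrt(n))

def t_stat_mean(x_bar, mu0, s, n):
    """Return (t, df) with t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    return (x_bar - mu0) / (s / sqrt(n)), n - 1

def z_stat_prop(p_hat, p0, n):
    """z for one proportion; raises if the expected counts are below 10."""
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1 - p0) >= 10")
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def chi_square_stat(observed, expected):
    """Chi-square = sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical numbers, chosen only to exercise each formula.
z_mean = z_stat_mean(x_bar=52, mu0=50, sigma=10, n=25)
t, df = t_stat_mean(x_bar=53, mu0=50, s=6, n=16)
z_prop = z_stat_prop(p_hat=0.45, p0=0.40, n=600)
chi2 = chi_square_stat(observed=[18, 22, 20], expected=[20, 20, 20])
print(z_mean, t, df, z_prop, chi2)
```

Turning t or χ² into a p-value needs the t or chi-square distribution (a table, a calculator, or a statistics library), which this sketch leaves out.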
5. p-Values & Significance Levels
• The p-value is the probability of observing a test statistic at least as extreme as ours, assuming H₀ is true.
• Common α levels: 0.10 (lenient), 0.05 (standard), 0.01 (strict).
• Smaller p → stronger evidence against H₀, but statistical significance ≠ practical significance.
• A study with n = 100,000 might detect a trivially small effect that is statistically significant but meaningless in
practice.
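A quick numerical sketch of the large-sample point, assuming a hypothetical study with n = 100,000, x̄ = 50.1, μ₀ = 50 and known σ = 10 (none of these values come from the notes). The difference is only 0.01 standard deviations, yet the test is highly significant.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical study: huge sample, tiny difference from the null value.
n, x_bar, mu0, sigma = 100_000, 50.1, 50.0, 10.0

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value
effect_size = (x_bar - mu0) / sigma           # difference measured in SDs

print(f"z = {z:.2f}, p = {p_value:.4f}, effect size = {effect_size:.3f} SD")
```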
6. Type I and Type II Errors
• Type I Error (α): Reject H₀ when it is true. (False positive — convicting an innocent person.)
• Type II Error (β): Fail to reject H₀ when it is false. (False negative — acquitting a guilty person.)
• Power = 1 – β = probability of correctly detecting a real effect.
• Increasing n increases power, which lowers β. It does not change the Type I error rate, which stays at the α we choose.
• Reducing α (e.g. from 0.05 to 0.01) decreases Type I errors but, with n and the effect size held fixed, increases Type II errors; there is a trade-off.
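A sketch of how power responds to n and α, assuming a one-sided z-test with known σ. The scenario (μ₀ = 50, true mean 52, σ = 10) and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha):
    """Power of the test H0: mu = mu0 vs. Ha: mu > mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # centre of z when mu = mu_true
    return 1 - std_normal.cdf(z_crit - shift)    # P(reject H0 | mu = mu_true)

results = {}
for n, alpha in [(25, 0.05), (100, 0.05), (100, 0.01)]:
    power = power_one_sided_z(mu0=50, mu_true=52, sigma=10, n=n, alpha=alpha)
    results[(n, alpha)] = power
    print(f"n = {n:3d}, alpha = {alpha}: power = {power:.3f}, beta = {1 - power:.3f}")
```

In this setup, raising n from 25 to 100 at α = 0.05 raises power, while tightening α to 0.01 at n = 100 lowers it again.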
7. Confidence Intervals vs. Hypothesis Tests
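• A 95% confidence interval for a mean leads to the same decision as a two-tailed hypothesis test at α = 0.05. (For proportions the interval uses p̂ in the standard error and the test uses p₀, so the two can occasionally disagree.)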
• If a 95% CI for μ does not contain μ₀, we would reject H₀: μ = μ₀ at α = 0.05.
• Confidence intervals give more information (the range of plausible values), not just a binary reject/fail
decision.
• For AP FRQ: always interpret CI as 'We are 95% confident that the true parameter lies between ___ and ___.'
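A sketch checking the link between a 95% interval and a two-tailed test for a mean. To stay within the Python standard library it assumes σ is known, so it uses z rather than t. The numbers (x̄ = 52, σ = 10, n = 100, μ₀ = 50) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample summary with known population SD.
x_bar, sigma, n, mu0 = 52.0, 10.0, 100, 50.0
alpha = 0.05

std_normal = NormalDist()
se = sigma / sqrt(n)                        # standard error of x_bar
z_star = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96

# 95% confidence interval for mu.
ci_low, ci_high = x_bar - z_star * se, x_bar + z_star * se

# Two-tailed test of H0: mu = mu0.
z = (x_bar - mu0) / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))

ci_excludes_mu0 = not (ci_low <= mu0 <= ci_high)
test_rejects = p_value < alpha
print(f"CI = ({ci_low:.2f}, {ci_high:.2f}), p = {p_value:.4f}")
print(f"CI excludes mu0: {ci_excludes_mu0}, test rejects H0: {test_rejects}")
```

The two booleans agree here because the interval and the test use the same standard error and the same critical value.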
8. Practice Problems
• Q1: A sample of 36 students has x̄ = 72, s = 12. Test H₀: μ = 70 vs. Hₐ: μ > 70 at α = 0.05.
• Q2: A poll shows 48% support out of n = 400. Test if this differs from 50% at α = 0.05.
• Q3: You reduce α from 0.05 to 0.01. How does this affect Type I and Type II error rates?
• Q4: State in plain English what it means if p = 0.002 in the context of Q1 above.
Personal study notes – not for redistribution.