STATISTICS
Complete Beginner's Study Guide
Confidence Intervals • Hypothesis Testing • ANOVA • Chi-Square
With Solved Examples · Easy to Hard · Exam Hints & Formula Shortcuts
TABLE OF CONTENTS
■ Chapter 1: Confidence Interval Estimation
■ Chapter 2: Hypothesis Testing — One-Sample Tests
■ Chapter 3: Two-Sample Tests
■ Chapter 4: One-Way ANOVA
■ Chapter 5: Chi-Square Tests
■ Chapter 6: Exam Keyword → Formula Cheat Sheet
Statistics Study Guide | Page 1
CHAPTER 1: CONFIDENCE INTERVAL ESTIMATION
What Is a Confidence Interval?
A confidence interval (CI) is a range of values, computed from sample data, that is likely to contain
the true population parameter (like the mean or proportion). Instead of saying "the average is exactly 50",
we say "we are 95% confident the average is between 47 and 53."
■ Key Concept
95% CI means: If we repeated the study many times and built an interval each time, about 95 out of
every 100 of those intervals would contain the true value.
Wider interval: More confidence but less precision.
Narrower interval: Less confidence but more precision.
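The "repeat the study many times" idea can be checked with a small simulation. This is only a sketch: the population values (mean 150, SD 12), the sample size of 36, the number of repetitions, and the function name coverage are illustrative assumptions, not part of the guide.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=150.0, sigma=12.0, n=36, confidence=0.95, reps=2000, seed=1):
    """Fraction of simulated Z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

print(coverage())  # should land close to 0.95
```

Raising confidence to 0.99 widens each interval and pushes the printed fraction toward 0.99, which is the "more confidence, less precision" trade-off described above.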
Core Formulas
CI for Mean (population SD known — use Z)
CI = x̄ ± Z* × (σ / √n)
x̄ = sample mean
Z* = critical value (1.645 for 90%, 1.96 for 95%, 2.576 for 99%)
σ = population standard deviation
n = sample size
CI for Mean (population SD unknown — use t)
CI = x̄ ± t* × (s / √n)
s = sample standard deviation
t* = t-critical value with df = n − 1 (from t-table)
When to use t: σ is unknown and s is used in its place (this matters most when n < 30)
CI for Proportion
CI = p̂ ± Z* × √(p̂(1−p̂) / n)
p̂ = sample proportion (number of successes / n)
Use Z*: 1.96 for 95% CI (most common)
Quick Reference: Critical Values
Confidence Level α Z* (two-tail)
90% 0.10 1.645
95% 0.05 1.960
99% 0.01 2.576
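The critical values in this table do not have to be memorized blindly; they can be reproduced from the standard normal distribution. A minimal sketch using Python's standard library (the helper name z_critical is an assumption made for illustration):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed critical value Z* for a given confidence level."""
    alpha = 1 - confidence
    # Put alpha/2 in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z* = {z_critical(level):.3f}")
```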
EXAMPLE 1 (Easy) — CI for Mean with Z
A company measures the weight of 36 bags of chips. The sample mean is 150g. The population SD is
known to be 12g. Construct a 95% confidence interval for the true mean weight.
Step 1 Identify values: x̄ = 150, σ = 12, n = 36, Z* = 1.96 (for 95%)
Step 2 Calculate SE (Standard Error): SE = σ/√n = 12/√36 = 12/6 = 2
Step 3 Calculate margin of error: ME = Z* × SE = 1.96 × 2 = 3.92
Step 4 CI = x̄ ± ME = 150 ± 3.92 → (146.08 , 153.92)
Answer: We are 95% confident the true mean weight is between 146.08g and 153.92g.
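The four steps of Example 1 can be sketched as one small function. The function name z_interval and its parameter names are assumptions made for illustration; the numbers come from the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population SD is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / sqrt(n)
    margin = z_star * standard_error
    return x_bar - margin, x_bar + margin

low, high = z_interval(x_bar=150, sigma=12, n=36)
print(f"({low:.2f}, {high:.2f})")
```

The code uses the unrounded Z* (about 1.95996) rather than 1.96, so results agree with the hand calculation to two decimals here but may differ in later digits.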
EXAMPLE 2 (Hard) — CI for Mean with t + Proportion
A random sample of 16 students scored: Mean = 72, s = 8. Population SD is unknown. Construct a 95%
CI for the true mean score. Also, if 10 out of 16 students passed, find the 95% CI for the pass proportion.
Part A — CI for Mean (t-distribution)
Step 1 σ is unknown, n=16 < 30 → use t-distribution
Step 2 df = n−1 = 16−1 = 15; t* = 2.131 (from t-table, df=15, 95%)
Step 3 SE = s/√n = 8/√16 = 8/4 = 2
Step 4 ME = t* × SE = 2.131 × 2 = 4.262
Step 5 CI = 72 ± 4.262 → (67.74 , 76.26)
Answer: 95% CI for mean score: (67.74, 76.26)
Part B — CI for Proportion
Step 1 p̂ = 10/16 = 0.625; n = 16; Z* = 1.96
Step 2 SE = √(p̂(1−p̂)/n) = √(0.625×0.375/16) = √(0.01465) = 0.121
Step 3 ME = 1.96 × 0.121 = 0.237
Step 4 CI = 0.625 ± 0.237 → (0.388 , 0.862)
Answer: 95% CI for pass rate: (38.8%, 86.2%)
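Both parts of Example 2 can be sketched in code. Python's standard library has no t-distribution, so this sketch assumes the t* value is read from a t-table and passed in by hand; the function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def t_interval(x_bar, s, n, t_star):
    """CI for a mean with sigma unknown; t_star comes from a t-table (df = n - 1)."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

def proportion_interval(successes, n, confidence=0.95):
    """CI for a proportion using the normal (Z) approximation."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Part A: x_bar = 72, s = 8, n = 16, t* = 2.131 (df = 15, 95%)
print(t_interval(72, 8, 16, t_star=2.131))
# Part B: 10 passes out of 16
print(proportion_interval(10, 16))
```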
CHAPTER 2: HYPOTHESIS TESTING — ONE-SAMPLE TESTS
What Is Hypothesis Testing?
Hypothesis testing is a method to make decisions about a population using sample data. We start with
an assumption (null hypothesis) and check if data gives enough evidence to reject it.
■ Key Concepts
H₀ (Null Hypothesis): The "no effect / no difference" claim. Example: µ = 50
H₁ (Alternative Hyp.): The claim we are looking for evidence to support. Example: µ ≠ 50 or µ > 50 or µ < 50
p-value: Probability of getting results at least as extreme as those observed, assuming H₀ is true.
α (significance level): Usually 0.05. If p-value < α → Reject H₀.
Steps for ANY Hypothesis Test
Step 1 State H₀ and H₁
Step 2 Choose α (significance level, usually 0.05)
Step 3 Calculate the test statistic (Z or t)
Step 4 Find p-value OR compare test stat with critical value
Step 5 Decision: If p-value < α → Reject H₀. Otherwise → Fail to Reject H₀
Step 6 Conclusion in plain English
One-Sample Z-Test (σ known)
Z = (x̄ − µ₀) / (σ / √n)
µ₀ = the claimed/hypothesized population mean
Use when: σ is known (and the population is normal or n is large, n ≥ 30)
One-Sample t-Test (σ unknown)
t = (x̄ − µ₀) / (s / √n) df = n − 1
Use when: σ is unknown (especially important when n < 30)
df = degrees of freedom = n − 1
Types of Hypothesis Tests (Tails)
Type          H₁ looks like   Reject H₀ if
Two-tailed    µ ≠ µ₀          |Z| > Z* or p < α
Right-tailed  µ > µ₀          Z > Z* or p < α
Left-tailed   µ < µ₀          Z < −Z* or p < α
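The rejection rules in this table can be written as a single decision function. This is a sketch under the assumption that the critical value is always given as a positive number from the table; the function name reject_h0 and the tail labels are illustrative.

```python
def reject_h0(stat, critical, tail):
    """Apply the rejection rule for a Z or t statistic.

    critical is the positive table value (for example 1.96);
    tail is "two", "right", or "left".
    """
    if tail == "two":
        return abs(stat) > critical
    if tail == "right":
        return stat > critical
    if tail == "left":
        return stat < -critical
    raise ValueError("tail must be 'two', 'right', or 'left'")

print(reject_h0(2.0, 1.96, "two"))     # True: |2.0| > 1.96
print(reject_h0(-1.0, 1.645, "left"))  # False: -1.0 is not below -1.645
```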
EXAMPLE 3 (Easy) — One-Sample Z Test
A factory claims its bolts have a mean length of 5 cm (µ₀ = 5). A sample of 49 bolts shows x̄ = 5.1 cm.
The population SD σ = 0.35 cm. At α = 0.05, is there enough evidence to reject the factory's claim?
(Two-tailed test)
Step 1 H₀: µ = 5 | H₁: µ ≠ 5 (two-tailed)
Step 2 α = 0.05; Critical Z* = ±1.96
Step 3 Z = (x̄ − µ₀)/(σ/√n) = (5.1 − 5)/(0.35/√49) = 0.1/(0.35/7) = 0.1/0.05 = 2.0
Step 4 |Z| = 2.0 > 1.96 (critical value) → Reject H₀
Answer: There is sufficient evidence to reject the claim. The data suggest the mean bolt length differs from 5 cm.
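Example 3 uses the critical-value route. The p-value route from Step 4 of the general procedure leads to the same decision, as the sketch below suggests. The function name one_sample_z is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar, mu0, sigma, n):
    """Return the Z statistic and its two-tailed p-value."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Two-tailed: double the area beyond |z| in one tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = one_sample_z(x_bar=5.1, mu0=5, sigma=0.35, n=49)
print(f"Z = {z:.2f}, p = {p:.4f}")  # p is below 0.05, so reject the null
```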
EXAMPLE 4 (Hard) — One-Sample t Test
A diet company claims its program reduces weight by more than 5 kg on average. A sample of 20
participants lost: x̄ = 5.8 kg, s = 1.5 kg. Test at α = 0.05 if the claim is supported. (Right-tailed test)
Step 1 H₀: µ ≤ 5 | H₁: µ > 5 (right-tailed)
Step 2 n=20, σ unknown → t-test. df = 20−1 = 19. t* = 1.729 (one-tail, α=0.05, df=19)
Step 3 t = (x̄ − µ₀)/(s/√n) = (5.8 − 5)/(1.5/√20) = 0.8/(1.5/4.472) = 0.8/0.3354 = 2.385
Step 4 t = 2.385 > t* = 1.729 → Reject H₀
Answer: There is sufficient evidence that the diet reduces weight by MORE than 5 kg on average.
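The t statistic in Example 4 can be recomputed without rounding the intermediate steps. In this sketch the critical value 1.729 is taken from the example's t-table lookup and entered by hand; the function name one_sample_t is an illustrative assumption.

```python
from math import sqrt

def one_sample_t(x_bar, mu0, s, n):
    """Return the t statistic and its degrees of freedom."""
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1

t, df = one_sample_t(x_bar=5.8, mu0=5, s=1.5, n=20)
T_STAR = 1.729  # table value: one-tail, alpha = 0.05, df = 19
print(f"t = {t:.3f}, df = {df}, reject H0: {t > T_STAR}")
```

Carrying full precision gives t ≈ 2.385; hand calculations that round the standard error early can differ in the third decimal, which does not change the decision.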
CHAPTER 3: TWO-SAMPLE TESTS
What Are Two-Sample Tests?
We compare two groups to see if their means (or proportions) are different. Example: Do men and
women differ in average salary? Do two drugs differ in effectiveness?
■ Independent vs Paired
Independent samples: Two completely separate groups. Example: Group A gets Drug X, Group B gets
Drug Y.
Paired samples: Same people measured twice (before/after). Example: weight before and after diet.
Independent Two-Sample t-Test
t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)
H₀: µ₁ = µ₂ (no difference between groups)
H₁: µ₁ ≠ µ₂ or µ₁ > µ₂ or µ₁ < µ₂
df (approx): Use smaller of n₁−1 and n₂−1 (conservative) or Welch formula
Paired t-Test (Same subjects, two measurements)
t = d̄ / (s_d / √n) where d̄ = mean of the (x₁ − x₂) differences
d_i = difference for each pair = x₁ᵢ − x₂ᵢ
d̄ = mean of all differences
s_d = standard deviation of the differences
df = n − 1 (n = number of pairs)
EXAMPLE 5 (Easy) — Independent Two-Sample t-Test
Group A (n₁=10): x̄₁=78, s₁=5. Group B (n₂=10): x̄₂=74, s₂=6. Test if Group A scores significantly
higher at α=0.05. (Right-tailed)
Step 1 H₀: µ₁ = µ₂ | H₁: µ₁ > µ₂
Step 2 df = min(10−1, 10−1) = 9; t* = 1.833 (one-tail, α=0.05, df=9)
Step 3 t = (78−74)/√(25/10 + 36/10) = 4/√(2.5+3.6) = 4/√6.1 = 4/2.4698 = 1.620
Step 4 t = 1.620 < t* = 1.833 → Fail to Reject H₀
Answer: Not enough evidence that Group A scores significantly higher than Group B.
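Example 5 uses the conservative df. The "Welch formula" mentioned earlier is not written out in this guide; the sketch below assumes the standard Welch-Satterthwaite approximation, df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁−1) + (s₂²/n₂)²/(n₂−1)]. The function name welch_t is an illustrative assumption.

```python
from math import sqrt

def welch_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic with conservative and Welch degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # variance of each sample mean
    t = (x1 - x2) / sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

t, df_c, df_w = welch_t(78, 5, 10, 74, 6, 10)
print(f"t = {t:.3f}, conservative df = {df_c}, Welch df = {df_w:.2f}")
```

For this data the Welch df comes out between 17 and 18, larger than the conservative value of 9, so the conservative choice demands a bigger t before rejecting.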
EXAMPLE 6 (Hard) — Paired t-Test
A trainer records weights (kg) of 5 clients BEFORE and AFTER a 6-week program. Test if the program
caused significant weight loss at α=0.05. (One-tailed: after < before, i.e. µ_d > 0 with d = Before − After)
Client Before After d = Before−After
1 85 82 3
2 90 86 4
3 78 75 3
4 95 91 4
5 88 84 4
Mean d̄ = 3.6
Step 1 H₀: µ_d = 0 | H₁: µ_d > 0 (before > after means weight was lost)
Step 2 d values: 3, 4, 3, 4, 4. d̄ = (3+4+3+4+4)/5 = 18/5 = 3.6
Step 3 s_d = √[Σ(d−d̄)²/(n−1)]. Deviations: −0.6, 0.4, −0.6, 0.4, 0.4 → Σ(dev²) =
0.36+0.16+0.36+0.16+0.16 = 1.20. s_d = √(1.20/4) = √0.30 = 0.5477
Step 4 t = d̄/(s_d/√n) = 3.6/(0.5477/√5) = 3.6/(0.5477/2.236) = 3.6/0.2449 = 14.70
Step 5 df=4; t*=2.132 (one-tail, α=0.05, df=4). t=14.70 >> 2.132 → Reject H₀
Answer: Strong evidence of significant weight loss after the program!
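The paired calculation in Example 6 can be sketched directly from the Before/After columns. The function name paired_t is an assumption made for illustration; statistics.stdev is the sample standard deviation (it divides by n − 1), matching the s_d formula in Step 3.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic using d = before - after for each subject."""
    diffs = [b - a for b, a in zip(before, after)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample SD of the differences
    t = d_bar / (s_d / sqrt(len(diffs)))
    return d_bar, s_d, t, len(diffs) - 1

before = [85, 90, 78, 95, 88]
after = [82, 86, 75, 91, 84]
d_bar, s_d, t, df = paired_t(before, after)
print(f"d_bar = {d_bar}, s_d = {s_d:.3f}, t = {t:.2f}, df = {df}")
```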
CHAPTER 4: ONE-WAY ANOVA (Analysis of Variance)
What Is ANOVA?
ANOVA tests whether 3 or more group means are equal. Instead of doing multiple t-tests (each extra test
raises the chance of a false positive, i.e. a Type I error), ANOVA does it in one test.
■ Key Concepts
H₀: All group means are equal: µ₁ = µ₂ = µ₃ = ...
H₁: At least one group mean is different
Test Statistic: F-ratio = variation BETWEEN groups / variation WITHIN groups
Reject H₀ if: F > F_critical (from F-table) or p-value < α
ANOVA Formulas
F = MSB / MSW
SSB (Between): Σ nⱼ(x̄ⱼ − x̄_grand)² [How far group means are from grand mean]
SSW (Within): ΣΣ (xᵢⱼ − x̄ⱼ)² [How far each value is from its group mean]
MSB: SSB / (k−1) [k = number of groups]
MSW: SSW / (N−k) [N = total number of observations]
df Between: k − 1
df Within: N − k
ANOVA Table Structure
Source SS df MS = SS/df F = MSB/MSW
Between Groups SSB k−1 MSB F
Within Groups SSW N−k MSW —
Total SST N−1 — —
EXAMPLE 7 (Easy) — One-Way ANOVA
Three teaching methods are tested on students. Scores: Method A: 70, 75, 80 | Method B: 85, 90, 95 |
Method C: 60, 65, 70. Test at α=0.05 if the methods differ significantly.
Step 1 Group means: x̄_A = 75, x̄_B = 90, x̄_C = 65. Grand mean = (225+270+195)/9 =
690/9 = 76.67
Step 2 SSB = n[(x̄_A−x̄)² + (x̄_B−x̄)² + (x̄_C−x̄)²] =
3[(75−76.67)²+(90−76.67)²+(65−76.67)²] = 3[2.78+177.78+136.11] (using the unrounded grand mean) = 3×316.67 = 950.00
Step 3 SSW: A:(70−75)²+(75−75)²+(80−75)²=50. B:(85−90)²+(90−90)²+(95−90)²=50.
C:(60−65)²+(65−65)²+(70−65)²=50. SSW=150
Step 4 df_B=3−1=2, df_W=9−3=6. MSB=950.00/2=475.00. MSW=150/6=25.0
Step 5 F = 475.00/25.0 = 19.00. F_critical(2,6, α=0.05) ≈ 5.14
Answer: F=19.00 > 5.14 → Reject H₀. The teaching methods have significantly different mean scores!
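The ANOVA arithmetic in Example 7 is easy to get slightly wrong by rounding the grand mean early, so a short check in code can help. This is a sketch; the function name one_way_anova is an assumption, and the F critical value still has to come from an F-table.

```python
from statistics import mean

def one_way_anova(groups):
    """Return SSB, SSW, and the F ratio for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    # Between: each group mean's distance from the grand mean, weighted by group size.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: each value's distance from its own group mean.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return ssb, ssw, msb / msw

ssb, ssw, f_ratio = one_way_anova([[70, 75, 80], [85, 90, 95], [60, 65, 70]])
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}, F = {f_ratio:.2f}")
```

With the unrounded grand mean, SSB comes out as 950.00, so small differences from a hand calculation are due to rounding of intermediate values.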
CHAPTER 5: CHI-SQUARE TESTS
Two Types of Chi-Square Tests
Test Type             Purpose                                     Data Type
Goodness of Fit       Does data follow an expected distribution?  One categorical variable
Test of Independence  Are two categorical variables related?      Two categorical variables (contingency table)
The Chi-Square Formula (Same for Both Tests!)
χ² = Σ (O − E)² / E
O = Observed frequency (actual count from data)
E = Expected frequency (what we expect under H₀)
For Test of Independence: E = (Row Total × Column Total) / Grand Total
df for Goodness of Fit: k − 1 (k = number of categories)
df for Independence: (rows − 1) × (columns − 1)
EXAMPLE 8 (Easy) — Chi-Square Goodness of Fit
A die is rolled 60 times. Each face should appear 10 times (expected). Observed: 1→8, 2→12, 3→9,
4→11, 5→7, 6→13. Test if the die is fair at α=0.05.
Face O E (O−E) (O−E)² (O−E)²/E
1 8 10 −2 4 0.40
2 12 10 2 4 0.40
3 9 10 −1 1 0.10
4 11 10 1 1 0.10
5 7 10 −3 9 0.90
6 13 10 3 9 0.90
Total 60 60 — — χ²=2.80
Step 1 H₀: Die is fair (each face equally likely). H₁: Die is not fair.
Step 2 df = k−1 = 6−1 = 5. χ²_critical(5, α=0.05) = 11.07
Step 3 χ² = 2.80 < 11.07 → Fail to Reject H₀
Answer: Not enough evidence that the die is unfair. The results are consistent with a fair die.
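The table in Example 8 is a row-by-row application of Σ(O − E)²/E, which can be sketched in a few lines. The function name chi_square_gof is an assumption made for illustration; the critical value 11.07 is the table value from Step 2.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12, 9, 11, 7, 13]
# A fair die spreads the 60 rolls evenly over the 6 faces.
expected = [sum(observed) / len(observed)] * len(observed)
chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > 11.07}")
```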
EXAMPLE 9 (Hard) — Chi-Square Test of Independence
A survey asks 200 people about their gender (Male/Female) and preference for Tea/Coffee. Is there a
relationship between gender and drink preference? (α=0.05)
Observed Tea Coffee Row Total
Male 30 70 100
Female 50 50 100
Col Total 80 120 200
Step 1 H₀: Gender and drink preference are independent. H₁: They are related.
Step 2 Expected values E = (Row Total × Col Total) / Grand Total: E(M,Tea)=100×80/200=40.
E(M,Coffee)=100×120/200=60. E(F,Tea)=100×80/200=40.
E(F,Coffee)=100×120/200=60.
Step 3 χ² = (30−40)²/40 + (70−60)²/60 + (50−40)²/40 + (50−60)²/60 = 100/40 + 100/60 + 100/40 +
100/60 = 2.5 + 1.667 + 2.5 + 1.667 = 8.333
Step 4 df = (2−1)(2−1) = 1. χ²_critical(1, α=0.05) = 3.841
Answer: χ²=8.333 > 3.841 → Reject H₀. Gender and drink preference ARE related!
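For a contingency table, the expected counts and the χ² statistic can be built from the row and column totals, as in Steps 2 and 3 of Example 9. This is a sketch; the function name chi_square_independence is an assumption, and the table is given as a list of rows.

```python
def chi_square_independence(table):
    """Return expected counts, chi-square statistic, and df for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # E = (row total x column total) / grand total for every cell
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# Rows: Male, Female. Columns: Tea, Coffee.
expected, chi2, df = chi_square_independence([[30, 70], [50, 50]])
print(expected, f"chi-square = {chi2:.3f}", f"df = {df}")
```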
CHAPTER 6: EXAM KEYWORD → FORMULA CHEAT SHEET
This is the most important section for exams! Learn to recognize keywords in the question that tell you
which formula to use.
Exam Keyword / Hint | Use This Formula | Short Formula
"Estimate the mean", "95% confident", "construct an interval" | Confidence Interval for Mean | x̄ ± Z*(σ/√n) or x̄ ± t*(s/√n)
"Proportion", "percentage", "fraction of people who..." | CI for Proportion | p̂ ± Z*√(p̂(1−p̂)/n)
"Population SD is known" or "n ≥ 30" | Use Z-test / Z-interval | Z = (x̄−µ₀)/(σ/√n)
"Population SD unknown" or "n < 30" or "sample SD given" | Use t-test / t-interval | t = (x̄−µ₀)/(s/√n), df=n−1
"Test if mean equals / differs from a specific value" | One-Sample t or Z Test | H₀: µ = µ₀
"Compare two groups", "is there a difference between A and B" | Two-Sample t-Test (Independent) | t = (x̄₁−x̄₂)/√(s₁²/n₁+s₂²/n₂)
"Before and after", "same subjects", "paired", "matched pairs" | Paired t-Test | t = d̄/(s_d/√n)
"3 or more groups", "compare methods/treatments", "ANOVA" | One-Way ANOVA F-Test | F = MSB/MSW
"Follows a distribution?", "goodness of fit", "expected vs observed" | Chi-Square Goodness of Fit | χ² = Σ(O−E)²/E, df=k−1
"Relationship between two categories?", "independent?", "contingency table" | Chi-Square Independence Test | χ² = Σ(O−E)²/E, df=(r−1)(c−1)
"Right-tailed" or "greater than" in H₁ | Right-Tailed Test | Reject if test stat > +critical value
"Left-tailed" or "less than" in H₁ | Left-Tailed Test | Reject if test stat < −critical value
"Not equal" or "different from" in H₁ | Two-Tailed Test | Reject if |test stat| > critical value
ALL FORMULAS AT A GLANCE — Quick Reference
Formula Name    Formula                          When to Use
CI Mean (Z)     x̄ ± Z*(σ/√n)                     σ known
CI Mean (t)     x̄ ± t*(s/√n)                     σ unknown
CI Proportion   p̂ ± Z*√(p̂q̂/n), q̂ = 1 − p̂         Proportions
One-sample Z    Z=(x̄−µ₀)/(σ/√n)                  σ known, test mean
One-sample t    t=(x̄−µ₀)/(s/√n)                  σ unknown, test mean
Two-sample t    t=(x̄₁−x̄₂)/√(s₁²/n₁+s₂²/n₂)       Compare 2 groups
Paired t        t=d̄/(s_d/√n)                     Before/After
ANOVA F         F=MSB/MSW                        3+ groups
Chi-Square      χ²=Σ(O−E)²/E                     Categorical data
MEMORY TRICKS FOR BEGINNERS
■ Memory Tricks
Z vs t rule: "If they GIVE you σ (population SD) → use Z. If you only have s (sample SD) → use t."
CI vs HT: "CI = estimate a range. HT = test a claim about a number."
ANOVA trick: "3+ groups? Think F! F = Between ÷ Within."
Chi-Square trick: "Two categorical variables in a table? Always chi-square!"
Paired test trick: "Same people, two measurements? ALWAYS paired t-test!"
p-value rule: "p is low, H₀ must go! (If p < α, reject H₀)"