Stat Worksheet - 2
1. A random sample of 40 new laptops is taken, and their average battery life is measured
to be 8.5 hours.
Answer the following:
a) What is the population in this scenario?
b) What is the sample?
2. Determine whether to use the z-distribution, t-distribution, or neither, and explain why.
a) Sample size n = 60, population standard deviation (σ) is unknown, sample standard
deviation (s) = 15, population distribution is skewed.
b) Sample size n = 18, population standard deviation (σ) = 3, population is normally
distributed.
c) Sample size n = 25, population standard deviation (σ) is unknown, sample standard
deviation (s) = 8, population distribution is heavily skewed.
d) Sample size n = 22, population standard deviation (σ) is unknown, sample standard
deviation (s) = 6, population is approximately normally distributed.
Answer
a)
Normal (z) distribution. Reasoning: The sample size (n = 60) is greater than or
equal to 30 (n ≥ 30), so the sample standard deviation (s) can be used as an
estimate for the population standard deviation (σ), and the sampling distribution of
the mean can be approximated by the normal distribution, regardless of the
population distribution.
b)
Normal (z) distribution. Reasoning: Even though the sample size (n = 18) is less
than 30 (n < 30), the population standard deviation (σ) is known, and the population
is normally distributed.
c)
Neither. Reasoning: The sample size (n = 25) is less than 30 (n < 30), the
population standard deviation (σ) is unknown, AND the population distribution is
skewed. In this case, neither the normal distribution nor the t-distribution can be
reliably used.
d)
t-distribution. Reasoning: The sample size (n = 22) is less than 30 (n < 30), the
population standard deviation (σ) is unknown, and the population is approximately
normally distributed.
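The decision rules used in the four answers above can be sketched as a small function. This is only an illustration of the worksheet's rules of thumb (n ≥ 30 as the cutoff); the function name and its parameters are assumptions made for this example.

```python
def choose_distribution(n: int, sigma_known: bool, population_normal: bool) -> str:
    """Return 'z', 't', or 'neither' following the worksheet's rules of thumb."""
    # Large sample: the normal distribution is used, with s standing in for sigma if needed.
    if n >= 30:
        return "z"
    # Small sample from a non-normal population: neither distribution is reliable.
    if not population_normal:
        return "neither"
    # Small sample from a normal population: z if sigma is known, otherwise t.
    return "z" if sigma_known else "t"


print(choose_distribution(60, False, False))  # part a
print(choose_distribution(18, True, True))    # part b
print(choose_distribution(25, False, False))  # part c
print(choose_distribution(22, False, True))   # part d
```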
3. Find the critical value(s) for the following scenarios:
a) What is the z-score for a 95% confidence level?
b) What is the t-score for a 90% confidence level with n = 10?
4. A random sample of 50 commuters found their average daily travel time to work was 35
minutes, with a population standard deviation (σ) of 8 minutes. Construct a 90%
confidence interval for the true mean daily travel time of all commuters.
a) Calculate the margin of error (E)
b) Construct the confidence interval
c) Interpret the interval in context
Answer
a) Calculate the margin of error (E).
Given: n = 50, x̄ = 35, σ = 8, c = 90%.
For c = 90%, the critical z-score (z_c) is 1.645.
E = z_c ∗ (σ/√n)
E = 1.645 ∗ (8/√50) = 1.645 ∗ (8/7.071) ≈ 1.645 ∗ 1.1314 ≈ 1.861 minutes.
b) Construct the confidence interval.
Confidence Interval = x̄ ± E.
Left endpoint = x̄ − E = 35 − 1.861 = 33.139.
Right endpoint = x̄ + E = 35 + 1.861 = 36.861.
The 90% confidence interval is 33.139 < μ < 36.861.
c) Interpret the confidence interval in the context of the problem.
We are 90% confident that the true average daily travel time to work for all
commuters is between 33.14 and 36.86 minutes.
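The z-interval calculation in question 4 can be checked with a short script. This is a minimal sketch using only Python's standard library; the function name z_interval is an assumption for this example, and the critical value is computed from the normal distribution rather than read from a table (it comes out at about 1.645).

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known (z-based)."""
    # Two-sided critical value: (1 - confidence)/2 is left in each tail.
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_c * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


low, high, margin = z_interval(35, 8, 50, 0.90)
print(f"E = {margin:.3f}, interval = ({low:.3f}, {high:.3f})")
# E = 1.861, interval = (33.139, 36.861)
```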
5. A random sample of 15 energy drinks showed an average of 125 mg of caffeine with a
sample standard deviation (s) of 12 mg. Assume the caffeine content is normally
distributed. Construct a 95% confidence interval for the true mean caffeine content of all
such energy drinks.
a) Find the degrees of freedom and critical value (t_c)
b) Calculate the margin of error (E)
c) Construct the confidence interval
d) Interpret the interval in context
Answer
a) Calculate the degrees of freedom (d.f.) and find the critical value (t_c).
Given: n = 15, x̄ = 125, s = 12, c = 95%.
d.f. = n − 1 = 15 − 1 = 14.
To find t_c for a 95% confidence level and d.f. = 14, look in a t-distribution table
under the "Two tails, α = 0.05" column for the row corresponding to d.f. = 14.
t_c = 2.145.
b) Calculate the margin of error (E).
E = t_c ∗ (s/√n)
E = 2.145 ∗ (12/√15) = 2.145 ∗ (12/3.873) ≈ 2.145 ∗ 3.0984 ≈ 6.646 mg.
c) Construct the confidence interval.
Confidence Interval = x̄ ± E.
Left endpoint = x̄ − E = 125 − 6.646 = 118.354.
Right endpoint = x̄ + E = 125 + 6.646 = 131.646.
The 95% confidence interval is 118.354 < μ < 131.646.
d) Interpret the confidence interval in the context of the problem.
We are 95% confident that the true average caffeine content of all such
energy drinks is between 118.35 mg and 131.65 mg.
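The t-interval arithmetic in question 5 can be reproduced as follows. This sketch takes the critical value t_c = 2.145 from the table lookup described above instead of computing it, because Python's standard library has no t-distribution; the function name t_interval is an assumption for this example.

```python
from math import sqrt


def t_interval(sample_mean, s, n, t_c):
    """Confidence interval for a mean using a t critical value supplied by the caller."""
    margin = t_c * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# t_c = 2.145 is the table value for d.f. = 14 at 95% confidence (from the worksheet).
low, high, margin = t_interval(125, 12, 15, 2.145)
print(f"E = {margin:.3f} mg, interval = ({low:.3f}, {high:.3f})")
```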
6. A survey of 800 registered voters found that 440 plan to vote in the upcoming local
elections.
a) Find the point estimate (p̂) for the population proportion of registered voters who plan
to vote.
b) Verify that the sampling distribution can be approximated by the normal distribution.
c) Construct a 99% confidence interval for the population proportion of registered
voters who plan to vote.
d) Interpret the confidence interval in the context of the problem.
Answer
a.
Given: n = 800, x = 440.
p̂ = x/n.
p̂ = 440/800 = 0.55.
b.
Conditions: np̂ ≥ 5 and nq̂ ≥ 5.
p̂ = 0.55
q̂ = 1 − p̂ = 1 − 0.55 = 0.45.
np̂ = 800 ∗ 0.55 = 440.
nq̂ = 800 ∗ 0.45 = 360.
Since both 440 and 360 are greater than or equal to 5, the sampling distribution can
be approximated by the normal distribution.
c.
For c = 99%, the critical z-score (z_c) is 2.575.
E = z_c ∗ √(p̂q̂/n).
E = 2.575 ∗ √((0.55 ∗ 0.45)/800) = 2.575 ∗ √(0.2475/800) =
2.575 ∗ √(0.000309375) ≈ 2.575 ∗ 0.01759 ≈ 0.0453.
Confidence Interval = p̂ ± E.
Left endpoint = 0.55 − 0.0453 = 0.5047.
Right endpoint = 0.55 + 0.0453 = 0.5953.
The 99% confidence interval is 0.5047 < p < 0.5953.
d.
Based on this survey, we are 99% confident that between 50.47% and 59.53% of all
registered voters plan to vote in the upcoming local elections.
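The proportion interval in question 6, including the np̂ ≥ 5 and nq̂ ≥ 5 check, can be sketched as below. The function name is an assumption for this example, and the critical value is computed from the normal distribution (about 2.576) rather than taken as the table value 2.575, so the last digits may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Confidence interval for a population proportion (normal approximation)."""
    failures = n - successes
    # n*p_hat equals the success count and n*q_hat equals the failure count.
    if successes < 5 or failures < 5:
        raise ValueError("normal approximation conditions are not met")
    p_hat = successes / n
    q_hat = 1 - p_hat
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_c * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin, margin


low, high, margin = proportion_interval(440, 800, 0.99)
print(f"E = {margin:.4f}, interval = ({low:.4f}, {high:.4f})")
```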
7. A researcher wants to estimate the mean height of a certain population with a 99%
confidence level and a margin of error of no more than 1.5 cm. From a preliminary study,
the population standard deviation (σ) is estimated to be 8 cm. What is the minimum
sample size (n) required?
Answer
Given: c = 99%, E = 1.5, σ = 8.
For c = 99%, z_c = 2.575.
n = (z_c ∗ σ/E)².
n = (2.575 ∗ 8/1.5)² = (20.6/1.5)² ≈ (13.733)² ≈ 188.60.
Always round up to the next whole number. Minimum sample size required is
189.
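The sample-size formula for a mean, n = (z_c ∗ σ/E)², can be checked with a few lines of Python. This is a sketch under the same inputs as question 7; the function name is an assumption for this example, and the critical value is computed rather than read from a table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Minimum n so that the z-based margin of error is at most `margin`."""
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z_c * sigma / margin) ** 2)


print(sample_size_for_mean(sigma=8, margin=1.5, confidence=0.99))  # 189
```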
8. A marketing firm wants to estimate the proportion of consumers who prefer a new
product, with 95% confidence and a margin of error of no more than 3%. A preliminary
study suggests that about 60% of consumers prefer the product. What is the minimum
sample size (n) needed?
Answer
Given: c = 95%, E = 0.03, p̂ = 0.60.
For c = 95%, z_c = 1.96.
q̂ = 1 − p̂ = 1 − 0.60 = 0.40.
n = (z_c² ∗ p̂q̂)/E².
n = (1.96² ∗ 0.60 ∗ 0.40)/0.03² = (3.8416 ∗ 0.24)/0.0009 = 0.921984/0.0009 ≈
1024.43.
Always round up. Minimum sample size needed is 1025.
9. If no preliminary estimate for the proportion (p̂) is available in question (8), what minimum
sample size (n) would be required?
Answer
Given: c = 95%, E = 0.03. No preliminary p̂.
If no preliminary estimate for p̂ is available, use p̂ = 0.5 and q̂ = 0.5.
For c = 95%, z_c = 1.96.
n = (z_c² ∗ p̂q̂)/E².
n = (1.96² ∗ 0.5 ∗ 0.5)/0.03² = (3.8416 ∗ 0.25)/0.0009
= 0.9604/0.0009 ≈ 1067.11.
Always round up. Minimum sample size needed is 1068.
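Questions 8 and 9 use the same formula, n = (z_c² ∗ p̂q̂)/E², and differ only in the value of p̂. A minimal sketch follows; the function name and the default of 0.5 when no estimate is given are assumptions chosen to mirror the rule stated in question 9.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p_hat=0.5):
    """Minimum n for a proportion; p_hat defaults to 0.5 when no estimate exists."""
    z_c = NormalDist().inv_cdf((1 + confidence) / 2)
    q_hat = 1 - p_hat
    return ceil(z_c ** 2 * p_hat * q_hat / margin ** 2)


print(sample_size_for_proportion(0.03, 0.95, p_hat=0.60))  # question 8: 1025
print(sample_size_for_proportion(0.03, 0.95))              # question 9: 1068
```

Using p̂ = 0.5 maximizes p̂q̂, which is why the sample size without a preliminary estimate is the larger of the two.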
10. For each claim, state:
Null hypothesis (H₀)
Alternative hypothesis (Hₐ)
Type of test (left-tailed, right-tailed, two-tailed)
a) A company claims that the average lifespan of their new LED light bulbs is more than
20,000 hours.
b) A nutritionist believes that the average daily calorie intake for teenagers is at most
2200 calories.
c) A university states that the average age of its undergraduate students is 21 years old.
d) A politician claims that less than 35% of the population supports a new policy.
Answer
a.
H₀: μ ≤ 20,000.
Hₐ: μ > 20,000 (Claim) - uses '>', so goes in Hₐ
Type of test: Right-tailed test (because Hₐ contains '>').
b.
H₀: μ ≤ 2200 (Claim)
Hₐ: μ > 2200
Type of test: Right-tailed test (because Hₐ contains '>').
c.
H₀: μ = 21 (Claim)
Hₐ: μ ≠ 21
Type of test: Two-tailed test (because Hₐ contains '≠').
d.
H₀: p ≥ 0.35
Hₐ: p < 0.35 (Claim) - uses '<', so goes in Hₐ
Type of test: Left-tailed test (because Hₐ contains '<').
Note
H₀ = what we assume (believe) is currently true; it contains a statement of equality (≤, =, ≥).
Hₐ = the complement of H₀, which the sample must provide evidence for; it contains a strict inequality (<, ≠, >).
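The rule in the note can be written as a small lookup: the claim goes in H₀ if its operator includes equality, otherwise it goes in Hₐ, and the operator in Hₐ decides the tail. The sketch below uses plain-ASCII operator spellings ("<=", "!=", and so on) and a function name that are assumptions made for this example.

```python
# Complement of each comparison operator (ASCII spellings chosen for this sketch).
COMPLEMENT = {"<": ">=", "<=": ">", "=": "!=", "!=": "=", ">": "<=", ">=": "<"}
# The operator in the alternative hypothesis determines the type of test.
TAIL = {"<": "left-tailed", ">": "right-tailed", "!=": "two-tailed"}


def hypotheses(claim_op):
    """Return (H0 operator, Ha operator, test type) for the operator used in a claim."""
    if claim_op not in COMPLEMENT:
        raise ValueError(f"unknown operator: {claim_op}")
    if claim_op in ("<=", "=", ">="):
        # The claim contains equality, so the claim itself is H0.
        h0_op, ha_op = claim_op, COMPLEMENT[claim_op]
    else:
        # The claim is a strict inequality, so the claim is Ha.
        h0_op, ha_op = COMPLEMENT[claim_op], claim_op
    return h0_op, ha_op, TAIL[ha_op]


print(hypotheses(">"))   # part a: "more than 20,000"
print(hypotheses("<="))  # part b: "at most 2200"
print(hypotheses("="))   # part c: "is 21"
print(hypotheses("<"))   # part d: "less than 35%"
```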
11. A new drug is being tested, and the manufacturer claims that the drug cures a specific
disease in 75% of patients.
a) State H₀ and Hₐ
b) Describe a Type I error in context
c) Describe a Type II error in context
Answer
a.
H₀: p = 0.75 (Claim).
Hₐ: p ≠ 0.75.
b.
A Type I error occurs if the null hypothesis is rejected when it is true. In this context, it
means concluding that the drug does NOT cure 75% of patients (i.e., p ≠ 0.75) when
the drug actually DOES cure 75% of patients (i.e., p = 0.75).
c.
A Type II error occurs if the null hypothesis is not rejected when it is false. In this
context, it means concluding that the drug DOES cure 75% of patients (i.e., p = 0.75)
when the drug actually DOES NOT cure 75% of patients (i.e., p ≠ 0.75).
12. A school principal claims that the average score on a standardized test for students in her
school is 80. A random sample of 35 students from the school is taken, and their average
score is found to be 77 with a sample standard deviation (s) of 10. Test the principal's
claim at a level of significance (α) of 0.05.
Claim: Average test score is 80. Sample of 35 students: mean = 77, s = 10, α = 0.05.
a) State H₀ and Hₐ
b) Use z-test or t-test? Justify.
c) Calculate the test statistic
d) Determine the critical value(s) and sketch rejection region(s)
e) Make a decision
f) Interpret in context
Answer
a)
H₀: μ = 80 (Claim).
Hₐ: μ ≠ 80.
b)
Use a z-test.
Reasoning: The sample size (n = 35) is greater than or equal to 30 (n ≥ 30). In this
case, the sample standard deviation (s = 10) can be substituted for the population
standard deviation (σ), and the normal distribution can be used.
c)
Test statistic: z = (x̄ − μ)/(s/√n).
z = (77 − 80)/(10/√35)
= −3/(10/5.916) = −3/1.690 ≈ -1.775.
d)
This is a two-tailed test with α = 0.05.
The critical values (z₀) for a 5% significance level in a two-tailed test are ±1.96.
The rejection regions are where z < -1.96 or z > 1.96.
e)
The calculated test statistic is z = -1.775.
Since -1.775 is not in the rejection region (it is between -1.96 and 1.96), we fail to
reject H₀.
f)
"At the 5% level of significance, there is not enough evidence to reject the claim
that the average score on a standardized test for students in her school is 80".
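The two-tailed z-test in question 12 can be reproduced with the standard library. This is a sketch that follows the worksheet in substituting s for σ because n ≥ 30; the function name is an assumption for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, s, n, alpha):
    """Return (test statistic, positive critical value, reject H0?)."""
    z = (sample_mean - mu0) / (s / sqrt(n))
    # Two-tailed: alpha is split evenly between the two rejection regions.
    z0 = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z0, abs(z) > z0


z, z0, reject = two_tailed_z_test(77, 80, 10, 35, 0.05)
print(f"z = {z:.3f}, critical values = ±{z0:.2f}, reject H0: {reject}")
# z = -1.775, critical values = ±1.96, reject H0: False
```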
13. A soft drink company claims that its bottles contain an average of 500 ml of liquid. A
quality control manager randomly samples 12 bottles and finds the mean volume to be
497 ml with a sample standard deviation (s) of 5 ml. Assume the volume of liquid in
bottles is normally distributed. Test the company's claim at a level of significance (α) of
0.01.
Claim: Soft drink bottles contain 500 ml. Sample of 12 bottles: mean = 497 ml, s = 5 ml.
Normal distribution assumed, α = 0.01.
a) State H₀ and Hₐ
b) Calculate degrees of freedom (d.f.) and critical value (t₀)
c) Calculate test statistic (t)
d) Make a decision
e) Interpret in context
Answer
a)
H₀: μ = 500 (Claim).
Hₐ: μ ≠ 500.
b)
d.f. = n - 1 = 12 - 1 = 11.
This is a two-tailed test with α = 0.01. To find the critical t-value(s) for d.f. = 11 and α
= 0.01 (two-tailed), you would use a t-distribution table.
t₀ = ±3.106
c)
Test statistic: t = (x̄ − μ)/(s/√n).
t = (497 − 500)/(5/√12) = −3/(5/3.464)
= −3/1.4434 ≈ -2.078.
d)
Comparing the calculated test statistic t = -2.078 to the critical values:
Since −3.106 < −2.078 < 3.106, the test statistic is not in the rejection region.
Therefore, we fail to reject H₀.
e)
"At the 1% level of significance, there is not enough evidence to reject the claim
that the average volume of liquid in the bottles is 500 ml".
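The t-test in question 13 can be checked in the same way. Python's standard library has no t-distribution, so this sketch takes the critical value 3.106 (d.f. = 11, two-tailed, α = 0.01) from the table as the worksheet does; the function name is an assumption for this example.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Standardized test statistic for a mean when sigma is unknown."""
    return (sample_mean - mu0) / (s / sqrt(n))


t = t_statistic(497, 500, 5, 12)
t0 = 3.106  # table value taken from the worksheet, not computed here
reject = abs(t) > t0
print(f"t = {t:.3f}, critical values = ±{t0}, reject H0: {reject}")
```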
14. A government official claims that the proportion of citizens who own a smartphone is
more than 85%. In a random sample of 250 citizens, 220 reported owning a smartphone.
Test this claim at a level of significance (α) of 0.05.
a) State H₀ and Hₐ
b) Verify z-test conditions
c) Calculate test statistic (z)
d) Find critical value(s) and sketch rejection region(s)
e) Make a decision
f) Interpret in context
Answer
a)
H₀: p ≤ 0.85.
Hₐ: p > 0.85 (Claim).
b)
Conditions: np ≥ 5 and nq ≥ 5, using the claimed proportion from H₀.
n = 250, p = 0.85 (from H₀). So,
q = 1 − p = 1 − 0.85 = 0.15.
np = 250 ∗ 0.85 = 212.5.
nq = 250 ∗ 0.15 = 37.5.
Since both 212.5 and 37.5 are greater than or equal to 5, the conditions are met,
and the normal approximation for the binomial distribution can be used.
c)
First, calculate the sample proportion: p̂ = x/n = 220/250 = 0.88.
Test statistic: z = (p̂ − p)/√(pq/n).
z = (0.88 − 0.85)/√((0.85 ∗ 0.15)/250)
= 0.03/√(0.1275/250) = 0.03/√(0.00051)
= 0.03/0.02258 ≈ 1.328.
d)
This is a right-tailed test with α = 0.05.
The critical value (z₀) for a 5% significance level in a right-tailed test is 1.645.
The rejection region is where z > 1.645.
e) Make a decision to reject or fail to reject H₀.
The calculated test statistic is z = 1.328.
Since 1.328 is not in the rejection region (1.328 < 1.645), we fail to reject H₀.
f)
"At the 5% level of significance, there is not enough evidence to support the
claim that the proportion of citizens who own a smartphone is more than 85%".
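The right-tailed proportion test in question 14 can be sketched as follows. The function name is an assumption for this example; the condition check uses the proportion from H₀, as in part b.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_proportion_test(successes, n, p0, alpha):
    """Return (test statistic, critical value, reject H0?) for Ha: p > p0."""
    q0 = 1 - p0
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("normal approximation conditions are not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * q0 / n)
    # Right-tailed: all of alpha lies in the upper tail.
    z0 = NormalDist().inv_cdf(1 - alpha)
    return z, z0, z > z0


z, z0, reject = right_tailed_proportion_test(220, 250, 0.85, 0.05)
print(f"z = {z:.3f}, critical value = {z0:.3f}, reject H0: {reject}")
# z = 1.328, critical value = 1.645, reject H0: False
```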
15. A survey investigates pet preference (Dog, Cat, Other) vs. living situation (Apartment,
House).
            Dog   Cat   Other   Total
Apartment    40    30      10      80
House        70    50      20     140
Total       110    80      30     220
a) State the H₀ and Hₐ (independence test)
b) Calculate expected frequencies (E) for each cell
c) List conditions for chi-square test
d) Calculate degrees of freedom (d.f.)
e) (Advanced) Calculate the chi-square test statistic (χ²)
f) Find the critical value
g) Make a decision and interpret
Answer
a)
H₀: Pet preference is independent of living situation.
Hₐ: Pet preference is not independent of living situation
(i.e., there is an association between pet preference and living situation).
b)
Formula: E = (Row total) × (Column total) / (Grand total)
Calculate E for each cell:
Apartment, Dog:   (80 ∗ 110)/220 = 40
Apartment, Cat:   (80 ∗ 80)/220 = 29.09
Apartment, Other: (80 ∗ 30)/220 = 10.91
House, Dog:       (140 ∗ 110)/220 = 70
House, Cat:       (140 ∗ 80)/220 = 50.91
House, Other:     (140 ∗ 30)/220 = 19.09
Expected frequencies:
            Dog   Cat     Other
Apartment    40   29.09   10.91
House        70   50.91   19.09
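The expected-frequency formula can be applied to the whole table at once. This is a minimal sketch; the variable and function names are assumptions for this example, and the observed counts are the ones given in the question.

```python
# Observed counts: rows are Apartment, House; columns are Dog, Cat, Other.
observed = [[40, 30, 10], [70, 50, 20]]


def expected_frequencies(table):
    """E = (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
# [40.0, 29.09, 10.91]
# [70.0, 50.91, 19.09]
```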
c)
Valid if all these are true:
1. The observed frequencies must be obtained by using a random sample.
2. Each expected frequency must be greater than or equal to 5. (All
calculated expected frequencies are >= 5, so this condition is met).
d)
df = (r − 1)(c − 1) = (2 − 1)(3 − 1) = 1 × 2 = 2
Degrees of freedom: 2
e)
Formula: χ² = ∑ (O − E)²/E
Living Situation   Pet     O (Observed)   E (Expected)   O − E   (O − E)²   (O − E)²/E
Apartment          Dog     40             40.00           0.00   0.0000     0.0000
Apartment          Cat     30             29.09           0.91   0.8281     0.0285
Apartment          Other   10             10.91          -0.91   0.8281     0.0759
House              Dog     70             70.00           0.00   0.0000     0.0000
House              Cat     50             50.91          -0.91   0.8281     0.0163
House              Other   20             19.09           0.91   0.8281     0.0434
χ² = ∑ (O − E)²/E = 0 + 0.0285 + 0.0759 + 0 + 0.0163 + 0.0434 = 0.1641
f)
α = 0.05
df = 2
From chi-square table: critical χ² = 5.991
g)
Calculated χ² = 0.1641
Critical value: critical χ² = 5.991
Since χ² < critical χ², we fail to reject H₀.
Interpretation: At the 0.05 level of significance, there is not enough evidence to
suggest that pet preference is related to living situation.
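The chi-square statistic and the decision can be checked with the sketch below. Two points are assumptions of this example rather than part of the worksheet: the function names, and the closed-form critical value −2·ln(α), which is valid only for d.f. = 2 (for other degrees of freedom a table or a statistics library is needed). Because the code keeps unrounded expected frequencies, it gives χ² ≈ 0.1637, slightly below the hand-calculated 0.1641 that uses values rounded to two decimals; the decision is the same.

```python
from math import log

# Observed counts: rows are Apartment, House; columns are Dog, Cat, Other.
observed = [[40, 30, 10], [70, 50, 20]]


def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with E from row and column totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    return chi2


def chi_square_critical_df2(alpha):
    """Critical value for d.f. = 2 only, where the right-tail area is exp(-x/2)."""
    return -2 * log(alpha)


chi2 = chi_square_statistic(observed)
critical = chi_square_critical_df2(0.05)
reject = chi2 > critical
print(f"chi-square = {chi2:.4f}, critical = {critical:.3f}, reject H0: {reject}")
```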