Hypothesis Testing
Learning outcomes
Students should be able to:
• Formulate and test hypotheses.
• Distinguish between simple and composite hypotheses.
• Conduct tests on means and variances.
• Interpret one- and two-tailed tests.
• Construct and interpret confidence intervals.
Concept of Hypothesis Testing
• Hypothesis: An assumption about a population
parameter.
• Null hypothesis (H₀): The default claim (e.g., μ = μ₀).
• Alternative hypothesis (Hₐ): Competing claim (e.g., μ
≠ μ₀).
• Goal: Use sample data to decide whether to reject H₀.
Hypothesis Testing
• Hypothesis testing allows an experimenter to assess
the plausibility or credibility of a specific statement or
hypothesis.
• An experimenter may be interested in the plausibility
that the population mean is equal to a specific fixed
value, say μ = 20.
• If this fixed value is denoted by μ₀, then the
experimenter’s statement may formally be described by
a null hypothesis
H₀: μ = μ₀.
• The word hypothesis indicates that this statement will
be tested with an appropriate data set.
• It is useful to associate a null hypothesis with an
alternative hypothesis, which is defined to be the
“opposite” of the null hypothesis.
• The null hypothesis above has the alternative hypothesis
Hₐ: μ ≠ μ₀.
Hypothesis Testing - steps
• State H₀ and Hₐ.
• Choose significance level (α).
• Compute test statistic.
• Determine p-value or critical region.
• Make decision (reject / fail to reject H₀).
• State conclusion in context.
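The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming a two-sided z-test with known σ; the function name `z_test_two_sided` and all numbers are hypothetical and chosen only to show the flow from test statistic to decision.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(xbar, mu0, sigma, n, alpha=0.05):
    """Steps for H0: mu = mu0 versus Ha: mu != mu0, assuming sigma is known."""
    # Compute the test statistic
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Determine the two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Make the decision at significance level alpha
    reject = p_value < alpha
    return z, p_value, reject


# Hypothetical inputs: sample mean 20.6 from n = 36 observations, sigma = 1.5
z, p_value, reject = z_test_two_sided(xbar=20.6, mu0=20.0, sigma=1.5, n=36)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The final step, stating the conclusion in context, remains a written interpretation and is not something the code can supply.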
Errors in Hypothesis Testing
• Type I error: Rejecting H₀ when it is true (probability α, the significance level).
• Type II error: Failing to reject H₀ when it is false (probability β).
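The two error types can be estimated by simulation. The sketch below is an illustration only: it assumes normally distributed data, a two-sided z-test with known σ, and hypothetical parameter values; the function name `rejection_rate` is made up for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejection_rate(true_mu, mu0=50.0, sigma=4.0, n=25, alpha=0.05,
                   trials=4000, seed=1):
    """Fraction of simulated samples for which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true (true mean equals mu0): every rejection is a Type I error
type_i_rate = rejection_rate(true_mu=50.0)
# H0 is false (true mean is 52): every non-rejection is a Type II error
type_ii_rate = 1 - rejection_rate(true_mu=52.0)
print(f"Type I rate ~ {type_i_rate:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

With H₀ true, the estimated Type I rate should land close to the chosen α; the Type II rate depends on how far the true mean is from μ₀ and on the sample size.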
Composite vs Simple Hypotheses
• Simple hypothesis: Specifies exact value of a
parameter (e.g., H₀: μ = 50).
• Composite hypothesis: Specifies a range or inequality
(e.g., Hₐ: μ > 50).
Tests on the Mean (t-test)
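• For small samples (n < 30): t = (x̄ − μ₀) / (s / √n), with n − 1 degrees of freedom.
• Compare |t| with tₐ/₂,n−1. If |t| > tₐ/₂,n−1, reject H₀.
• For large samples (n ≥ 30): Use z-test with σ known: z = (x̄ − μ₀) / (σ / √n).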
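As a small sketch of the two statistics for a test on the mean, t = (x̄ − μ₀) / (s / √n) and z = (x̄ − μ₀) / (σ / √n), the code below computes both. The sample values and σ are hypothetical, and the critical value 2.365 is the standard t-table entry for a two-sided test at α = 0.05 with 7 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), using the sample standard deviation s."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def z_statistic(sample, mu0, sigma):
    """z = (xbar - mu0) / (sigma / sqrt(n)), for a known population sigma."""
    n = len(sample)
    return (mean(sample) - mu0) / (sigma / sqrt(n))


# Hypothetical sample of n = 8 observations, testing H0: mu = 50
sample = [51.2, 49.8, 50.5, 52.1, 50.9, 51.4, 49.6, 51.0]
t = t_statistic(sample, mu0=50.0)
z = z_statistic(sample, mu0=50.0, sigma=1.0)  # sigma = 1.0 is an assumed value

T_CRIT = 2.365  # t table: alpha/2 = 0.025, df = n - 1 = 7
print(f"t = {t:.3f}, reject H0: {abs(t) > T_CRIT}")
print(f"z = {z:.3f}")
```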
Tests on the Variance - χ² Test
• Use Chi-square test for a population variance: χ² = (n−1)s² / σ₀²
• Degrees of freedom = n − 1.
• Compare χ² with critical values from the χ² distribution.
Hypothesis Testing - p values
• The plausibility of a null hypothesis is
measured with a p-value, which is a
probability that takes a value between 0
and 1.
• A p-value is constructed from a data set.
• A useful way of interpreting a p-value is
to consider it as the plausibility or
credibility of the null hypothesis.
• The p-value moves with the plausibility of the null
hypothesis: the smaller the p-value, the less plausible
the null hypothesis.
Rejection of the Null Hypothesis
• If the p-value is very small, less than 1% say, then an
experimenter can conclude that the null hypothesis is not
a plausible statement.
• A p-value less than 0.01 indicates to the experimenter that
the null hypothesis is not a credible statement.
• The experimenter can then consider the alternative
hypothesis to be true. The null hypothesis is rejected in
favor of the alternative hypothesis.
Acceptance of the Null
Hypothesis
• A p-value larger than 0.10 is generally taken to indicate
that the null hypothesis is a plausible statement, and it is
therefore accepted.
• However, this does not mean that the null hypothesis has
been proven to be true. It indicates only that the data set
does not provide enough evidence to reject the null
hypothesis.
• A p-value in the range 1%–10% is in an intermediate area:
there is some evidence that the null hypothesis is not
plausible, but the evidence is not overwhelming.
• The experiment is inconclusive but suggests that perhaps
a further look at the problem is warranted.
• If possible, the experimenter may wish to collect more
information, that is, a larger data set, to help clarify the
matter.
• Sometimes a cutoff value of 0.05 is used and the null
hypothesis is accepted if the p-value is larger than 0.05
and is rejected if the p-value is smaller than 0.05.
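The rule-of-thumb bands described above (below 0.01, between 0.01 and 0.10, above 0.10) can be written as a small helper. This is only a sketch of those conventions; the function name `interpret_p_value` and the returned labels are hypothetical.

```python
def interpret_p_value(p):
    """Map a p-value to the three rule-of-thumb bands (0.01 and 0.10 cutoffs)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "reject H0"          # H0 is not a plausible statement
    if p > 0.10:
        return "H0 plausible"       # not enough evidence to reject H0
    return "inconclusive"           # intermediate area: consider more data


for p in (0.004, 0.03, 0.25):
    print(p, "->", interpret_p_value(p))
```

A single cutoff such as 0.05 would replace the two thresholds with one comparison.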
Calculation of p-Values
• The p-value (based on an observed data set) is the probability
of obtaining this data set or worse when the null hypothesis
is true. A “worse” data set is one that has less affinity with the
null hypothesis.
• Suppose a data set of n observations is obtained, with observed
sample mean x̄ and sample standard deviation s.
The “discrepancy” between the data set and the null
hypothesis H₀: μ = μ₀ is measured through the t-statistic
t = (x̄ − μ₀) / (s / √n).
• The discrepancy is smallest when x̄ = μ₀, which gives t = 0: the
sample mean coincides exactly with the hypothesized value
of the population mean.
• The discrepancy between the data set and the null
hypothesis increases as the absolute value |t| of the
t-statistic increases.
• A data set is considered to be “worse” (to have less affinity
with the null hypothesis) than the observed data set if it
has a t-statistic with an absolute value larger than |t|, so the
p-value is P(|X| ≥ |t|), where X follows a t-distribution with
n − 1 degrees of freedom.
Calculation of p-Values - Two-
Sided t-Test
• A Two-Sided t-Test is used to test if a sample mean
significantly differs from a hypothesized population mean.
• Null Hypothesis (H₀): μ = μ₀.
• The population mean (μ) is equal to the hypothesized value
(μ₀).
• Alternative Hypothesis (Hₐ): μ ≠ μ₀.
• The population mean (μ) is different from the hypothesized
value (μ₀).
• Calculation of the p-value:
• Formula for the p-value: p-value = 2 × P(X ≥ |t|), where X
follows a t-distribution with n − 1 degrees of freedom.
• The test statistic (t) is calculated as: t = (x̄ − μ₀) / (s / √n).
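The two-sided p-value, 2 × P(X ≥ |t|) for X following a t-distribution with n − 1 degrees of freedom, is normally read from a table or statistical software. As a sketch using only the Python standard library, the code below integrates the t density numerically with Simpson's rule; the function names `t_pdf` and `two_sided_p_value` are hypothetical, and the numerical integration is a stand-in for a library routine.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """2 * P(X >= |t|), via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 <= X <= |t|)
    return 1 - 2 * central           # area in both tails beyond |t|


# Hypothetical t-statistic from a sample of n = 20 (df = 19)
p = two_sided_p_value(2.5, df=19)
print(f"two-sided p-value = {p:.4f}")
```

A quick sanity check is that the tabulated critical value 2.093 (α = 0.05, df = 19) should give a p-value very close to 0.05.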
• Decision Rule:
• If the p-value is less than the significance level (α), reject
the null hypothesis (H₀). Otherwise, fail to reject H₀.
• Example
• A survey team has measured the vertical deviation (in
millimeters) from the design reference plane of an interior
wall using a terrestrial laser scanner. The goal is to assess
whether the mean deviation is statistically different from zero
(a mean of zero corresponds to a perfectly vertical wall).
• Sample Data: The 20 sampled deviation measurements
(in mm) are:
• Null Hypothesis (H₀): μ = 0. The mean deviation is zero (i.e.,
the wall is perfectly vertical).
• Alternative Hypothesis (Hₐ): μ ≠ 0. The mean deviation is
not zero (i.e., the wall is not perfectly vertical).
• Calculate the Sample Mean (x̄) & Sample Standard Deviation
(s)
• Calculate the t-statistic
• Determine the Degrees of Freedom (df)
• Find the p-value
• Make a Decision
• At a significance level of α, we compare the p-value to α:
• If p-value < α, reject the null hypothesis.
• If p-value ≥ α, fail to reject the null hypothesis.
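The steps of the wall-deviation example can be traced in code. The 20 deviations below are invented for illustration and are not the survey's measurements; α = 0.05 is also an assumption. Instead of a p-value, the sketch uses the equivalent critical-value comparison, with 2.093 taken from a standard t table (two-sided, α = 0.05, df = 19).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical deviations in mm (NOT the measurements from the example)
deviations = [1.2, -0.8, 0.5, 2.1, -1.4, 0.9, 1.7, -0.3, 0.6, 1.1,
              -0.9, 2.4, 0.2, 1.5, -1.1, 0.8, 1.9, -0.5, 1.0, 0.7]
MU0 = 0.0       # H0: mean deviation is zero
T_CRIT = 2.093  # t table: two-sided alpha = 0.05 (assumed), df = 19

n = len(deviations)
xbar = mean(deviations)            # sample mean
s = stdev(deviations)              # sample standard deviation
t = (xbar - MU0) / (s / sqrt(n))   # t-statistic
df = n - 1                         # degrees of freedom
reject = abs(t) > T_CRIT           # same decision as p-value < alpha

print(f"n = {n}, xbar = {xbar:.3f} mm, s = {s:.3f} mm")
print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```

For these made-up values the test rejects H₀; with the real measurements the outcome could of course differ.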
One-Sided t-Test
• A one-sided t-test is used to test if the population
mean is either greater than or less than a
hypothesized value μ₀, but not both.
• Hypothesis Setup
• There are two possible forms of a one-sided test:
• Case 1: Testing if the mean is greater than a
threshold
• Hₐ: μ > μ₀. You're checking if the mean is significantly
greater than μ₀.
• p-value = P(X ≥ t), where X follows a t-distribution with
n − 1 degrees of freedom.
• Case 2: Testing if the mean is less than a threshold
• Hₐ: μ < μ₀. You're checking if the mean is significantly less
than μ₀.
• p-value = P(X ≤ t).
• Test Statistic
• Compute the t-statistic: t = (x̄ − μ₀) / (s / √n).
• In the t-table, go to the row for df = n − 1.
• Right-tailed test (Hₐ: μ > μ₀): Find P(X ≥ t).
Left-tailed test (Hₐ: μ < μ₀): Find P(X ≤ t).
• The result is your one-tailed p-value (no need to
double it, unlike the two-sided test).
• Decision Rule
• If p-value < α, reject H₀.
• If p-value ≥ α, fail to reject H₀.
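For a one-sided test the p-value is a single tail area: P(X ≥ t) for a right-tailed test and P(X ≤ t) for a left-tailed test. The sketch below computes both with a Simpson's-rule integration of the t density, using only the standard library; the function names and the `tail` argument are hypothetical, and in practice a t table or statistical software would supply these areas.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(X >= t), from a Simpson's-rule integral of the density over [0, |t|]."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    return 0.5 - central if t >= 0 else 0.5 + central


def one_sided_p_value(t, df, tail):
    """tail='right' gives P(X >= t); tail='left' gives P(X <= t)."""
    if tail == "right":
        return upper_tail(t, df)
    if tail == "left":
        return upper_tail(-t, df)   # by symmetry, P(X <= t) = P(X >= -t)
    raise ValueError("tail must be 'right' or 'left'")


# Hypothetical t-statistic with df = 19
p_right = one_sided_p_value(2.1, 19, "right")
p_left = one_sided_p_value(2.1, 19, "left")
print(f"right-tailed p = {p_right:.4f}, left-tailed p = {p_left:.4f}")
```

The two tail areas for the same t always sum to 1, which is why the direction of the alternative hypothesis has to be fixed before looking at the data.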
Tests on Variance
• Used to test population variance (σ²).
• Test statistic: χ² = (n−1)s² / σ₀²
• Degrees of freedom = n − 1.
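• Decision (two-sided): Reject H₀ if χ² < χ²ₐ/₂ or χ² > χ²₁−ₐ/₂, where the
subscript is the lower-tail probability of the χ² distribution with
n − 1 degrees of freedom.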
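A short sketch of the variance test follows. The sample and the hypothesized variance σ₀² = 0.09 are hypothetical; the critical values 2.700 and 19.023 are the standard χ² table entries for 9 degrees of freedom that cut off 2.5% in each tail (α = 0.05).

```python
from statistics import variance


def chi_square_statistic(sample, sigma0_sq):
    """chi2 = (n - 1) * s^2 / sigma0^2, with s^2 the sample variance."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq


# Hypothetical sample of n = 10 observations, testing H0: sigma^2 = 0.09
sample = [10.2, 9.8, 10.5, 9.6, 10.1, 10.4, 9.9, 10.3, 9.7, 10.0]
chi2 = chi_square_statistic(sample, sigma0_sq=0.09)

LOWER, UPPER = 2.700, 19.023  # chi-square table, df = 9, 2.5% in each tail
reject = chi2 < LOWER or chi2 > UPPER
print(f"chi2 = {chi2:.3f}, df = {len(sample) - 1}, reject H0: {reject}")
```

Here the statistic falls between the two critical values, so this made-up sample gives no reason to reject the hypothesized variance.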
Confidence Intervals
• A confidence interval provides a range of plausible
parameter values.
• 95% CI for mean (μ): x̄ ± tₐ/₂,n−1 × (s / √n), with α = 0.05.
• If μ₀ lies outside the CI → reject H₀ (two-sided test at level α).
• CI complements hypothesis testing results.
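The interval x̄ ± t × (s / √n) and its link to the two-sided test can be sketched as follows. The sample is hypothetical, the function name `t_confidence_interval` is made up for this example, and the critical value 2.262 is the standard t-table entry for a 95% interval with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) = xbar -/+ t_crit * s / sqrt(n)."""
    n = len(sample)
    xbar = mean(sample)
    half_width = t_crit * stdev(sample) / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical sample of n = 10 observations
sample = [10.2, 9.8, 10.5, 9.6, 10.1, 10.4, 9.9, 10.3, 9.7, 10.0]
T_CRIT = 2.262  # t table: alpha/2 = 0.025, df = 9

lower, upper = t_confidence_interval(sample, T_CRIT)
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")

# A hypothesized mean outside the interval is rejected at alpha = 0.05
for mu0 in (10.0, 10.5):
    print(f"mu0 = {mu0}: reject H0: {not (lower <= mu0 <= upper)}")
```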
Decision Regions: One- and Two-
Tailed Tests
• One-tailed (Right): Reject H₀ if test statistic > critical
value.
• One-tailed (Left): Reject H₀ if test statistic < critical
value.
• Two-tailed: Reject H₀ if |test statistic| > critical value.
• Rejection region defined by significance level (α).
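The three rejection regions can be expressed as one small function. This sketch assumes a z-test, so the critical values come from the standard normal distribution; for a t-test the same comparisons apply with critical values from the t-distribution. The function name `reject_h0` and the `tail` labels are hypothetical.

```python
from statistics import NormalDist


def reject_h0(z, alpha=0.05, tail="two"):
    """Check whether z falls in the rejection region of a z-test."""
    std_normal = NormalDist()
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)            # upper tail only
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)                # lower tail only
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)   # alpha/2 per tail
    raise ValueError("tail must be 'right', 'left', or 'two'")


# The same statistic can lead to different decisions depending on the tail
z = 1.8
print("right-tailed:", reject_h0(z, tail="right"))
print("two-tailed:  ", reject_h0(z, tail="two"))
```

Because a two-tailed test splits α across both tails, its critical value is larger than the one-tailed value at the same α.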
Summary of Hypothesis Testing
• 1. Formulate H₀ and Hₐ.
• 2. Choose α and determine the test statistic.
• 3. Compute p-value or compare with critical value.
• 4. Make statistical decision (reject/fail to reject H₀).
• 5. Interpret results in real-world context.
• 6. Use confidence intervals for additional insight.