Understanding Hypothesis Testing Basics

The document provides a comprehensive introduction to hypothesis testing, detailing the process of formulating null and alternative hypotheses, conducting statistical tests, and making conclusions based on sample data. It includes definitions, examples, and the significance of Type I errors, along with essential steps for hypothesis testing. The content is supported by references and aims to equip students with the necessary skills for statistical inference in various practical scenarios.

A Transcript of the Lectures in Hypothesis Testing

Lecture Objective:
The aim of this set of lectures is to formally introduce the concept of a statistical test of hypothesis.
Specifically, it aims to make students understand how to define the population under study, state the
particular hypotheses that will be investigated, give the significance level, perform the calculations
required for the statistical test, reach a conclusion, and talk about possible consequences of these
conclusions.

References Used:
Albert, J., Albacea, J., Ayaay, M., David, I., and de Mesa, I. (2016). Teaching Guide for Senior High School –
Statistics and Probability. Commission on Higher Education K to 12 Transition Program Management Unit.

Bluman, Allan G. (2014). Elementary Statistics: A Step by Step Approach 9th Edition. McGraw Hill Education.
New York NY 10121.

Illowsky, B. and Dean, S. (2018). Introduction to Statistics. OpenStax.

Melosantos, L., Antonio, J., Robles, S., Bruce, R., and Sacluti, J (2016). Math Connections in the Digital Age
Statistics and Probability. Quezon City: Sibs Publishing House, Inc., 2016

Mendenhall, W., Beaver, R., and Beaver, B. (2013). Introduction to Probability and Statistics.
Pacific Grove, Calif.: Brooks/Cole; Andover: Cengage Learning [distributor].
_________________________________________________________________________________________________________________________

Lecture 5.1
Hypothesis Testing

Introduction

In practical situations, statistical inference can involve either estimating a population parameter or making
decisions about the value of the parameter. For example, a scientist might want to know whether there is
evidence of global warming; or a physician might want to know whether a new medication will lower a
person’s blood pressure; or an educator might wish to see whether a new teaching technique is better than
a traditional one. These types of questions can be addressed through a statistical process called hypothesis
testing.

Lesson Proper

The reasoning used in a statistical test of hypothesis is similar to the process
in a court trial. In trying a person for theft, the court must decide between innocence and guilt. As the trial
begins, the accused person is assumed to be innocent. The prosecution collects and presents all available
evidence in an attempt to contradict the innocence hypothesis and hence obtain a conviction. If there is
enough evidence against innocence, the court will reject the innocence hypothesis and declare the
defendant guilty. If the prosecution does not present enough evidence to prove the defendant guilty, the
court will find him not guilty. Notice that this does not prove that the defendant is innocent, but merely
that there was not enough evidence to conclude that the defendant was guilty. We use this same type of
reasoning to explain the basic concepts of hypothesis testing. These concepts will be used to test the
population parameters discussed in the previous lesson most particularly involving a single population
mean.

Def. Statistical Hypothesis


A statistical hypothesis is a claim or conjecture that may either be true or false. This claim is usually
expressed in terms of the parameter of the population.

Def. Hypothesis Testing, Null Hypothesis, and Alternative Hypothesis


Hypothesis testing is a decision-making process in which sample data are used to obtain statistical evidence
sufficient to decide whether to reject or not reject the hypothesis under study.

There are two competing hypotheses in hypothesis testing:

1. The Null hypothesis, denoted as 𝐻0 , is a statistical hypothesis that states that there is no
difference (thus a statement of equality is involved) between a parameter and a specific value,
or that there is no difference between two parameters. It is a contradiction of the alternative
hypothesis, and
2. The Alternative hypothesis, denoted as 𝐻𝑎 , is a statistical hypothesis that states the existence of
a difference between a parameter and a specific value, or states that there is a difference between
two parameters. It is generally the hypothesis that the researcher wishes to support.

Remark. In performing a test of hypothesis, the statistical researcher begins by assuming that the null
hypothesis is true. Next, the researcher collects sample data to decide whether the data contradict the null
hypothesis. After doing this, the researcher reaches one of the following conclusions:

• Reject 𝐻0 and conclude that 𝐻𝑎 is true.

• Do not reject 𝐻0 and conclude that there is not enough evidence to support 𝐻𝑎 .

Example 5.1.1 State the null and alternative hypotheses for each conjecture using mathematical and
statement forms.
a) A researcher thinks that if expectant mothers use vitamin pills, the birth weight of the babies will
increase. The average birth weight of the population is 8.6 pounds.
b) An engineer hypothesizes that the mean number of defects can be decreased in a manufacturing
process of USB drives by using robots instead of humans for certain tasks. The mean number of
defective drives per 1000 is 18.
c) A psychologist feels that playing soft music during a test will change the results of the test. The
psychologist is not sure whether the grades will be higher or lower. In the past, the mean of the
scores was 73.

Answers to Example 5.1.1


a) 𝐻0 : 𝜇 = 8.6
𝐻𝑎 : 𝜇 > 8.6

b) 𝐻0 : 𝜇 = 18
𝐻𝑎 : 𝜇 < 18
c) 𝐻0 : 𝜇 = 73
𝐻𝑎 : 𝜇 ≠ 73

Remark. Note that 𝐻0 always contains an equality (=, ≥, or ≤). This denotes that no difference is involved.
Also, we will see later on that it is more convenient to show support for the alternative hypothesis by
presenting evidence against the null hypothesis. Hence, the statistical researcher always begins by assuming that
the null hypothesis 𝐻0 is true.

Example 5.1.2. We wish to show that the average daily wage of carpenters in Metro Manila is different from
600 php, which is the prevailing daily average.

Solution to Example 5.1.2


𝐻0 : 𝜇 = 600
𝐻𝑎 : 𝜇 ≠ 600

Here we would like to reject the null hypothesis, thus concluding that the mean is not equal to 600 php.

Example 5.1.3. The father of a senior high school student lists the expenses he will incur if he sends his
daughter to the university. At the university where he prefers his daughter to study, he hears the average
tuition fee is at least 20,000 php per semester. He wishes to prove that this is not the case.

Solution to Example 5.1.3.

𝐻0 : 𝜇 ≥ 20,000
𝐻𝑎 : 𝜇 < 20,000

In this example the father wishes to reject the null hypothesis, thereby concluding that the average tuition
fee is less than 20,000 php.

𝑯𝟎   |  𝑯𝒂
=    |  ≠, <, or >
≥    |  <
≤    |  >

Table 1. Symbols Used for 𝑯𝟎 and 𝑯𝒂

Remark. There is a difference in the forms of the alternative hypotheses presented in Examples 5.1.2 and
5.1.3. Note that in Example 5.1.2, no directional difference is suggested for the average daily wage of
carpenters in Metro Manila. In other words, the population mean is hypothesized to be either larger or
smaller than 600 php. This type of test is called a two-tailed test of hypothesis. Meanwhile, in Example
5.1.3, it is suggested that the average tuition fee is less than 20,000 php per semester. This type of test is
called a one-tailed test of hypothesis, specifically, a left-tailed one.
Furthermore, it was mentioned previously that the decision to reject or not reject the null hypothesis
depends on the sample drawn from the population. This information takes on two forms: 1) the test
statistic, a single number computed from the sample data; and 2) the 𝑝-value, the probability, computed
assuming that 𝐻0 is true, that the test statistic is as extreme as or more extreme than the observed value.
Either or both of these measures are used as the basis for rejecting or not rejecting the null hypothesis.

As mentioned in the previous remark, a test statistic is a number computed from sample data gathered
for the purpose of hypothesis testing. In Example 5.1.2, a good test statistic is the 𝑧-score computed from
the sample mean and sample standard deviation of a sample of 𝑛 = 100 carpenters. Now, suppose the
sample of size 𝑛 = 100 produced 𝑥̅ = 650 php with a sample standard deviation 𝑠 = 95 php. We then have the
test statistic

$$ z = \frac{650 - 600}{95/\sqrt{100}} \approx 5.26 $$

which means that the sample mean 650 php lies 5.26 standard errors (standard deviations of the sample
mean) above 𝜇 = 600. In hypothesis testing, we ask whether this value calculated from the sample is
“extreme”, which would imply that 𝐻0 should be rejected. How do we then decide whether to reject or not
reject 𝐻0 ?
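
The arithmetic behind this test statistic can be reproduced with a few lines of Python. This is only a sketch: the function name z_statistic and its argument names are illustrative choices, not part of the lecture.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return the z-score of a sample mean under H0 (large-sample case)."""
    standard_error = sample_sd / math.sqrt(n)  # estimated SD of the sample mean
    return (sample_mean - hypothesized_mean) / standard_error

# Figures used for Example 5.1.2 in the text
z = z_statistic(sample_mean=650, hypothesized_mean=600, sample_sd=95, n=100)
print(round(z, 2))  # 5.26
```

Dividing by the standard error rather than by 𝑠 itself is what makes the result a statement about the sample mean and not about a single carpenter's wage.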

The entire set of possible values of the sample mean can be divided into two regions. One set of values supports
the alternative hypothesis (called the rejection region) while the other supports the null hypothesis
(the acceptance region). In Example 5.1.2, the rejection region consists of very small and very large values of the
sample mean, as shown in the figure below.

[Figure: number line for 𝑥̅ centered at 𝜇 = 600, with a rejection region in each tail and the acceptance
region between the two critical values.]

Figure 1: Acceptance and Rejection Regions

If the test statistic falls in the rejection region, the null hypothesis will be rejected. On the other hand, if it falls
in the acceptance region, the null hypothesis is not rejected, or we say that the test is judged to
be inconclusive. Ultimately, how are the critical values that define the acceptance and rejection regions
determined? In other words, how do we decide how much statistical evidence we need in order to reject
𝐻0 ?

This depends on the confidence level that the researcher wants to attach to the test conclusions, and on the
significance level (𝛼), which is the risk that the researcher is willing to take of making an incorrect decision.

Def. Type I Error and the Level of Significance (𝛼)
A Type I Error for a statistical test is an error committed when the null hypothesis is rejected when the
null hypothesis is true. Consequently, the level of significance (𝛼) is

𝛼 = 𝑃(Type I Error) = 𝑃(Reject 𝐻0 given that 𝐻0 is true)

The level of significance (𝛼) describes the maximum tolerable risk the researcher is willing to take of
committing a Type I error. Ideally, 𝛼 should be set as small as possible, as it is a probability of error. And so, for
a two-tailed test, the critical values that separate the number line describing the sample mean into
acceptance and rejection regions are the standard scores that correspond to an area of 𝛼/2 in the left and the
right tail of the distribution. Meanwhile, for a left-tailed test, the critical value is the standard score that
corresponds to an area of 𝛼 in the left tail of the distribution. Lastly, for a right-tailed
test, the critical value is the standard score that corresponds to an area of 𝛼 in the right tail of the distribution. This
is shown in the figures below:

[Figure: standard normal curve with area 𝛼/2 in each tail; the rejection regions lie below 𝑧 = −1.96 and
above 𝑧 = 1.96, with the acceptance region between them.]

Figure 2: Acceptance and Rejection Regions for a Two-Tailed Test with 𝛼 = 0.05

[Figure: standard normal curve with area 𝛼 in the right tail; the rejection region lies above 𝑧 = 1.64.]

Figure 3: Acceptance and Rejection Regions for a Right-Tailed Test with 𝛼 = 0.05

[Figure: standard normal curve with area 𝛼 in the left tail; the rejection region lies below 𝑧 = −1.64.]

Figure 4: Acceptance and Rejection Regions for a Left-Tailed Test with 𝛼 = 0.05
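
The critical values quoted in these figures can be recovered from the inverse of the standard normal distribution. The sketch below uses NormalDist from Python's standard statistics module; the helper name critical_values and the tail labels "two", "left", and "right" are assumptions made for illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z critical value(s) for a 'two', 'left', or 'right' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)    # area alpha/2 in each tail
        return (-z, z)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)      # area alpha in the left tail
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # area alpha in the right tail
    raise ValueError("tail must be 'two', 'left', or 'right'")

for tail in ("two", "left", "right"):
    print(tail, [round(c, 2) for c in critical_values(0.05, tail)])
```

For 𝛼 = 0.05 the one-tailed value is about 1.645, which tables commonly round to 1.64 or 1.65.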

Essential Steps in Hypothesis Testing


1. Formulate the null and alternative hypotheses.
2. Identify the test statistic to use with the given level of significance and distribution; state the
decision rule and specify the rejection region.
3. Using the information based on the sample, compute the value of the test statistic.
4. Make a decision whether to reject or not reject 𝐻0 , and state the conclusion of the test.
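
The four steps above can be sketched as a single Python function for the large-sample 𝑧-test. The function name z_test, its parameters, and the tail labels are hypothetical. The demonstration reuses the figures of Example 5.1.2 with 𝛼 = 0.05, which is an assumed value because that example does not state a significance level.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, sample_sd, n, mu0, alpha, tail):
    """Large-sample z-test of H0: mu = mu0 using the critical-value method."""
    # Step 1: the hypotheses are fixed by mu0 and by the choice of tail.
    # Step 2: find the critical value and define the rejection region.
    std_normal = NormalDist()
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        in_rejection_region = lambda z: abs(z) >= crit
    elif tail == "left":
        crit = std_normal.inv_cdf(alpha)
        in_rejection_region = lambda z: z <= crit
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
        in_rejection_region = lambda z: z >= crit
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    # Step 3: compute the test statistic from the sample.
    z = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
    # Step 4: decide whether to reject H0.
    return z, crit, in_rejection_region(z)

# Example 5.1.2 figures; alpha = 0.05 is assumed for illustration.
z, crit, reject = z_test(650, 95, 100, mu0=600, alpha=0.05, tail="two")
print(f"z = {z:.2f}, critical value = ±{crit:.2f}, reject H0: {reject}")
```

Returning the statistic and the critical value along with the decision keeps the reasoning visible, which matters when the conclusion has to be stated in words.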

Example 5.1.4. Suppose the average monthly earnings of female executive assistants are 40,000 php. Do
men in the same position have average monthly earnings that are lower than those of women? A random
sample of 50 male executive assistants showed 𝑥̅ = 36,500 php and 𝑠 = 7,000 php. Test the hypothesis
using 𝛼 = 0.01.

Solution to Example 5.1.4.


We would like to show that the average monthly earnings of men in the same position are lower than
40,000 php. This brings us to the following null and alternative hypotheses:

Step 1:
𝐻0 : 𝜇 = 40,000
𝐻𝑎 : 𝜇 < 40,000

Step 2: Furthermore, for this left-tailed test, the values of 𝑥̅ much smaller than 40,000 would lead us to
reject 𝐻0 , thereby implying that we accept 𝐻𝑎 . We will also use the 𝑧-distribution for the test statistic as the
sample size is large enough (that is, 𝑛 ≥ 30). Next, given that the level of significance 𝛼 = 0.01, the critical
value that separates the rejection and acceptance region so that the area on the left tail is equivalent to
𝛼 = 0.01 is 𝑧 = −2.33. The null hypothesis will be rejected if the observed value of the test statistic is less
than or equal to 𝑧 = −2.33.
Step 3: Now, using the sample information, with 𝑠 being an estimate of the population standard deviation, we
calculate the observed value of the test statistic as

$$ z = \frac{\bar{x} - 40{,}000}{7{,}000/\sqrt{50}} = \frac{36{,}500 - 40{,}000}{7{,}000/\sqrt{50}} \approx -3.54 $$

Step 4: Since the observed value of the test statistic falls into the rejection region, we reject 𝐻0 and conclude
that the average monthly earnings of male executive assistants are lower than the average monthly
earnings of female executive assistants.

Remark. The whole hypothesis testing process could be shortened. This is demonstrated using the texts in
black color.

Example 5.1.5. The monthly yield of a certain craft distillery has averaged 500 tons over the last twenty-five
years. Given the change in climate, the quality control manager wants to know whether this average has changed in
recent months. She randomly selects 35 months from the computer database and computes the sample
average and the sample standard deviation of the random sample as 𝑥̅ = 492 and 𝑠 = 10, respectively. Test
the appropriate hypothesis at 𝛼 = 0.05.

Solution to Example 5.1.5


In this example, we want to show that the average monthly yield has changed in the recent months. This
means that we want to test the following hypotheses:

Step 1:
𝐻0 : 𝜇 = 500
𝐻𝑎 : 𝜇 ≠ 500

Step 2: Further, for this two-tailed test, the null hypothesis will be rejected if the
value of the test statistic is less than −𝑧𝛼/2 or greater than 𝑧𝛼/2 . Specifically, these 𝑧-scores are 𝑧 = −1.96
and 𝑧 = 1.96.

Step 3: Now, recall that the most appropriate point estimate for 𝜇 is 𝑥̅ . Thus, the observed value of the test
statistic is

$$ z = \frac{\bar{x} - 500}{10/\sqrt{35}} = \frac{492 - 500}{10/\sqrt{35}} \approx -4.73 $$

Step 4: Since the test statistic falls into the rejection region, 𝐻0 is rejected, and the manager should
believe that the average has changed in recent months.

Remark. The probability of rejecting 𝐻0 when 𝐻0 is true (Type I Error) is 𝛼 = 0.05. Note that this is a very
small probability. And so, with regard to the previous problem, the manager should be reasonably
confident that the decision she made is correct.
In addition, when a hypothesis testing process is performed, there are actually four (4) possible outcomes
of the process depending on the actual truth (or falseness) of the null hypothesis, and the decision to reject
it or not. This is summarized in the table below:

Action            |  𝑯𝟎 is actually true   |  𝑯𝟎 is actually false
Reject 𝑯𝟎         |  Type I Error (𝛼)      |  No Error Committed
Do not reject 𝑯𝟎  |  No Error Committed    |  Type II Error (𝛽)

In statistics and probability, we measure the chance of committing each error so that we will have a basis for making
a decision.
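
What 𝛼 = 𝑃(Type I Error) means in practice can be checked by simulation: draw many samples from a population in which 𝐻0 is true and count how often the test rejects anyway. The sketch below is an illustration under stated assumptions. It assumes a normally distributed population and borrows 𝜇 = 500, a standard deviation of 10, and 𝑛 = 35 from Example 5.1.5. The function name type_i_error_rate, the number of trials, and the seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def type_i_error_rate(mu0=500, sigma=10, n=35, alpha=0.05, trials=2000, seed=0):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
        if abs(z) >= crit:
            rejections += 1                    # rejecting a true H0: a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated P(Type I error): {rate:.3f}")
```

The estimate should land near 0.05. It tends to sit slightly above that value because 𝑠 replaces the population standard deviation in the test statistic.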

Lecture 5.2
Hypothesis Testing using the 𝒑-value Method

Lesson Proper

Another way to make a systematic decision whether to accept or reject the null hypothesis is to compare
the preconceived 𝛼 to the 𝑝-value of the observed test statistic.

Def. 𝑝-value
The 𝒑-value (or the observed level of significance of a statistical test) is the actual risk of committing a
Type I error if the null hypothesis is rejected based on the observed value of the test statistic. In other
words, the 𝑝-value is a probability value that denotes the area in the tail beyond the test statistic.

This means that for a right-tailed test, the 𝑝-value is equivalent to the area to the right of the test statistic.
Meanwhile, for a left-tailed test, the 𝑝-value is equivalent to the area to the left of the test statistic. For a
two-tailed test, the 𝑝-value is equivalent to the sum of the areas in the two tails.
Remark. A small 𝑝-value indicates that the observed value of the test statistic lies far from the
hypothesized mean of the population. A small 𝑝-value implies that there is strong evidence that
𝐻0 should be rejected. More precisely, if the 𝑝-value is less than or equal to a pre-assigned level of
significance (𝛼), then 𝐻0 is rejected, and we can report that the results are statistically significant at level 𝛼.

Example 5.2.1. Determine the 𝑝-values for the hypothesis tests shown in Examples 5.1.4 and
5.1.5.

Solution to Example 5.2.1


Example 5.1.4 is a left-tailed test, and the sample data gathered in this test gave an observed test
statistic of −3.54. The corresponding 𝑝-value (the total area to the left of −3.54) is

𝑝-value = 𝑃(𝑧 < −3.54) = 0.0002

(computed in MS Excel using the syntax =NORM.S.DIST(−3.54,TRUE))

Note that the pre-assigned level of significance (𝛼) for Example 5.1.4 is 0.01, and 𝐻0 was rejected because
𝑝-value ≤ 𝛼.

Next, Example 5.1.5 is a two-tailed test, and the sample data gathered in this test gave an observed
test statistic of −4.73. The 𝑝-value of the test statistic is therefore

𝑝-value = 𝑃(𝑧 < −4.73) + 𝑃(𝑧 > 4.73) ≈ 0.0000022

(computed using the syntax =NORM.S.DIST(−4.73,TRUE)+(1−NORM.S.DIST(4.73,TRUE)))

Clearly, 𝐻0 is rejected since the resulting 𝑝-value is less than 0.05 (the given level of significance of the
problem).
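
For readers without a spreadsheet, the two tail areas in this example can be recomputed with the standard normal distribution in Python's statistics module. This is a sketch: the helper name p_value and the tail labels are illustrative assumptions.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Return the p-value of an observed z statistic for the given tail type."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)              # area to the left of z
    if tail == "right":
        return 1 - cdf(z)          # area to the right of z
    if tail == "two":
        return 2 * cdf(-abs(z))    # both tails, using symmetry
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(f"Example 5.1.4: {p_value(-3.54, 'left'):.4f}")
print(f"Example 5.1.5: {p_value(-4.73, 'two'):.7f}")
```

Because the normal curve is symmetric, the two-tailed value is simply twice the one-tailed area beyond |𝑧|.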

Example 5.2.3. Suppose that the standards set by the Department of Health indicate that Filipinos should
not exceed an average daily sodium intake of 3300 milligrams (mg). To find out whether Filipinos are
exceeding this limit, a sample of 100 Filipinos is selected, and the mean and standard deviation of daily
sodium intake are found to be 3400 mg and 1100 mg, respectively. At 𝛼 = 0.05, perform a test of hypothesis
using the 𝑝-value method.

Solution to Example 5.2.3.


Step 1:
𝐻0 : 𝜇 ≤ 3300
𝐻𝑎 : 𝜇 > 3300

Step 2: For this right-tailed test, we reject 𝐻0 if the 𝑝-value is less than or equal to 𝛼.

Step 3:
$$ p\text{-value} = P\left(z > \frac{3400 - 3300}{1100/\sqrt{100}}\right) = P(z > 0.909) \approx 0.1817 $$
(Note that there are other acceptable 𝑝-value answers depending on the value of the test statistic that is
used. For example, if we used 0.91 as the test statistic, then the corresponding 𝑝-value is 0.1814.)

Step 4: The resulting 𝑝-value is greater than the pre-assigned level of significance. And so, it’s clear that
𝐻0 should not be rejected, and we should say that the result is not statistically significant.

We say even further that there is not enough evidence to conclude that the average daily sodium intake of
Filipinos is greater than 3300 milligrams.

Remark. If we are reading a research report, how small should the 𝑝-value be before we decide to reject
𝐻0 ? Researchers use a sliding scale to classify results:
• If the 𝑝-value is less than .01, 𝐻0 is rejected. The results are classified as highly significant.
• If the 𝑝-value is between .01 and .05, 𝐻0 is rejected. The results are classified as statistically
significant.
• If the 𝑝-value is between .05 and .10, 𝐻0 is usually not rejected. The results are classified as only
tending toward statistical significance.
• If the 𝑝-value is greater than .10, 𝐻0 is not rejected. The results are classified as not statistically
significant.
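
The sliding scale above can be written as a small lookup function. This is a sketch only. The function name classify_p_value is hypothetical, and the treatment of 𝑝-values falling exactly on .01, .05, or .10 is an assumption, since the scale does not say which class the boundary values belong to.

```python
def classify_p_value(p):
    """Map a p-value to the verbal label used in the sliding scale."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "highly significant"
    if p <= 0.05:   # assumption: boundary values fall in the lower class
        return "statistically significant"
    if p <= 0.10:
        return "tending toward statistical significance"
    return "not statistically significant"

for p in (0.0002, 0.03, 0.07, 0.1817):
    print(p, "->", classify_p_value(p))
```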

Remark. As in constructing confidence intervals, Student’s 𝑡-distribution is used for the test statistic
if the population standard deviation (𝜎) is unknown and the sample size is small (that is, 𝑛 < 30).

Example 5.2.4. A physician claims that a jogger’s maximal volume oxygen uptake (MVOU) is greater than
the average of all adults. A random sample of 15 joggers is selected, and it is determined that the mean
of the sample is 40.6 milliliters per kilogram (ml/kg) and the standard deviation of the sample is 6 ml/kg.
If other credible data indicate that the average MVOU of all adults is 36.7 ml/kg, is
there enough evidence to support the physician’s claim at 𝛼 = 0.05?

Solution to Example 5.2.4.


Step 1:

𝐻0 : 𝜇 = 36.7
𝐻𝑎 : 𝜇 > 36.7

Step 2: For this right-tailed test, we reject 𝐻0 if the 𝑝-value is less than or equal to 𝛼.

Step 3:
$$ p\text{-value} = P\left(t > \frac{40.6 - 36.7}{6/\sqrt{15}}\right) = P(t > 2.52) = 0.0123 $$
(This is computed using the syntax =1−T.DIST(2.52,14,TRUE) or =T.DIST.RT(2.52,14), where 𝑛 − 1 = 14 is the
number of degrees of freedom.)

Step 4: The resulting 𝑝-value is less than the pre-assigned level of significance. And so, it’s clear that 𝐻0
should be rejected, and we conclude that there is enough evidence to support the claim that the joggers’
maximal volume oxygen uptake is greater than 36.7 ml/kg.
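
Python's standard library has no built-in 𝑡-distribution, so the right-tail area for this example is approximated below by integrating the 𝑡 density numerically with Simpson's rule. This is a sketch under stated assumptions, not the method used in the lecture. The function names t_pdf and t_right_tail and the number of integration intervals are illustrative choices, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, intervals=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (intervals must be even)."""
    upper = abs(t)
    h = upper / intervals
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    tail = 0.5 - area_0_to_t            # area beyond |t| in one tail
    return tail if t >= 0 else 1 - tail

# Example 5.2.4: n = 15 joggers, so df = 14
t = (40.6 - 36.7) / (6 / math.sqrt(15))
p = t_right_tail(t, df=14)
print(f"t = {t:.2f}, p-value = {p:.4f}")
```

The result can be compared with the 𝑝-value quoted in Step 3. A small difference in the last digit is possible because the code uses the unrounded test statistic rather than 2.52.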

Example 5.2.6. A local report stated that, on average, a woman visits her physician 5.8 times a year.
A researcher randomly selected 20 women and obtained the data below.

3 2 1 3 7 2 9 4 6 6
8 0 5 6 4 2 1 3 4 1

At 𝛼 = 0.05, can it be concluded that the average is still 5.8 visits per year? Use the 𝑝-value method.

Solution to Example 5.2.6.


Step 1: Based on the given information, we form the appropriate hypotheses:

𝐻0 : 𝜇 = 5.8
𝐻𝑎 : 𝜇 ≠ 5.8

Step 2: For this two-tailed test, we reject 𝐻0 if the 𝑝-value is less than or equal to 𝛼.

Step 3:
Using MS Excel, we can compute the statistics of the sample: the sample mean is 𝑥̅ = 3.85 and the sample
standard deviation is 𝑠 ≈ 2.52. This gives the test statistic

$$ t = \frac{3.85 - 5.8}{2.52/\sqrt{20}} \approx -3.46 $$

Now, the 𝑝-value is computed as

𝑝-value = 𝑃(𝑡 < −3.46) + 𝑃(𝑡 > 3.46) = 2 ∙ 𝑃(𝑡 > 3.46) = 0.0026
(This is computed using the syntax =2*T.DIST.RT(3.46,19))

Step 4: The resulting 𝑝-value is less than 0.05 (the pre-assigned level of significance). And so, it’s clear that
𝐻0 should be rejected, and we say therefore that there is enough evidence to conclude that the average
visiting frequency of women to their physician is not 5.8 times a year.
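
The sample statistics for this example can also be obtained from the raw data with Python's statistics module instead of a spreadsheet. The variable names below are illustrative, and the data are the 20 values listed in the example.

```python
import math
from statistics import mean, stdev

visits = [3, 2, 1, 3, 7, 2, 9, 4, 6, 6,
          8, 0, 5, 6, 4, 2, 1, 3, 4, 1]

n = len(visits)
x_bar = mean(visits)   # sample mean
s = stdev(visits)      # sample standard deviation (n - 1 divisor)
t = (x_bar - 5.8) / (s / math.sqrt(n))

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.2f}, t = {t:.2f}")
```

Note that stdev uses the 𝑛 − 1 divisor appropriate for a sample. The population version, pstdev, would give a slightly smaller value and a slightly larger |𝑡|.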

Supplementary Exercises

1. A random sample of 100 observations from a quantitative population produced a sample mean of
26.8 and a sample standard deviation of 6.5. Use the 𝑝-value approach to determine whether the
population mean is different from 28. Explain your conclusions.

2. The Old Farmer’s Almanac stated that the average consumption of water per person per day was
123 gallons. To test the hypothesis that this figure may no longer be true, a researcher randomly
selected 16 people and found that they used on average 119 gallons per day and s = 5.3. At alpha
equal to 0.05, is there enough evidence to say that the almanac’s figure might no longer be correct?
Use the 𝑝-value method.

3. Some sports that involve a significant amount of running, jumping, or hopping put participants at
risk for Achilles tendinopathy (AT), an inflammation and thickening of the Achilles tendon. A study
in The American Journal of Sports Medicine looked at the diameter (in mm) of the affected tendons
for patients who participated in these types of sports activities. Suppose that the Achilles tendon
diameters in the general population have a mean of 5.97 millimeters (mm). When the diameters of
the affected tendon were measured for a random sample of 31 patients, the average diameter was
9.80 with a standard deviation of 1.95 mm. Is there sufficient evidence to indicate that the average
diameter of the tendon for patients with AT is greater than 5.97 mm? Test at the 5% level of
significance.

- End of Lecture Transcript -
