Understanding Hypothesis Testing Basics
SCHOOL OF ENGINEERING
SCHOOL OF SYSTEMS
ESTOCASTICA II
Hypothesis Testing
Mérida, 2015
Hypothesis Testing
Type I Error
If you reject the null hypothesis when it is true, you commit a Type I error.
The probability of committing a Type I error is α, the significance level
that you set for your hypothesis test. An α of 0.05 indicates that you are
willing to accept a 5% chance of being wrong when you reject the null
hypothesis. To reduce this risk, use a lower value for α. However, a lower
value for α also makes the test less likely to detect a true difference when
one really exists.
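The reading of α as a long-run error rate can be illustrated with a small simulation. The sketch below is only an illustration under assumed values that do not come from the slides: normally distributed data with known σ = 1, samples of size 30, and a two-tailed z test. The function name `type_i_error_rate` is hypothetical.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Data are generated with mu = 0, so H0: mu = 0 is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / n ** 0.5)  # known sigma = 1 (assumed)
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")  # should be close to alpha
```

Because the null hypothesis is true in every simulated sample, every rejection counted here is a Type I error, and the observed fraction should settle near the chosen α.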
Type II Error
When the null hypothesis is false and you do not reject it, you commit a
Type II error. The probability of committing a Type II error is β, which
depends on the power of the test. You can reduce the risk of a Type II error
by ensuring that the test has sufficient power. To achieve this, make sure
the sample is large enough to detect a practical difference when one really
exists.
The probability of rejecting the null hypothesis when it is false is 1 - β.
This value is the power of the test.
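The link between sample size and power can be made concrete for one simple case. The following is a minimal sketch, assuming a right-tailed z test with known σ; the function name `power_right_tailed_z` and the numbers µ0 = 100, µ1 = 105, σ = 15 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def power_right_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of a right-tailed z test of H0: mu = mu0
    when the true mean is mu1 and sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / n ** 0.5)  # true difference in standard errors
    beta = std_normal.cdf(z_crit - shift)     # P(do not reject H0 | mu = mu1)
    return 1 - beta

# Hypothetical scenario: the power grows as the sample size grows.
for n in (10, 30, 100):
    print(n, round(power_right_tailed_z(mu0=100, mu1=105, sigma=15, n=n), 3))
```

When the true mean equals µ0 the function returns α itself, which is a quick sanity check that β and the power are defined consistently.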
Consider a researcher comparing the effectiveness of two medications, where
the null hypothesis is that they are equally effective.
A Type I error occurs if the researcher rejects the null hypothesis and
concludes that the two medications are different when, in reality, they are
not. If the medications have the same effectiveness, the researcher may
consider this error not very serious, because patients would still receive
the same level of effectiveness regardless of the medication they take.
However, if a Type II error occurs, the researcher fails to reject the null
hypothesis when it should be rejected. That is, the researcher concludes that
the medications are the same when in reality they are different. This error
can put patients' lives at risk if the less effective medication is put on
sale instead of the more effective one.
Figure A shows that the results of a one-tailed Z test are significant if the
test statistic is equal to or greater than 1.64 (more precisely, 1.645), the
critical value in this case. The shaded area represents 5% (α) of the area
under the curve. Figure B shows that the results of a two-tailed Z test are
significant if the absolute value of the test statistic is equal to or
greater than 1.96, the critical value in this case. The two shaded areas
together add up to 5% (α) of the area under the curve.
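Both critical values can be reproduced from the standard normal distribution. This is a small sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

z_one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in the right tail
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail

print(f"one-tailed critical value: {z_one_tailed:.3f}")  # 1.645
print(f"two-tailed critical value: {z_two_tailed:.3f}")  # 1.960
```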
This gives a critical value of 1.83311. If the absolute value of the t statistic is
greater than this critical value, then you can reject the null hypothesis, H0,
at the significance level of 0.10.
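The slides do not state the degrees of freedom behind this value. The figure 1.83311 is consistent with a t distribution with 9 degrees of freedom and 0.05 in each tail (two-tailed α = 0.10), so the sketch below assumes df = 9 and checks the tail area by numerically integrating the t density. The helper names `t_pdf` and `t_cdf` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x] plus 0.5 for the left half."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

df = 9  # assumed: not stated in the slides, inferred from the critical value
upper_tail = 1 - t_cdf(1.83311, df)
print(f"P(T > 1.83311)  = {upper_tail:.4f}")      # about alpha/2 = 0.05
print(f"two-tailed area = {2 * upper_tail:.4f}")  # about alpha = 0.10
```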
Use of an analysis of variance (ANOVA) to calculate a critical value
This gives a critical value of 4.25649. If the F statistic is greater than this
critical value, then you can reject the null hypothesis, H0, at the level of
significance of 0.05.
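The degrees of freedom for this F critical value are not given in the slides. The value 4.25649 is consistent with 2 numerator and 9 denominator degrees of freedom, so the sketch below assumes those. With 2 numerator degrees of freedom the upper-tail area of the F distribution has a closed form, which keeps the check short. The function names are hypothetical.

```python
def f_upper_tail_df1_2(x, df2):
    """P(F > x) for an F distribution with 2 numerator and df2 denominator
    degrees of freedom (closed form valid only for numerator df = 2)."""
    return (1 + 2 * x / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Value whose upper-tail area is alpha (numerator df fixed at 2)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

df2 = 9  # assumed denominator degrees of freedom (not given in the slides)
f_crit = f_critical_df1_2(0.05, df2)
tail = f_upper_tail_df1_2(4.25649, df2)
print(f"critical value: {f_crit:.5f}")
print(f"upper-tail area at 4.25649: {tail:.4f}")
```

For other numerator degrees of freedom this shortcut does not apply, and a statistics library or an F table would be needed instead.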
Generally, the critical region for the alternative hypothesis θ > θ0 lies in
the right tail of the distribution of the test statistic, while the critical
region for the alternative hypothesis θ < θ0 lies in the left tail.
Example of One-Tailed Hypothesis Test
The critical region is divided into two parts, which often have equal
probabilities, one placed in each tail of the distribution of the test
statistic. The alternative hypothesis θ ≠ θ0 states that either θ < θ0 or
θ > θ0.
Example of Two-Tailed Hypothesis Test
P-Value
Interpretation
The null hypothesis is rejected if the p-value associated with the observed
result is less than or equal to the established significance level,
conventionally 0.05 or 0.01. In other words, the p-value is the probability
of obtaining a result at least as extreme as the one observed, assuming that
the null hypothesis is true.
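For a test statistic that follows the standard normal distribution under the null hypothesis, the p-value and the rejection rule can be sketched as follows. The function name `z_p_value`, its `alternative` labels, and the observed value 2.1 are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist

def z_p_value(z, alternative="two-sided"):
    """p-value of an observed z statistic, assuming it is standard normal under H0."""
    std_normal = NormalDist()
    if alternative == "greater":    # H1: theta > theta0, right tail
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # H1: theta < theta0, left tail
        return std_normal.cdf(z)
    if alternative == "two-sided":  # H1: theta != theta0, both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

z_observed = 2.1  # hypothetical value, for illustration only
alpha = 0.05
p = z_p_value(z_observed)
print(f"p = {p:.4f}; reject H0: {p <= alpha}")
```

Comparing the p-value with α gives the same decision as comparing the statistic with the critical value; the p-value additionally shows how far inside or outside the critical region the result lies.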
If the p-value is lower than the significance level, the observed data would
be unlikely if the null hypothesis were true, which suggests that the null
hypothesis is false. However, it is also possible that we are facing an
atypical observation, in which case we would be making the statistical error
of rejecting the null hypothesis when it is true, simply because we had the
misfortune of drawing an atypical sample. The risk of this type of error can
be reduced by lowering the significance level: a level of 0.05 is common in
sociological research, while a level of 0.01 is used in medical research,
where a mistake can have more serious consequences. The error can also be
addressed by increasing the sample size, which reduces the chance that the
data obtained are unusual purely by chance.
It is important to emphasize that a test of a null hypothesis does not allow
a hypothesis to be accepted; it simply rejects it or fails to reject it. That
is, it labels the hypothesis as plausible (which does not necessarily mean
that it is true, only that it is more likely to be so) or implausible.
Exercises
The probability that the sample mean exceeds 20.75 minutes purely by chance
is calculated as follows:
z = (20.75 - µ0) / (σ/√n), with µ0 = 20
With this abscissa, the probability (area to the right) comes to 0.0304.
[Figure: area to the right of 20.75 under the distribution of the sample mean]
Now suppose that the actual mean drying time is µ = 21 min. Then the
probability of obtaining a sample mean less than or equal to 20.75 (and thus
of wrongly accepting the null hypothesis) is given by:
z = (20.75 - 21) / (σ/√n)
which leads to an area (to the left) of 0.2660. That is, the probability of
wrongly accepting µ = 20 (when in fact µ = 21) is 26.6%.
[Figure: area to the left of 20.75 under the distribution with µ = 21]
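The two probabilities in this exercise can be checked numerically. The slides as extracted do not give σ or n; both stated results (0.0304 and 0.2660) are consistent with a standard error σ/√n of 0.4 minutes, so the sketch below assumes that value.

```python
from statistics import NormalDist

std_error = 0.4  # assumed sigma / sqrt(n); inferred, not stated in the slides
cutoff = 20.75   # H0: mu = 20 is rejected when the sample mean exceeds this

# Type I error: sample mean above the cutoff although mu = 20
alpha = 1 - NormalDist(mu=20, sigma=std_error).cdf(cutoff)

# Type II error: sample mean at or below the cutoff although mu = 21
beta = NormalDist(mu=21, sigma=std_error).cdf(cutoff)

print(f"alpha = {alpha:.4f}")  # 0.0304
print(f"beta  = {beta:.4f}")   # 0.2660
```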
The mean lifetime of a sample of 100 fluorescent tubes produced by a company
is 1570 hours, with a standard deviation of 120 hours. If µ is the mean
lifetime of all the tubes produced by the company, test the hypothesis
µ = 1600 hours against the alternative hypothesis µ ≠ 1600 hours at a
significance level of 0.05.
Significance level: α = 0.05.
The test statistic is z = (x̄ - µ0) / (s/√n).
Zα/2 is the value such that the area under the normal curve to its right is
α/2, and -Zα/2 is the value such that the area under the normal curve to its
left is α/2. These two values define the acceptance and rejection regions for
the null hypothesis. Depending on where the value of z calculated from the
expression above falls, the null hypothesis is accepted or rejected.
We calculate:
z = (1570 - 1600) / (120/√100) = -30/12 = -2.5
Since -2.5 < -Z0.025 = -1.96, the null hypothesis is rejected: the mean
lifetime of the tubes differs significantly from 1600 hours, and the sample
indicates that it is lower. The sample mean falls outside the acceptance
region:
[Figure: sample mean outside the acceptance region]
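The calculation for the fluorescent tubes can be reproduced with a few lines, using the figures given in the exercise. This is a sketch of the two-tailed z test described above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, s = 100, 1570, 120  # data from the exercise
mu0, alpha = 1600, 0.05

z = (sample_mean - mu0) / (s / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) >= z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
# z = -2.50, critical value = 1.96, reject H0: True
```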
In general, the following table summarizes the different tests of the null
hypothesis µ = µ0 that can be performed on a mean:
A transport company distrusts the claim that the mean useful life of certain
tires is at least 28000. To check it, 40 tires are fitted to trucks and a
mean useful life of 27463 is obtained, with S = 1348. What can be concluded
from these data if the probability of a Type I error is to be at most 0.01?
We calculate:
z = (27463 - 28000) / (1348/√40) ≈ -2.52
Since -2.52 < -Z0.01 = -2.33, the null hypothesis is rejected: the useful
life of the tires is significantly lower than 28000. The sample mean falls
outside the acceptance region:
[Figure: sample mean outside the acceptance region]
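The decision rules for the three possible alternative hypotheses about a mean can be collected in one helper and applied to the tire data. This is a minimal sketch for large samples; the function name `z_test_mean` and its `alternative` labels are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, s, n, mu0, alpha, alternative):
    """Return (z, reject) for a large-sample z test of H0: mu = mu0."""
    z = (sample_mean - mu0) / (s / sqrt(n))
    std_normal = NormalDist()
    if alternative == "less":         # H1: mu < mu0, left tail
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":    # H1: mu > mu0, right tail
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":  # H1: mu != mu0, both tails
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, reject

# Tire exercise: H0: mu = 28000 against H1: mu < 28000 with alpha = 0.01
z, reject = z_test_mean(27463, 1348, 40, 28000, 0.01, "less")
print(f"z = {z:.2f}, reject H0: {reject}")  # z = -2.52, reject H0: True
```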
4) The mean lifetime of the light bulbs produced by a company has in the past
been 1120 hours, with a standard deviation of 125 hours. A sample of 8 light
bulbs from current production had a mean lifetime of 1070 hours. Test the
hypothesis µ = 1120 hours against the alternative hypothesis µ < 1120 hours
using a significance level of α = 0.05.
We calculate:
t = (1070 - 1120) / (125/√8) ≈ -1.131
Since -1.131 > -t0.05 = -1.895 (7 degrees of freedom), the null hypothesis is
not rejected: the data do not show that the mean lifetime of the light bulbs
is significantly lower than 1120 hours. The sample mean falls within the
acceptance region:
[Figure: sample mean inside the acceptance region]
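The small-sample calculation for the light bulbs can be sketched in the same way. The critical value 1.895 is the standard t-table entry for 7 degrees of freedom with 0.05 in one tail; it is hard-coded here as an assumption rather than computed. Following the exercise, the standard deviation of 125 hours is used in the t statistic.

```python
from math import sqrt

n, sample_mean, s = 8, 1070, 125  # data from the exercise
mu0 = 1120
t_crit = 1.895  # t-table value: alpha = 0.05 in one tail, n - 1 = 7 df

t = (sample_mean - mu0) / (s / sqrt(n))
reject = t <= -t_crit  # left-tailed test, H1: mu < 1120

print(f"t = {t:.3f}, reject H0: {reject}")  # t = -1.131, reject H0: False
```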