Lecture Note
Introductory Statistics (Stat 2161)
December 15, 2025
Chapter 7
Sampling and sampling distribution
Sampling
Most researchers reach the conclusions of a study by
examining a small sample drawn from a much larger
population or universe.
To draw conclusions about a population from a sample, the
sample must meet two major requirements:
1 the sample size should be adequately large.
2 the sample has to be selected appropriately so that it is
representative of the population.
Sampling techniques are concerned with the selection of a
representative sample, especially for the purposes of
statistical inference.
Some definitions:
Target population (reference population): is the
population about which an investigator wishes to draw a
conclusion.
Sampled population (population sampled): a population
from which the actual sample was drawn and about which a
conclusion can be made.
Sampling unit: the ultimate unit to be sampled or
elements of the population to be sampled.
Sampling frame: is the list of all elements in a population.
Sampling errors: errors that arise from drawing
inferences about the population on the basis of only a few
observations. A sampling error is thus the discrepancy
between the population value and the sample value.
They are involved in the collection, processing and analysis
of data.
They may arise from inappropriate sampling techniques.
Non-sampling errors: errors that arise at the stages
of observation, compilation and analysis of data.
They can occur in both sample surveys and complete
enumeration (census) surveys. Thus, a sample survey
is subject to both sampling errors and non-sampling
errors.
Non-sampling errors can occur at every stage of planning and
execution because of faulty planning, response errors by
respondents, compilation errors, etc.
Reasons for Sampling:
1 Reduced cost; Greater speed; Greater accuracy
2 Greater scope
3 Avoids destructive test
Sometimes taking a census makes more sense than
using a sample. Some of the reasons include:
Universality; Qualitativeness
Detailedness; Non-representativeness
Methods of sampling/Sampling techniques
Sampling can be classified into two categories, namely,
probability sampling and non-probability sampling.
Probability sampling: a method of sampling in which
every element in the population has a known (pre-assigned),
non-zero probability of being included in the sample. That
is, sampling units are selected on the basis of chance.
Non-probability sampling: a sampling technique in
which the choice of individuals for a sample depends on
convenience, personal choice or interest.
The most common examples of probability sampling
are simple random sampling, stratified random
sampling, cluster sampling, systematic sampling and
multistage sampling. Judgment sampling, convenience
sampling and quota sampling are examples of
non-probability sampling.
Probability Sampling
Simple random sampling (SRS)
In simple random sampling, each unit in the population has
an equal chance of being selected into the sample.
There are two types: SRS with replacement and SRS
without replacement.
In SRS with replacement, the selected unit is returned
to the population and again has a chance of being
selected. In SRS without replacement, which is the
usual method in medical research, the selected unit is not
put back in the population, and hence the population size
reduces by one at each selection.
Random samples can be drawn by the lottery method or by
using random number tables (Reading Assignment).
It is applied when the population is homogeneous.
Stratified random sampling
It is preferred when the population is heterogeneous with
respect to the characteristic under study.
In this method, the complete population is divided into
homogeneous subgroups called "strata", and a stratified
sample is then obtained by independently selecting a separate
simple random sample from each population stratum.
Some of the criteria for dividing a population into strata
are: Sex (male, female); Age (under 18, 18 to 28, 29 to 39);
Occupation (blue-collar, professional, other).
Random samples taken within a stratum will have much
less variability than a random sample taken across all
strata. This is true because sample units within each
stratum tend to have characteristics that are similar.
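The stratified procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stratified_sample`, the equal allocation of `n_per_stratum` units to every stratum, and the toy population of (id, sex) pairs are assumptions made for the example, not part of the notes.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Draw an independent SRS (without replacement) from each stratum.

    stratum_of is a function mapping a unit to its stratum label.
    Equal allocation per stratum is an assumption of this sketch.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = {}
    for label, members in strata.items():
        k = min(n_per_stratum, len(members))  # cannot draw more than the stratum holds
        sample[label] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 units, stratified by sex
population = [(i, "male" if i % 2 == 0 else "female") for i in range(1, 101)]
sample = stratified_sample(population, stratum_of=lambda u: u[1],
                           n_per_stratum=5, seed=1)
```

Other allocation rules (for example, proportional to stratum size) would change only how `k` is chosen for each stratum.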
Systematic Random Sampling
Systematic sampling is a commonly employed technique
when a complete and up-to-date list of sampling units is
available.
A systematic random sample is obtained by selecting one
unit on a random basis and then choosing additional units
at evenly spaced intervals until the desired sample size
is obtained.
Let N = population size, n = sample size, and k = N/n the
sampling interval.
Then choose a number at random between 1 and k. Suppose
the randomly chosen number is j (1 ≤ j ≤ k).
The j-th unit is selected first, followed by the (j + k)-th,
(j + 2k)-th, (j + 3k)-th, etc., until the required sample
size is reached.
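The selection rule above (random start j, then every k-th unit) can be sketched as follows. The function name `systematic_sample` and the frame of 100 numbered units are hypothetical, and the sketch assumes k is taken as the integer part of N/n when N is not a multiple of n.

```python
import random

def systematic_sample(units, n, seed=None):
    """Select units j, j+k, j+2k, ... (1-based) with a random start j in 1..k."""
    N = len(units)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    rng = random.Random(seed)
    k = N // n                 # sampling interval (integer part of N/n)
    j = rng.randint(1, k)      # random start, 1 <= j <= k
    return [units[j - 1 + i * k] for i in range(n)]

# Hypothetical sampling frame: units numbered 1..100, sample of size 10 (k = 10)
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=42)
```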
Cluster sampling
A cluster sample is obtained by selecting clusters from the
population by simple random sampling and then including
every unit in the selected clusters in the sample.
Clusters are formed by grouping units on the basis of their
geographical location. Hence, elements within a cluster are
heterogeneous.
The advantage of cluster sampling is that a complete
sampling frame is not required; in practice, where complete
lists are rarely available, cluster sampling is suitable.
Multistage Sampling
In this method, the whole population is divided into
first-stage sampling units, from which a random sample is
selected. Each selected first-stage unit is then subdivided
into second-stage units, from which another sample is
selected. Third- and fourth-stage sampling is done in the
same manner if necessary. For example, in an urban survey
in a state, a sample of towns may be taken first, and then in
each of the selected towns a second-stage sample of
households may be taken.
Sampling distribution
The distribution of all possible values that can be assumed
by some statistic, computed from samples of the same size
randomly drawn from the same population, is called the
sampling distribution of that statistic.
Sampling distributions may be constructed empirically
when sampling from a discrete, finite population.
To construct a sampling distribution we proceed as follows:
From a finite population of size N, randomly draw all
possible samples of size n.
Compute the statistic of interest for each sample.
List in one column the different distinct observed values of
the statistic, and in another column list the corresponding
frequency/probability of occurrence of each distinct
observed value of the statistic.
There are commonly three properties of interest for a given
sampling distribution.
Its Mean
Its Variance
Its Functional form.
Example: Suppose we have a population of size N = 5,
consisting of the ages of five children: 6, 8, 10, 12, and 14.
Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{6 + 8 + 10 + 12 + 14}{5} = 10$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = 8$
Take samples of size 2 with replacement and construct the
sampling distribution of the sample mean.
Solution: Since the sampling is with replacement, there are
$N^n = 5^2 = 25$ possible samples of size 2. The possible
samples and the corresponding sample means are presented in
parentheses and square brackets, respectively, as follows
(rows: first draw; columns: second draw).
        6             8             10             12             14
6       (6,6)[6]      (6,8)[7]      (6,10)[8]      (6,12)[9]      (6,14)[10]
8       (8,6)[7]      (8,8)[8]      (8,10)[9]      (8,12)[10]     (8,14)[11]
10      (10,6)[8]     (10,8)[9]     (10,10)[10]    (10,12)[11]    (10,14)[12]
12      (12,6)[9]     (12,8)[10]    (12,10)[11]    (12,12)[12]    (12,14)[13]
14      (14,6)[10]    (14,8)[11]    (14,10)[12]    (14,12)[13]    (14,14)[14]
Therefore, the sampling distribution of the mean is
constructed by listing the distinct values of the sample mean
together with their probability (relative frequency) of
occurrence, as follows.
Sample mean (x̄)    6      7      8      9      10     11     12     13     14
P(X̄ = x̄)           1/25   2/25   3/25   4/25   5/25   4/25   3/25   2/25   1/25
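The construction above can be reproduced by brute-force enumeration. The following is a minimal sketch using only the Python standard library; it assumes every ordered sample is equally likely, which is what sampling with replacement from the five ages implies. Exact fractions are used so the probabilities can be compared with the table directly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [6, 8, 10, 12, 14]
n = 2

# All ordered samples of size n drawn with replacement: N**n = 25 of them
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Relative frequency of each distinct sample mean
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in distribution.items():
    print(f"mean = {m}, probability = {p}")
```

Note that `Fraction` prints probabilities in lowest terms, so 5/25 is displayed as 1/5.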
We are usually interested in the functional form of a
sampling distribution, its mean, and its variance. To
illustrate these characteristics, let us again consider the
sampling distribution of the sample mean (X̄).
Mean:
$E(\bar{X}) = \sum_i \bar{x}_i \, P(\bar{X} = \bar{x}_i) = \frac{1}{25}(6) + \frac{2}{25}(7) + \dots + \frac{1}{25}(14) = 10$
$\Rightarrow \mu_{\bar{X}} = E(\bar{X}) = \mu$
Variance:
$Var(\bar{X}) = E(\bar{X} - \mu)^2 = \sum_i (\bar{x}_i - \mu)^2 \, P(\bar{X} = \bar{x}_i) = \frac{1}{25}(6 - 10)^2 + \frac{2}{25}(7 - 10)^2 + \dots + \frac{1}{25}(14 - 10)^2 = 4 = \frac{8}{2}$
$\Rightarrow Var(\bar{X}) = \sigma^2 / n$
Functional form: the distribution of the sample mean can be
plotted as a histogram, alongside the distribution of the
population.
[Figure: histograms of the population distribution and of the sampling distribution of the sample mean]
From the plots we can observe that the parent population is
uniformly distributed, while the sampling distribution of the
mean gradually rises to a peak and then drops off with perfect
symmetry.
Remarks:
In either case (i.e., sampling with or without replacement),
the sample mean is an unbiased estimator of the population
mean:
$E(\bar{X}) = \mu$
For sampling with replacement: $Var(\bar{X}) = \sigma^2 / n$
For sampling without replacement: $Var(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$
Note:
The square root of the variance of the sampling distribution
is called the standard error of the mean or, simply, the
standard error.
When sampling is from an infinite (or very large) population,
the standard errors under sampling with and without
replacement are close to each other, because (N − n)/(N − 1)
approaches 1.
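Both variance formulas in the remarks can be checked numerically on the five-children population (6, 8, 10, 12, 14) with n = 2. The sketch below enumerates every ordered sample, with and without replacement, and assumes each sample is equally likely; the helper name `mean_and_variance_of_sample_mean` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

population = [6, 8, 10, 12, 14]
N, n = len(population), 2

mu = Fraction(sum(population), N)                        # population mean, 10
sigma2 = sum((x - mu) ** 2 for x in population) / N      # population variance, 8

def mean_and_variance_of_sample_mean(samples):
    """Mean and variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    m = sum(means) / len(means)
    v = sum((x - m) ** 2 for x in means) / len(means)
    return m, v

# With replacement: 25 ordered samples; without replacement: 20 ordered samples
with_repl = mean_and_variance_of_sample_mean(list(product(population, repeat=n)))
without_repl = mean_and_variance_of_sample_mean(list(permutations(population, n)))

print("with replacement:   ", with_repl)      # variance should equal sigma2 / n
print("without replacement:", without_repl)   # variance should equal sigma2/n * (N-n)/(N-1)
```

For this population the without-replacement variance is smaller by the factor (N − n)/(N − 1) = 3/4.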
When sampling is from a normally distributed population,
the distribution of the sample mean will also be normal
with mean µ and variance σ²/n. That is,
$X \sim N(\mu, \sigma^2) \Rightarrow \bar{X} \sim N(\mu, \sigma^2 / n)$
$\Rightarrow Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$
Example: If the uric acid values in normal adult males are
approximately normally distributed with mean 5.7 mg and
standard deviation 1 mg, find the probability that a sample
of size 9 will yield a mean: (1) greater than 6; (2) between 5
and 6; (3) less than 5.2.
Solution: Let X be the amount of uric acid in normal adult
males. Thus, X ∼ N(µ, σ²) = N(5.7, 1), and with n = 9 the
standard error of the mean is σ/√n = 1/√9.
1 $P(\bar{X} > 6) = P\left(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} > \frac{6 - 5.7}{1 / \sqrt{9}}\right) = P(Z > 0.9) = 0.1841$
2 $P(5 < \bar{X} < 6) = P\left(\frac{5 - 5.7}{1 / \sqrt{9}} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{6 - 5.7}{1 / \sqrt{9}}\right) = P(-2.1 < Z < 0.9) = 0.7981$
3 $P(\bar{X} < 5.2) = P\left(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{5.2 - 5.7}{1 / \sqrt{9}}\right) = P(Z < -1.5) = 0.0668$
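The three probabilities can be checked without a normal table. The sketch below uses `statistics.NormalDist` from the Python standard library and models the sample mean directly as a normal variable with mean 5.7 and standard error 1/√9; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 5.7, 1.0, 9

# Sampling distribution of the mean: N(mu, sigma^2 / n)
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_greater_6 = 1 - xbar_dist.cdf(6)                    # (1) P(mean > 6)
p_between_5_6 = xbar_dist.cdf(6) - xbar_dist.cdf(5)   # (2) P(5 < mean < 6)
p_less_5_2 = xbar_dist.cdf(5.2)                       # (3) P(mean < 5.2)

print(round(p_greater_6, 4), round(p_between_5_6, 4), round(p_less_5_2, 4))
```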
Central Limit Theorem
Given a population of any non-normal functional form with
mean µ and finite variance σ², the sampling distribution of X̄
computed from samples of size n from this population will
have mean µ and variance σ²/n and will be approximately
normally distributed when the sample size is large. That is,
if n is large, then $\bar{X} \sim N(\mu, \sigma^2 / n)$ approximately.
Example: If the mean and standard deviation of serum
iron values for healthy men are 120 and 15 micrograms per
100 ml, respectively, what is the probability that a random
sample of 50 normal men will yield a mean between 115
and 125 micrograms per 100 ml?
Solution: The functional form of the population of serum
iron values is not specified, but since the sample size is
greater than 30, we make use of the central limit theorem.
Thus,
$P(115 < \bar{X} < 125) = P\left(\frac{115 - 120}{15 / \sqrt{50}} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{125 - 120}{15 / \sqrt{50}}\right) = P(-2.36 < Z < 2.36) = 0.9818$
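As a numerical check of this example, the sketch below applies the normal approximation with standard error σ/√n = 15/√50. It relies on `statistics.NormalDist` from the standard library; the small difference from the table-based answer comes only from rounding z to 2.36.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 120, 15, 50
se = sigma / sqrt(n)                  # standard error of the mean, 15 / sqrt(50)

z = (125 - mu) / se                   # about 2.36; the lower limit gives -z
xbar_dist = NormalDist(mu, se)        # CLT approximation to the sampling distribution
prob = xbar_dist.cdf(125) - xbar_dist.cdf(115)

print(round(z, 2), round(prob, 4))
```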
Chapter 8
Inferential Statistics
Inference is the process of making interpretations or
drawing conclusions from sample data to the population.
Any researcher collects information with the aim to draw
valid conclusions regarding the research question.
In statistics, inference can be done in two ways:
Statistical estimation
Statistical hypothesis testing.
Statistical Estimation
Estimation: is one way of making inference about the
population parameter where the investigator does not have
any prior notion about values or characteristics of the
population parameter.
There are two methods of making estimation: Point
Estimation and Interval Estimation.
Point Estimation: is a procedure that results in a single
value as an estimate for a parameter.
Interval estimation: is the procedure that results in the
interval of values as an estimate for a parameter.
It deals with identifying the upper and lower limits of a
parameter.
Estimator vs. estimate
Estimator: the rule or random variable that is used to
approximate a population parameter.
Estimate: a particular value that an estimator takes for a
given sample. Thus, a point estimate is a specific
numerical value used as an estimate of a parameter.
Properties of a Good Estimator:
The estimator should be an unbiased estimator. That is,
the expected value or the mean of the estimates obtained
from samples of a given size is equal to the parameter being
estimated.
The estimator should be consistent. For a consistent
estimator, as sample size increases, the value of the
estimator approaches the value of the parameter estimated.
The estimator should be a relatively efficient estimator.
That is, of all the statistics that can be used to estimate a
parameter, the relatively efficient estimator has the smallest
variance.
Point estimation of the population mean: µ
The sample mean is a better estimator of the population
mean than the sample median or sample mode.
That is, $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ is a point estimator of the
population mean µ.
Example: An investigator is interested in finding out the
mean duration of hospital stay by patients undergoing
cesarean section. Ideally the investigator should go through
the case details of all patients who have undergone
cesarean section. But the investigator decides to examine a
sample of these patients from which he computes the
average duration of hospital stay.
Interval Estimation
Due to sampling error, there may be some doubt on the
accuracy of point estimates as there is no way of knowing
how close a particular point estimate is to the population
mean.
Consequently, statisticians prefer another type of estimate,
called an interval estimate.
An interval estimate of a parameter is an interval or a
range of values used to estimate the parameter. This
estimate may or may not contain the value of the parameter
being estimated.
The confidence level of an interval estimate of a
parameter is the probability that the interval estimate will
contain the parameter, assuming that a large number of
samples are selected and that the estimation process on the
same parameter is repeated.
A confidence interval is a specific interval estimate of a
parameter determined by using data obtained from a
sample and by using the specific confidence level of the
estimate.
Interval Estimation of Mean
To calculate confidence interval we make use of the
knowledge of sampling distributions.
Assumption: Either the population is normally
distributed or n ≥ 30.
In constructing a confidence interval, we distinguish
whether the variance (σ²) of the population from which the
sample is taken is known or unknown.
Case 1: σ² is known
A 100(1 − α)% confidence interval for the mean:
$\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Example: A dentist wished to estimate, with 95%
confidence, the mean marginal displacement in the teeth
produced by applying a particular treatment modality.
He assumes that the marginal displacement values are
normally distributed, and after studying 100 people he finds
a sample mean of 6.2 units. The population variance is 9
units.
Solution: x̄ = 6.2, σ = √9 = 3, n = 100 and $z_{\alpha/2} = 1.96$.
Thus, a 95% confidence interval for µ is:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 6.2 \pm 1.96 \cdot \frac{3}{\sqrt{100}} = (5.61, 6.79)$
Interpretation: We are 95% confident that the population
mean marginal displacement lies between 5.61 and 6.79
units. That is, in repeated sampling from the study
population, about 95% of intervals constructed in this way
would contain the population mean.
Case 2: σ² is unknown
When the sample size is large (i.e., n ≥ 30), we use the sample
standard deviation S in place of the unknown population
standard deviation, even if the population is not normal,
by virtue of the central limit theorem. That is,
$\bar{X} - Z_{\alpha/2} \frac{S}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2} \frac{S}{\sqrt{n}}$
Example: Suppose a researcher interested in the serum TSH of
healthy adult females studied 100 subjects and found
that the mean serum TSH was 2 units with a standard
deviation (SD) of 0.2. Calculate a 95% confidence interval.
Solution: x̄ = 2, s = 0.2, n = 100 and $z_{\alpha/2} = 1.96$.
Thus, a 95% confidence interval for the population mean is given by:
$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}} = 2 \pm 1.96 \cdot \frac{0.2}{\sqrt{100}} = (1.9608, 2.0392)$
Interpretation: We are 95% confident that the true mean serum
TSH in the total population (of millions of healthy adult
females) lies between 1.9608 and 2.0392.
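Both large-sample intervals worked above (the dentist example with known σ and the TSH example with S in place of σ) follow the same arithmetic, which can be sketched as a small helper. The function name `z_confidence_interval` is hypothetical; the critical value is obtained from `statistics.NormalDist().inv_cdf` rather than read from a table, so it is 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sd, n, level=0.95):
    """Interval xbar +/- z_{alpha/2} * sd / sqrt(n), with alpha = 1 - level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # upper alpha/2 point of N(0, 1)
    margin = z * sd / sqrt(n)
    return xbar - margin, xbar + margin

dentist = z_confidence_interval(6.2, 3, 100)    # known sigma = 3
tsh = z_confidence_interval(2, 0.2, 100)        # sample SD s = 0.2, large n

print([round(v, 2) for v in dentist])
print([round(v, 4) for v in tsh])
```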
When the sample size is small (less than 30) but the
population is normal, the procedure remains the same except
that we use the t distribution with n − 1 degrees of freedom
instead of the standard normal distribution (z).
The confidence interval for the population mean in such cases
is given by:
$\bar{X} - t_{\alpha/2;\,n-1} \frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2;\,n-1} \frac{S}{\sqrt{n}}$
Note: If the population is not normal and the sample size
is small (i.e., n < 30), then we cannot construct the interval
because the assumption is not met.
Hypothesis testing
Hypothesis testing is also one way of making inference
about population parameter, where the investigator has
prior notion about the value of the parameter.
The researcher states a hypothesis to be tested,
formulates an analysis plan, analyzes sample data
according to the plan, and rejects or does not reject the
hypothesis based on the results of the analysis.
A statistical hypothesis is a conjecture about a population
parameter. This conjecture may or may not be true.
There are two types of statistical hypotheses: null
hypothesis and alternative hypothesis.
Null hypothesis (H0 )
It is the hypothesis to be tested.
It is the hypothesis that often states "there is no difference
between a parameter and a specific value", or that "there is
no difference between two parameters".
Alternative hypothesis (H1 or Ha )
It is the hypothesis available when the null hypothesis has
to be rejected.
It is the hypothesis that states the existence of a difference
between a parameter and a specific value, or states that
there is a difference between two parameters.
A statistical test uses the data obtained from a sample to
make a decision about whether the null hypothesis should
be rejected.
The numerical value obtained from a statistical test is
called the test statistic value.
Example: Suppose we are interested to study the effect of
a new drug in reducing cholesterol levels.
The research question is converted into a formal
scientific hypothesis, which has two parts: the null
hypothesis and the alternative hypothesis.
In a setting where two treatments (new drug and
placebo) are administered to two different samples, the null
hypothesis would be that there is no difference between
cholesterol levels in the two groups, i.e., "Persons treated
with the new drug will have the same cholesterol levels as
persons not treated with the new drug".
If the null hypothesis is rejected, then the hypothesis that
is accepted is called the "alternative hypothesis".
Thus, the alternative hypothesis would be phrased as,
"Persons treated with the new drug have different (higher or
lower) cholesterol levels than persons not treated with the
new drug".
Types and size of errors:
In reality, the null hypothesis may or may not be true,
and the decision to reject or not reject it is made on the
basis of sample data, which may involve sampling and
non-sampling errors.
In hypothesis testing, there are four possible outcomes, as
shown below:
                        Decision
Truth           Reject H0           Do not reject H0
H0 is true      Type I error        Correct decision
H1 is true      Correct decision    Type II error
A type I error occurs when you reject a true null
hypothesis.
A type II error occurs when you fail to reject a false
null hypothesis.
Exercise: For each of the following situations, identify the type
I and type II errors and the correct actions.
H0: "A new treatment is not more effective than the traditional
one".
Adopt the new treatment when the new one is more
effective.
Continue with the traditional treatment when the new one
is more effective.
Continue with the traditional treatment when the new one
is not more effective.
Adopt the new treatment when the new one is not more
effective.
The level of significance, denoted by α, is the maximum
probability of committing a type I error, i.e.,
P(Type I error) ≤ α.
The probability of making a type II error is often denoted
by β, i.e., P(Type II error) = β.
It is natural to aim first for a test whose type I and type II
error probabilities are both minimal. However, for a fixed
sample size, the type I and type II error probabilities are
inversely related and therefore cannot be minimized at the
same time.
The most powerful test is a test that fixes the level of
significance and minimizes the probability of a type II error.
The power of a test is defined as the probability of rejecting
the null hypothesis when it is actually false, i.e.,
power = 1 − β.
Note: Type I error is often considered to be more serious,
and therefore more important to avoid, than a type II error.
Steps in hypothesis testing:
State hypotheses.
Select test statistics.
Determine distribution of test statistics.
State decision rule or obtain the critical/table value.
Calculate test statistics
Make statistical decision (Reject or do not reject H0 ) by
comparing test statistic value and critical value.
Draw conclusion
Testing a single population mean (µ):
Hypothesis:
H0: µ = µ0 vs. one of
(1) H1: µ ≠ µ0
(2) H1: µ > µ0
(3) H1: µ < µ0
Note: (1) is for a two-sided test, whereas (2) and (3) are for
a one-sided test.
Here we consider two situations about a population mean:
When population variance is known (σ 2 is known)
When population variance is unknown.
Situation 1: When the population variance is known, the
test statistic is:
$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim N(0, 1)$
Note: The sample size should be large (i.e., n ≥ 30) when
sampling is not from a normally distributed population.
The decision rule depends on the type of test (i.e., one-sided
or two-sided) and is summarized as follows.
Alternative     Reject H0             Do not reject H0      Inconclusive
H1: µ ≠ µ0      |z_cal| > Z_{α/2}     |z_cal| < Z_{α/2}     |z_cal| = Z_{α/2}
H1: µ > µ0      z_cal > Z_α           z_cal < Z_α           z_cal = Z_α
H1: µ < µ0      z_cal < −Z_α          z_cal > −Z_α          z_cal = −Z_α
Situation 2: When the population variance is unknown
In this case, we use the sample standard deviation (s) as an
estimate of the population standard deviation (σ). However,
this adds another element of uncertainty to our inference.
To account for the additional uncertainty that comes from
estimating the population standard deviation, we use a
modification of the Z distribution called the t-distribution.
Note: t distributions are similar to the z distribution but have
broader tails and are less peaked at the center. As n increases,
the t distribution approaches the normal distribution.
Thus, the test statistic is:
$T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t_{(n-1)}$
The decision rule again depends on the type of test and is
summarized as follows.
Alternative     Reject H0                  Do not reject H0           Inconclusive
H1: µ ≠ µ0      |t_cal| > t_{α/2; n−1}     |t_cal| < t_{α/2; n−1}     |t_cal| = t_{α/2; n−1}
H1: µ > µ0      t_cal > t_{α; n−1}         t_cal < t_{α; n−1}         t_cal = t_{α; n−1}
H1: µ < µ0      t_cal < −t_{α; n−1}        t_cal > −t_{α; n−1}        t_cal = −t_{α; n−1}
Note: When the sample size is large (i.e., n ≥ 30),
$T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim t_{(n-1)} \approx N(0, 1)$
Example: Researchers claim that the mean age of the
population having a certain disease "A" is 35 years. To
test this claim, a researcher collected information from a
random sample of 20 individuals drawn from a normally
distributed population. The population variance is known and
is equal to 25, and the study found that the mean age of the
20 individuals is 29. Test the claim at the 5% level of
significance.
Solution
Hypothesis: H0: µ = 35 vs. H1: µ ≠ 35 (i.e., µ0 = 35).
Test statistic: Since the population variance is known, the
statistic is given by: $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
Thus, the test statistic value is:
$z_{cal} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{29 - 35}{5 / \sqrt{20}} = -5.37$
Critical/table value: $Z_{\alpha/2} = 1.96$
Decision: Since |z_cal| = 5.37 > 1.96, reject H0.
Conclusion: We conclude that the mean age of the
population with the specific disease "A" is not equal to 35
years.
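The steps of this two-sided z test can be traced in code. The sketch below is illustrative only: the variable names are assumptions, the critical value comes from `statistics.NormalDist().inv_cdf`, and a two-sided p-value is added as an extra check. Without truncation the statistic is about −5.37.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n, alpha = 29, 35, 5, 20, 0.05

z_cal = (xbar - mu0) / (sigma / sqrt(n))               # test statistic value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)           # two-sided critical value, about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))       # two-sided p-value

reject_h0 = abs(z_cal) > z_crit
print(round(z_cal, 2), round(z_crit, 2), reject_h0)
```

Comparing `p_value` with α leads to the same decision as comparing |z_cal| with the critical value.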
Exercise: A research team is willing to assume that
systolic blood pressures in a certain population of males
are approximately normally distributed with a standard
deviation of 16. A simple random sample of 64 males from
the population had a mean systolic blood pressure reading
of 133. At the .05 level of significance, do these data
provide sufficient evidence for us to conclude that the
population mean is greater than 130?
Example: A study was made of a sample of 25 records of
patients seen at a chronic disease hospital on an outpatient
basis. The mean number of outpatient visits per patient
was 4.8, and the sample standard deviation was 2. Can it
be concluded from these data that the population mean is
greater than four visits per patient? Let the probability of
committing a type I error be 0.05. What assumptions are
necessary?
Solution:
Assumption: Since the sample size is small, it is necessary
to assume the sample is drawn from a normal population.
Hypothesis: H0: µ = 4 vs. H1: µ > 4 (i.e., µ0 = 4)
Test statistic value:
$t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{4.8 - 4}{2 / \sqrt{25}} = 2$
Table value: $t_{\alpha;\,n-1} = t_{0.05;\,24} = 1.71$
Decision: Reject H0 since t_cal = 2 > 1.71.
Conclusion: Yes, we can conclude from the data that the
population mean is greater than four visits per patient.
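The arithmetic of this one-sided t test can be sketched as follows. The Python standard library has no t distribution, so the critical value 1.71 is simply copied from the table value quoted in the notes rather than computed; the variable names are illustrative.

```python
from math import sqrt

xbar, mu0, s, n = 4.8, 4, 2, 25

t_cal = (xbar - mu0) / (s / sqrt(n))   # test statistic with n - 1 = 24 degrees of freedom
t_table = 1.71                         # t_{0.05; 24}, taken from the notes (not computed here)

reject_h0 = t_cal > t_table            # right-tailed test: H1 is mu > 4
print(round(t_cal, 2), reject_h0)
```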
Exercise: A sample of eight patients admitted to a
hospital with a diagnosis of biliary cirrhosis had a mean
IgM level of 160.55 units per milliliter. The sample
standard deviation was 50. Do these data provide sufficient
evidence to indicate that the population mean is greater
than 150? Use 5% level of significance. What assumption is
required? Determine the p value.
Inferences with Population Proportion(s).
In clinical trials one may count the number of times an
event occurs such as number of successful outcomes,
number of failures or number of patients recovered after
administration of drug etc.
For instance, patients in one group may receive a new
treatment drug and another independent group may
receive the existing conventional treatment. We may be
interested in comparing the proportion of patients attacked
by the disease after administration of the treatment in the
two populations.
Hypothesis Testing: A Single Population Proportion
Hypothesis:
H0: π = π0 vs. one of
(1) H1: π ≠ π0
(2) H1: π > π0
(3) H1: π < π0
Test statistic
In testing a single population proportion, denoted by π,
against a hypothesized value π0, the approximate normality
assumption holds if the sample size is large.
If the sample size is large, then the test statistic to be used
in this procedure is:
$Z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}} \sim N(0, 1)$
where p is the sample proportion.
The decision rule is the same as in testing one population
mean.
Example: In clinical studies of an anti-allergy drug, 70 of 781
subjects experienced drowsiness. A competitor claims that 8%
of users of his drug experience drowsiness. Use a 0.05
significance level to test this claim.
Solution:
Assumption: The sample is a random sample and is large
enough for the normal approximation to hold (here
nπ0 = 781 × 0.08 ≈ 62.5 and n(1 − π0) ≈ 718.5).
Hypotheses: H0: π = 0.08 vs. H1: π ≠ 0.08
Test statistic: The data show that 70 out of 781 subjects
experienced drowsiness. Hence, p = 70/781 ≈ 0.0896.
Therefore, the test statistic value is:
$Z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}} = \frac{0.0896 - 0.08}{\sqrt{0.08 (1 - 0.08) / 781}} \approx 0.99$
Decision: At α = 0.05, the two-sided standard normal table
value is 1.96. Since the calculated test statistic value is
less than the table value, we fail to reject the null hypothesis.
Conclusion: There is not sufficient evidence to reject the
claim that 8% of users of the drug experience drowsiness.
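The proportion test can be retraced numerically as a check. The sketch below keeps the sample proportion 70/781 unrounded, which gives a statistic of about 0.99 (rounding the proportion down to 0.089 first gives roughly 0.93); either way the decision is the same. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, pi0, alpha = 70, 781, 0.08, 0.05

p_hat = x / n                                        # sample proportion, about 0.0896
z_cal = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)    # standard error uses the hypothesized pi0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)         # two-sided critical value, about 1.96

reject_h0 = abs(z_cal) > z_crit
print(round(p_hat, 4), round(z_cal, 2), reject_h0)
```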