Estimation and Confidence Intervals Guide

Uploaded by

islamee

Chapter 8: Estimation

Dr. Noha Youssef


Department of Mathematics and Actuarial Science, AUC

1 Estimation

• The purpose of statistics is to permit the user to make an inference about a population based
on information contained in a sample.

• Populations are characterized by numerical descriptive measures called parameters. The objective
of many statistical investigations is to make an inference about one or more population
parameters.

• The parameter of interest might be the population mean or the population variance or the
proportion of occurrence of a certain phenomenon or any other measure of interest which will
be called the target parameter.

• Point Estimates: A single value is used that is intended to be close to the true value of the
target parameter.

• Interval Estimates: Two values are used to construct an interval that is intended to enclose
the parameter of interest.

• An estimator is a rule (formula) that tells how to calculate the value of an estimate based
on the measurements contained in a sample.

• After you select a specific sample and calculate this measure, the resulting value will be called
an estimate and not an estimator.

• Let θ̂ be a point estimator for a parameter θ. Then θ̂ is an unbiased estimator if E(θ̂) = θ. If
E(θ̂) ≠ θ, then θ̂ is said to be biased.

• A good estimator is unbiased and has a variance as small as possible.
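
The idea of bias can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes a normal population with variance 4 and compares two estimators of that variance, one dividing by n − 1 and one dividing by n. The function name `average_estimates`, the sample size, the number of repetitions and the seed are all illustrative choices.

```python
import random
import statistics

def average_estimates(estimator, n=5, reps=10000, seed=0):
    """Average the value of an estimator over many simulated samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Assumed population: normal with mean 0 and standard deviation 2 (variance 4).
        sample = [rng.gauss(0.0, 2.0) for _ in range(n)]
        total += estimator(sample)
    return total / reps

# statistics.variance divides by n - 1; statistics.pvariance divides by n.
avg_unbiased = average_estimates(statistics.variance)
avg_biased = average_estimates(statistics.pvariance)
print(avg_unbiased, avg_biased)
```

Under these assumptions the first average should land near the true variance 4, while the second should land near 4(n − 1)/n = 3.2, which illustrates E(θ̂) ≠ θ for the divide-by-n estimator.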

2 Confidence intervals

• Instead of just using one single point estimate to draw conclusions about the unknown
population parameter, we can find a range of values using this point estimate.

• This range of values is called a confidence interval.

• Generally, a confidence interval is constructed using this rule: estimate ± tabulated value
× the standard error(estimator).
• The tabulated value is a value from a statistical table and reflects our confidence level.
• The tabulated value tells how many standard errors the lower limit or the upper limit of
the range lies from the estimate.
• The width of the confidence interval is 2 × tabulated value × the standard error(estimator).
• The margin of error is defined as tabulated value × the standard error(estimator).
• A narrow confidence interval is preferred to a wide one at the same confidence level.
• In general the bigger the sample size the narrower the confidence interval.
• In general the smaller the population variance or the sample variance the narrower the interval.
• The common tabulated values used for a CI are 1.645 for a 90% CI, 1.96 for a 95% CI and 2.576
for a 99% CI.
• The following table helps in constructing different confidence intervals: [table not recovered from the source]
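
As a minimal sketch (not part of the original notes), the common tabulated values quoted above can be recomputed from the standard normal distribution with Python's standard library, and the general rule estimate ± tabulated value × standard error can be written as a small helper. The function names `z_value` and `confidence_interval` are illustrative.

```python
from statistics import NormalDist

def z_value(confidence):
    """Two-sided standard normal tabulated value Z_{alpha/2} for a confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def confidence_interval(estimate, tabulated_value, standard_error):
    """General rule: estimate +/- tabulated value x standard error."""
    margin = tabulated_value * standard_error  # margin of error
    return estimate - margin, estimate + margin

for level in (0.90, 0.95, 0.99):
    print(level, round(z_value(level), 3))
```

The width of the interval returned by `confidence_interval` is twice the margin of error, in line with the bullets above.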

3 Confidence interval for the population mean µ


• For constructing a confidence interval for the population mean we use the following rule:
$\bar{Y} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.

• The term $\frac{\sigma}{\sqrt{n}}$ is called the standard error of the sample mean $\bar{y}$.

• In case the population variance $\sigma^2$ is unknown we use the sample variance

\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2. \]

The sample variance will be an approximation to $\sigma^2$.

• The sample variance $s^2$ is an unbiased estimator for the population variance $\sigma^2$, i.e.
$E(s^2) = \sigma^2$.

• We use the normal table in almost all cases. The one exception is when the sample comes from
a normal distribution, the population standard deviation is unknown and the sample size is
less than 30; in that case we use the t table with n − 1 degrees of freedom.

• Again the bigger the variance the wider the CI given the same level of confidence.

• The smaller the sample size the wider the CI given the same level of confidence.

• Clearly once you increase your confidence level your CI will be wider as you include more
values into the CI.

• To interpret the confidence interval you have two interpretations:

1. I am 90% (or 95% or 99%) confident that the population mean lies between the lower
and the upper limit of the confidence interval.
2. If we were able to draw a huge number of samples of the same sample size n and construct
a confidence interval from each, then about 90% (or 95% or 99%) of those intervals would
include the true mean.
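
The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original notes: the population is taken to be normal with a known standard deviation, and the function name `coverage` and all parameter values (mean 10, standard deviation 3, n = 25, 5000 repetitions) are hypothetical.

```python
import random
from statistics import NormalDist

def coverage(confidence=0.95, n=25, mu=10.0, sigma=3.0, reps=5000, seed=1):
    """Fraction of simulated intervals ybar +/- Z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / n ** 0.5  # sigma is treated as known in this sketch
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar = sum(sample) / n
        if ybar - margin <= mu <= ybar + margin:
            hits += 1
    return hits / reps

print(coverage(0.90), coverage(0.95), coverage(0.99))
```

Each printed fraction should be close to its nominal confidence level, and a higher confidence level gives a larger margin and therefore a wider interval.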

4 Confidence Intervals for the population proportion P

• Proportion is usually used with a binary variable which has just two outcomes, e.g. smokers
or non-smokers.

• The binary variable is usually coded using 0 and 1.

• As with population mean we construct the confidence interval for the population proportion
using the general rule mentioned above.

• This means we have to find an estimate for the population proportion from the sample. This
can be computed using the following rule:

\[ p = \frac{\text{the number of individuals of our interest in our sample}}{\text{the sample size}}. \]

• The standard error used with the sample proportion is $\sqrt{\frac{p(1-p)}{n}}$.

• The confidence interval is then given by $p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$.

• The interpretation of the confidence interval is the same as for the CI of the population
mean, with the term population mean replaced by population proportion.

5 Confidence Intervals for the difference between two population
means

• For finding a confidence interval for the difference between two population means we use this
main rule: $\bar{Y}_1 - \bar{Y}_2 \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

• We use $S_1$ and $S_2$ in place of $\sigma_1$ and $\sigma_2$ if the population standard deviations are
unknown and the sample sizes are bigger than 30.

• If the population standard deviations are unknown, the sample sizes are less than 30 and
both samples come from the normal distribution, then we use the t table with degrees of
freedom $n_1 + n_2 - 2$. In this course we assume that both population variances are equal although
we don’t know their values; therefore we calculate the pooled variance
$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
and use it instead of $S_1^2$ and $S_2^2$.

6 Confidence Intervals for the difference between two population proportions

• When the sample sizes are bigger than 30 we can use the central limit theorem and construct
the confidence interval for the difference between two population proportions as follows:
$p_1 - p_2 \pm Z_{\alpha/2} \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$.

7 Examples

Example 1

The shopping times of 64 randomly selected customers at a local supermarket were recorded. The
average and variance of the 64 shopping times were 33 minutes and 256, respectively. Estimate the
true average shopping time per customer, using a 90% confidence interval. Interpret your results.
Sol:
Since n > 30, Ȳ is approximately normal by the CLT. With s = √256 = 16, the CI is given by
\[ 33 \pm 1.645 \frac{16}{\sqrt{64}} \rightarrow 33 \pm 3.29, \]
so the confidence interval is (29.71, 36.29). This means that we are 90% confident that the average
shopping time is between 29.71 and 36.29 minutes.
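
As a quick check of the arithmetic in Example 1, here is a minimal sketch (variable names are illustrative) that uses the tabulated value 1.645 and the standard deviation obtained as the square root of the stated variance 256.

```python
import math

n = 64
ybar = 33.0
s = math.sqrt(256)  # sample standard deviation from the stated variance
z = 1.645           # tabulated value for a 90% CI

margin = z * s / math.sqrt(n)  # margin of error
lower, upper = ybar - margin, ybar + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))  # 3.29 29.71 36.29
```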

Example 2

The manager of a TV station must determine what percentage of households in the city have more
than one TV set. A random sample of 500 homes reveals that 275 have two or more sets. What is
the 90 percent confidence interval for the proportion of all homes with two or more sets?
Sol:
n = 500, then $p = \frac{275}{500} = 0.55$. The 90% CI is given by
\[ 0.55 \pm 1.645 \sqrt{\frac{0.55(1 - 0.55)}{500}} \rightarrow 0.55 \pm 0.0366, \]

this gives the following (0.51340106, 0.58659894).
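
The computation in Example 2 can be reproduced with a short sketch. The variable names are illustrative, and 1.645 is the tabulated value for a 90% CI used in the solution.

```python
import math

x, n = 275, 500  # homes with two or more sets, sample size
p = x / n        # sample proportion
z = 1.645        # tabulated value for a 90% CI

se = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
margin = z * se
lower, upper = p - margin, p + margin
print(round(margin, 4), round(lower, 4), round(upper, 4))
```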

Example 3

Charles Schwab, the discount brokerage service, recently instituted two training programs for newly
hired telephone marketing representatives. To test the relative effectiveness of each program, 45
representatives trained by the first program were given a proficiency test. The mean score was
76 points with standard deviation 13.5 points. The 40 people trained under the second program
reported a mean score of 77.97 and standard deviation of 9.05 points. Management wants to
know if one training program is more effective than the other. As the one selected to make this
determination, you decide to construct a 99 percent confidence interval for the difference between
the mean proficiency scores of the employees trained under each program. You are also charged with
the responsibility of recommending which training program the company should use exclusively.
Sol:
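$n_1 = 45$, $\bar{Y}_1 = 76$, $s_1 = 13.5$, $n_2 = 40$, $\bar{Y}_2 = 77.97$, $s_2 = 9.05$. The 99% CI is obtained as
\[ 76 - 77.97 \pm 2.576 \sqrt{\frac{13.5^2}{45} + \frac{9.05^2}{40}} \rightarrow -1.97 \pm 2.576 \times 2.47, \]

then the 99% CI is (-8.330979398, 4.390979398). We can’t really judge which program is more effective,
since the two limits are of different sign: the interval contains zero, so a zero difference between
the two means is plausible.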
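
The interval in Example 3 can be verified with a minimal sketch. Variable names are illustrative; the sample standard deviations stand in for the unknown population values because both samples are larger than 30.

```python
import math

n1, ybar1, s1 = 45, 76.0, 13.5
n2, ybar2, s2 = 40, 77.97, 9.05
z = 2.576  # tabulated value for a 99% CI

diff = ybar1 - ybar2
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of the difference
margin = z * se
lower, upper = diff - margin, diff + margin
contains_zero = lower < 0 < upper  # True means no clear winner between programs
print(round(se, 2), round(lower, 4), round(upper, 4), contains_zero)
```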

Example 4

Two brands of refrigerators, denoted A and B, are each guaranteed for 1 year. In a random sample
of 50 refrigerators of brand A, 12 were observed to fail before the guarantee period ended. An
independent random sample of 60 brand B refrigerators also revealed 12 failures during the guarantee
period. Estimate the true difference between proportions of failures during the guarantee period,
with confidence coefficient 0.98 and interpret your result.
Sol:
The estimate for the difference is given by $\frac{12}{50} - \frac{12}{60} = 0.04$. The 98% CI is given by (-0.145153163, 0.225153163).
We are 98% confident that the difference between the two population proportions lies between these
two limits.
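
The solution to Example 4 states only the final limits. The sketch below fills in the intermediate steps using the rule for the difference between two proportions. The tabulated value 2.33 is an assumption (a rounded normal-table value for 98% confidence), chosen because it reproduces the quoted limits; variable names are illustrative.

```python
import math

x1, n1 = 12, 50  # brand A failures, sample size
x2, n2 = 12, 60  # brand B failures, sample size
z = 2.33         # assumed rounded tabulated value for a 98% CI

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
margin = z * se
lower, upper = diff - margin, diff + margin
print(round(diff, 2), round(lower, 6), round(upper, 6))
```

Since the interval runs from a negative to a positive value, it contains zero, so the data do not clearly separate the two failure proportions.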

Example 5

A manufacturer of gun powder has developed a new powder, which was tested in eight shells. The
resulting muzzle velocities, in feet per second, were as follows: 3005, 2925, 2935, 2965, 2995, 3005, 2937, 2905.
Find a 95% confidence interval for the true average velocity for shells of this type. Assume that
muzzle velocities are approximately normally distributed.
Sol:
ȳ = 2959 and s = 39.1. The 95% CI is obtained using the t table with degrees of freedom 8 − 1 = 7, then
\[ 2959 \pm 2.365 \frac{39.1}{\sqrt{8}} \rightarrow 2959 \pm 32.7, \]
the CI is (2926.3, 2991.7).
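
The summary statistics and the interval in Example 5 can be recomputed from the raw data. This is a minimal sketch: the t value 2.365 for 7 degrees of freedom is copied from the solution rather than computed, since Python's standard library has no t distribution, and the variable names are illustrative.

```python
import math
import statistics

velocities = [3005, 2925, 2935, 2965, 2995, 3005, 2937, 2905]
n = len(velocities)

ybar = statistics.mean(velocities)
s = statistics.stdev(velocities)  # sample standard deviation (divisor n - 1)
t = 2.365                         # t table value, 95% CI, n - 1 = 7 degrees of freedom

margin = t * s / math.sqrt(n)
lower, upper = ybar - margin, ybar + margin
print(ybar, round(s, 1), round(margin, 1), round(lower, 1), round(upper, 1))
```

The standard deviation comes out at about 39.1, so the margin of error is about 32.7 and the interval is roughly (2926.3, 2991.7).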

Example 6

Suppose Gulf Dairy Association wants to estimate the difference in the mean amounts of protein
in two brands of flavored yogurt. A sample of 16 100-gram pots of Al Ain yogurt found the mean
amount of protein to be 4.5 g with a standard deviation of 1.3 g. Another sample of 16 100-gram
pots of Alrawabi yogurt found the mean amount of protein to be 4.3 g with a standard deviation of
1.2 g. Assume that the protein content per pot of yogurt is normally distributed and the standard
deviations for the two populations are equal. Construct a 90% CI for the difference between the
two means.
Sol:

• We calculate the pooled variance $S_p^2 = \frac{(16-1)1.3^2 + (16-1)1.2^2}{16+16-2} = 1.565$.

• The 90% CI is given by $(4.5 - 4.3) \pm t_{\alpha/2,\, n_1+n_2-2} \times \sqrt{\frac{1.565}{16} + \frac{1.565}{16}}$, then $0.2 \pm 1.697 \times 0.44$.

• The 90% CI ranges from −0.55 to 0.95.
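
The pooled-variance calculation in Example 6 can be checked with a short sketch. The t value 1.697 for 30 degrees of freedom is copied from the solution rather than computed, and the variable names are illustrative. With the stated standard deviations the pooled variance evaluates to 1.565.

```python
import math

n1, ybar1, s1 = 16, 4.5, 1.3  # Al Ain sample
n2, ybar2, s2 = 16, 4.3, 1.2  # Alrawabi sample
t = 1.697                     # t table value, 90% CI, n1 + n2 - 2 = 30 degrees of freedom

# Pooled variance under the equal-variance assumption.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
se = math.sqrt(sp2 / n1 + sp2 / n2)
margin = t * se
diff = ybar1 - ybar2
lower, upper = diff - margin, diff + margin
print(round(sp2, 3), round(se, 2), round(lower, 2), round(upper, 2))
```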
