STATISTICAL INFERENCE
• Preamble: Methods used in making decisions or drawing conclusions about
a population constitute the field of statistical inference
• Statistical inference is categorized into 2 major areas:
• Parameter estimation
• Hypothesis testing
• Definitions:
• Statistical inference is the process by which we infer population properties from the
properties of the sample
• The process of drawing conclusions about a population from samples that are subject
to random variation
• Population in statistics refers to the collection of all objects or elements under study in a
given situation
• This population property must be observable and measurable
Estimation
• Estimation theory is the aspect or branch of statistics that deals with
estimating the values of parameters based on empirical or measured
data that has a random component
• The objective of estimation is to approximate the value of a
population parameter on the basis of a sample statistic
• E.g.: the sample mean $\bar{X}$ could be used to estimate the population mean $\mu$
• Note: The entire purpose of estimation is to arrive at an estimator (a
function that maps the sample space to a set of estimates), which
takes the sample as input and produces an estimate (a single value
calculated based on samples and used to estimate a population value)
of the parameters with the corresponding accuracy.
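The distinction between an estimator (a rule) and an estimate (a number) can be sketched in a few lines of Python. This is only an illustration: the population values `MU` and `SIGMA`, the sample size, and the function name `sample_mean` are assumptions made up for the example, not part of the notes.

```python
import random
import statistics

def sample_mean(sample):
    """Estimator: a rule that maps any sample to a single number."""
    return sum(sample) / len(sample)

rng = random.Random(0)        # fixed seed so the run is reproducible
MU, SIGMA = 50.0, 10.0        # hypothetical population mean and standard deviation

# Two independent samples of size 25 drawn from the same population.
sample_a = [rng.gauss(MU, SIGMA) for _ in range(25)]
sample_b = [rng.gauss(MU, SIGMA) for _ in range(25)]

# Estimates: the values the same estimator takes on each particular sample.
estimate_a = sample_mean(sample_a)
estimate_b = sample_mean(sample_b)
print(estimate_a, estimate_b)
```

The same estimator gives a different estimate for each sample, which is why the estimator itself is treated as a random variable.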
TYPES OF ESTIMATE
• Point Estimator:
• Here, inference is drawn by using a
single value to estimate an unknown
parameter
• Interval Estimator :
• Here, an interval of values is used in
estimating an unknown parameter
Point Estimator
• A point estimator draws inference about a population by estimating
the value of an unknown parameter using a single value or point
• A point estimate is one of the possible values a point estimator can
assume
• Mathematically, if there is a fixed parameter $\theta$ that needs to be
estimated and $X$ is a random variable corresponding to the observed
data, then an estimator of $\theta$, usually denoted by $\hat{\theta}$, is a function of
the random variable $X$ and, as such, is itself a random variable.
Characteristics of a good Point Estimator
• Unbiasedness
• Efficiency
• Consistency
Unbiasedness
• An estimator should be close to the true value of the unknown
parameter
• An estimator $\hat{\theta}$ is said to be an unbiased estimator of a parameter $\theta$ if
the expected value of the estimator equals the parameter
• $E(\hat{\theta}) = \theta$
• When there is a difference between the expected value of the
estimator $E(\hat{\theta})$ and the parameter $\theta$, the estimator is said to be
biased, and the difference is called the bias
• $E(\hat{\theta}) - \theta = \text{bias}$
• For an unbiased estimator, the bias is zero
Example 1: If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, and
$X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population
represented by $X$, show that the sample mean $\bar{X}$ and sample variance
$s^2$ are unbiased estimators of $\mu$ and $\sigma^2$ respectively
• Solution:
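• Recall that, from the definition of the mean of a continuous random variable,
• $E(X) = \int_{-\infty}^{\infty} x\, f(x)\, dx = \mu$
• Similarly, from the definition of the variance of a continuous random variable,
• $\operatorname{var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx$
• $= E\left[(X - \mu)^2\right]$
• $= E(X^2) - \left[E(X)\right]^2$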
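• If $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$ with $E(X_i) = \mu$ for $i = 1, 2, 3, \ldots, n$
• Then $E(\bar{X}) = \dfrac{1}{n}\sum_{i=1}^{n} E(X_i) = \dfrac{n\mu}{n} = \mu$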
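• $E(s^2) = E\left[\dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}\right] = \dfrac{1}{n-1} E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right]$
• $= \dfrac{1}{n-1} E\left[\sum_{i=1}^{n}\left(X_i^2 + \bar{X}^2 - 2\bar{X}X_i\right)\right]$
• $= \dfrac{1}{n-1} E\left[\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right]$
• $= \dfrac{1}{n-1}\left[\sum_{i=1}^{n} E(X_i^2) - nE(\bar{X}^2)\right]$ {from the mean of a linear combination}
• But $E(X_i^2) = \mu^2 + \sigma^2$ while $E(\bar{X}^2) = \mu^2 + \dfrac{\sigma^2}{n}$
• $\therefore E(s^2) = \dfrac{1}{n-1}\left[\sum_{i=1}^{n}\left(\mu^2 + \sigma^2\right) - n\left(\mu^2 + \dfrac{\sigma^2}{n}\right)\right]$
• $= \dfrac{1}{n-1}\left(n\mu^2 + n\sigma^2 - n\mu^2 - \sigma^2\right) = \sigma^2$
• Therefore, the sample variance $s^2$ is an unbiased estimator of the population
variance $\sigma^2$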
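The result of Example 1 can be checked numerically with a small Monte Carlo sketch. The population (normal, variance 4), the sample size and the number of repetitions are illustrative assumptions. The standard-library functions `statistics.variance` (divisor $n-1$) and `statistics.pvariance` (divisor $n$) are used to contrast the unbiased estimator with the biased one.

```python
import random
import statistics

rng = random.Random(1)                    # fixed seed for reproducibility
MU, SIGMA, N, REPS = 0.0, 2.0, 5, 20000   # illustrative values; sigma^2 = 4

unbiased_vals = []   # sample variance with divisor n - 1
biased_vals = []     # same sum of squares, but divisor n
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    unbiased_vals.append(statistics.variance(sample))    # divides by n - 1
    biased_vals.append(statistics.pvariance(sample))     # divides by n

# Averages over many samples approximate the expected values E(s^2).
mean_unbiased = statistics.fmean(unbiased_vals)   # should be near sigma^2 = 4
mean_biased = statistics.fmean(biased_vals)       # should be near (n-1)/n * sigma^2 = 3.2
print(mean_unbiased, mean_biased)
```

The average of the $n-1$ version settles near $\sigma^2$, while the divisor-$n$ version settles near $\frac{n-1}{n}\sigma^2$, which is what the derivation predicts.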
Efficiency
• Given two unbiased estimators for a parameter, the estimator with a
smaller variance is more efficient
• For the same parameter $\theta$, an unbiased point estimator $\hat{\theta}_1$ is more
efficient than another unbiased point estimator $\hat{\theta}_2$ if
• $\operatorname{Var}(\hat{\theta}_1) < \operatorname{Var}(\hat{\theta}_2)$
• Therefore, if we have several unbiased estimators, a typical principle of
estimation is to choose the estimator that has minimum variance.
• Such an estimator is known as the Minimum Variance Unbiased Estimator (MVUE)
• The MVUE is the most likely among all unbiased estimators to produce an
estimate, $\hat{\theta}$, that is close to the true value of $\theta$
MVUEs of different distributions
• For a normal distribution with unknown mean and variance:
• Sample mean $\bar{X}$ is the MVUE for population mean $\mu$
• $E(\bar{X} - \mu) = E(\bar{X}) - \mu = 0$
• Sample variance $S^2$ is the MVUE for population variance $\sigma^2$
• $E(S^2 - \sigma^2) = E(S^2) - \sigma^2 = 0$
• For other distributions, the sample mean and sample variance are not,
in general, MVUEs
Illustrative example: Suppose we have a random sample of $n$
observations $X_1, X_2, X_3, \ldots, X_n$ and we wish to compare two
possible unbiased estimators for $\mu$: the sample mean $\bar{x}$ and a
single observation from the sample, say $x_i$.
• For the sample mean: $V(\bar{x}) = \dfrac{\sigma^2}{n}$
• For the single observation: $V(x_i) = \sigma^2$
• Since $\dfrac{\sigma^2}{n} < \sigma^2$ for sample sizes $n \geq 2$, we can conclude that the
sample mean is a better estimator of $\mu$ than a single observation $x_i$.
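The comparison in the illustrative example can be reproduced by simulation. The sketch below assumes a normal population with $\sigma^2 = 9$ and samples of size $n = 9$ (both chosen only for illustration), so the theoretical variances are $\sigma^2/n = 1$ for the sample mean and $\sigma^2 = 9$ for a single observation.

```python
import random
import statistics

rng = random.Random(2)                      # fixed seed for reproducibility
MU, SIGMA, N, REPS = 10.0, 3.0, 9, 20000    # illustrative values; sigma^2 / n = 1

means, singles = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))  # estimator 1: the sample mean
    singles.append(sample[0])               # estimator 2: one observation

# Spread of each estimator across the repeated samples.
var_mean = statistics.pvariance(means)      # should be near sigma^2 / n = 1
var_single = statistics.pvariance(singles)  # should be near sigma^2 = 9
print(var_mean, var_single)
```

Both estimators are centred on $\mu$, but the sample mean varies far less from sample to sample, which is what makes it the more efficient of the two.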
Consistency
• An unbiased estimator is said to be consistent if the difference
between the estimator and the target population parameter becomes
smaller as the sample size increases.
• For example:
• The variance of the sample mean is $\dfrac{\sigma^2}{n}$. This means that as the sample size $n$
increases, the variance decreases towards zero. Since the sample mean is also
unbiased, it is a consistent estimator of $\mu$
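Consistency of the sample mean can be illustrated by tracking how far $\bar{X}$ typically lands from $\mu$ as $n$ grows. The sketch assumes a standard normal population. The helper name `mean_abs_error` and the chosen sample sizes are hypothetical, introduced only for this example.

```python
import random
import statistics

rng = random.Random(3)            # fixed seed for reproducibility
MU, SIGMA, REPS = 0.0, 1.0, 1000  # illustrative population and repetition count

def mean_abs_error(n):
    """Average distance |x_bar - mu| over REPS simulated samples of size n."""
    errors = []
    for _ in range(REPS):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - MU))
    return statistics.fmean(errors)

# Each fourfold increase in n should roughly halve the typical error,
# because the standard deviation of x_bar is sigma / sqrt(n).
errors_by_n = {n: mean_abs_error(n) for n in (4, 16, 64, 256)}
for n, err in errors_by_n.items():
    print(n, round(err, 4))
```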
Reporting a Point Estimate:
Standard Error
• When the numerical value or point estimate of a parameter is reported, it
is usually desirable to give some idea of the precision of the estimate.
• The measure of precision usually employed is the standard error of the
estimator that has been used.
• The standard error of an estimator, $\hat{\theta}$, is its standard deviation, given by:
• $\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$
• If the standard error involves unknown parameters that can be estimated,
substituting those estimates into $\sigma_{\hat{\theta}}$ produces an estimated standard error,
denoted by $\hat{\sigma}_{\hat{\theta}}$
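As a concrete case, the variance of the sample mean is $\sigma^2/n$, so its standard error is $\sigma/\sqrt{n}$, and replacing the unknown $\sigma$ with the sample standard deviation $s$ gives the estimated standard error $s/\sqrt{n}$. The sketch below applies this to a small data set. The measurements are made up for illustration.

```python
import math
import statistics

# Hypothetical measurements, invented for this example.
data = [41.6, 41.5, 42.3, 41.9, 42.2, 41.8, 42.0, 42.1, 41.7, 41.9]

n = len(data)
x_bar = statistics.fmean(data)     # point estimate of mu
s = statistics.stdev(data)         # sample standard deviation (divisor n - 1)
estimated_se = s / math.sqrt(n)    # estimated standard error of x_bar

print(f"point estimate = {x_bar:.3f}, estimated standard error = {estimated_se:.3f}")
```

Reporting the point estimate together with its estimated standard error tells the reader how much the estimate would be expected to vary from sample to sample.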
Mean Square Error
• When an estimator follows a normal distribution, one can be reasonably confident that
the true value of the parameter lies within 2 standard errors of the estimate
• Sometimes, it is necessary to use a biased estimator and in such cases, the mean square
error of the estimator is employed
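• The Mean Square Error (MSE) of an estimator $\hat{\theta}$ of the parameter $\theta$ is defined as:
• $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$
• Rewritten as
• $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right] = E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^2\right] + \left[\theta - E(\hat{\theta})\right]^2$
• $= V(\hat{\theta}) + \text{bias}^2$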
• MSE is, therefore, the variance of the estimator plus the squared bias
• A good estimator should have a small mean square estimation error because this implies
that the estimator values are clustered around the parameter 𝜃
• If $\hat{\theta}$ is an unbiased estimator, $\text{bias} = 0$ and then $MSE(\hat{\theta}) = V(\hat{\theta})$
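The decomposition of the MSE into variance plus squared bias can be verified numerically. The sketch below uses a deliberately biased estimator, the variance estimator with divisor $n$, applied to simulated normal data. The true variance, sample size and repetition count are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(4)             # fixed seed for reproducibility
SIGMA2, N, REPS = 4.0, 5, 5000     # true parameter theta = sigma^2 = 4 (illustrative)

# theta_hat: variance estimator with divisor n, which is biased for sigma^2.
estimates = []
for _ in range(REPS):
    sample = [rng.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N)]
    estimates.append(statistics.pvariance(sample))

# Simulated counterparts of the three quantities in the decomposition.
mse = statistics.fmean([(t - SIGMA2) ** 2 for t in estimates])  # E[(theta_hat - theta)^2]
variance = statistics.pvariance(estimates)                      # V(theta_hat)
bias = statistics.fmean(estimates) - SIGMA2                     # E(theta_hat) - theta

print(mse, variance + bias ** 2)   # the two numbers agree up to rounding error
```

The identity holds exactly for the simulated values, not just approximately, because it is an algebraic property of means and variances.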
• In comparing two estimators, the relative efficiency is calculated
• The relative efficiency of $\hat{\theta}_2$ to $\hat{\theta}_1$ is defined as:
• $\dfrac{MSE(\hat{\theta}_1)}{MSE(\hat{\theta}_2)}$
• If this relative efficiency is less than 1, then $\hat{\theta}_1$ is a more efficient
estimator of $\theta$ than $\hat{\theta}_2$ in the sense that it has a smaller mean square
error
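A minimal sketch of the relative efficiency calculation, reusing the two unbiased estimators of $\mu$ from the earlier illustrative example: the sample mean ($\hat{\theta}_1$, with MSE $\sigma^2/n$) and a single observation ($\hat{\theta}_2$, with MSE $\sigma^2$). The function name and the numerical values of $\sigma^2$ and $n$ are assumptions chosen for illustration.

```python
def relative_efficiency(mse_1, mse_2):
    """Relative efficiency of estimator 2 to estimator 1: MSE(1) / MSE(2)."""
    return mse_1 / mse_2

SIGMA2, N = 4.0, 8               # hypothetical population variance and sample size

mse_sample_mean = SIGMA2 / N     # theta_hat_1 = sample mean; unbiased, so MSE = variance
mse_single_obs = SIGMA2          # theta_hat_2 = single observation; unbiased, so MSE = variance

ratio = relative_efficiency(mse_sample_mean, mse_single_obs)
print(ratio)  # 0.125, i.e. 1/n
```

Since the ratio is below 1, the sample mean is the more efficient of the two, in agreement with the rule stated above.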
Interval Estimator
• An interval estimator of a population parameter under random
sampling consists of two random variables which are called the
UPPER and LOWER limits of the interval estimators
• These upper and lower limits determine the intervals expected to
contain the parameter estimated
• Interval estimates are all the ranges an interval estimator can assume.
Each interval estimate states a range within which a population
parameter probably lies.
Assessing Interval estimators
• Accuracy (confidence level)
• Precision (margin of error)
Confidence level of the estimator
• Refers to the probability that an interval estimator obtained will
contain the value of the population parameter
• An interval estimate is described by its computed limits together with its
confidence level
• E.g.: an interval estimate with confidence level $(1 - \alpha)$ is called a
$(1 - \alpha)$ confidence interval
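The meaning of the confidence level can be illustrated by simulation: build the interval many times from fresh samples and count how often it contains the true parameter. The sketch assumes a normal population with known $\sigma$ and uses the standard interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, which is not derived in these notes. All numerical values are illustrative.

```python
import math
import random
import statistics

rng = random.Random(5)                       # fixed seed for reproducibility
MU, SIGMA, N, REPS = 20.0, 4.0, 16, 4000     # illustrative population and sample size
ALPHA = 0.05                                 # confidence level 1 - alpha = 0.95

# Standard normal quantile z_{alpha/2}, about 1.96 for alpha = 0.05.
z = statistics.NormalDist().inv_cdf(1 - ALPHA / 2)
half_width = z * SIGMA / math.sqrt(N)

covered = 0
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    lower, upper = x_bar - half_width, x_bar + half_width   # one interval estimate
    if lower <= MU <= upper:
        covered += 1

coverage = covered / REPS    # fraction of intervals that contain mu
print(coverage)
```

The observed fraction should be close to $1 - \alpha$. Any single interval either contains $\mu$ or does not, and the confidence level describes the long-run behaviour of the procedure.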
Margin of error
• This is measured by half of the width of the interval estimate
• That is, half the difference between the upper and lower limits of the
confidence interval
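The arithmetic is short enough to sketch directly. The interval limits below are hypothetical, and the function name `margin_of_error` is introduced only for this example.

```python
def margin_of_error(lower, upper):
    """Half the width of an interval estimate with the given limits."""
    return (upper - lower) / 2

lower, upper = 18.04, 21.96          # hypothetical confidence limits
margin = margin_of_error(lower, upper)
midpoint = (lower + upper) / 2       # centre of the interval

print(f"{midpoint:.2f} +/- {margin:.2f}")
```

For a symmetric interval, the result is usually reported as the midpoint plus or minus the margin of error.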
Types of Interval estimate
• Confidence interval
• We cannot be certain that the interval contains the true, unknown population
parameter—we only use a sample from the full population to compute the point
estimate and the interval.
• The confidence interval is constructed so that we have high confidence that it does
contain the unknown population parameter
• A confidence interval bounds population or distribution parameters
• Tolerance interval
• We need to account for the potential error in each point estimate to form a tolerance
interval for the distribution
• A tolerance interval bounds a selected proportion of a distribution
• Prediction interval
• A prediction interval bounds future observations from the population or distribution