Sampling Distributions and Statistics

The lecture notes cover the concepts of random sampling and sampling distributions, emphasizing the importance of statistical inference when the joint distribution of random variables is unknown. It defines random samples, sample statistics, and explains the sampling distribution of the sample mean, including the Central Limit Theorem, which states that as sample size increases, the sampling distribution of the sample mean approaches a normal distribution. Examples illustrate how to calculate probabilities using these concepts.

Uploaded by

Roman Andrews

Probability and statistics

Lecture notes

Thanos Mergoupis

9.3.25

Part 2.1: Random sampling and sampling distributions

We have seen that the investigation into how random variables relate to each other often focuses on how features of conditional distributions change as the values of the conditioning variable(s) change. If the joint distribution of two random variables is known, we have developed tools to carry out such an investigation. More often, however, we do not know the exact joint distribution of the variables we are interested in. We then have to infer this joint distribution using pieces of information drawn from it. Statistical inference gives us the tools to do this.

We start the study of statistical inference by studying how pieces of information, or samples, from a distribution are related to the distribution they are drawn from. To do this, we initially assume that we know the distribution, or population, from which these samples are drawn.

Random samples

Definition:

Let X1, X2, X3, …, Xn represent independent drawings from the distribution, or population, of the random variable X. The ordered set {X1, X2, X3, …, Xn} is called a random sample of size n on the random variable X.

Note

Every Xi for i = 1, 2, 3, …, n is a random variable (r.v.) because its values are determined by the random experiment of drawing a value from the distribution of X. In fact, the likelihood of each value of Xi equals the likelihood of the same value of X, since the draws are from the distribution of X. Therefore the distribution of each Xi is identical to the distribution of X. Moreover, because the random draws are independent of each other, the Xi are distributed independently of each other. Therefore the Xi are independently and identically distributed, or i.i.d. When a specific sample is drawn, we have a realisation of this set of variables.

Because a random sample is an ordered set, we can represent it using vector notation. We use bold to denote vectors, so that $\mathbf{X}$ denotes the random vector (X1, X2, X3, …, Xn) and $\mathbf{x}$ denotes the realisation of this random vector, i.e. (x1, x2, x3, …, xn).

Because by definition the Xi are independently and identically distributed, if they are drawn from the distribution f(x) of a random variable X, their joint distribution $g_n(\mathbf{x})$ is given by:

$$
\begin{aligned}
g_n(\mathbf{x}) = g_n(x_1, x_2, \dots, x_n) &= f_1(x_1) \cdot f_2(x_2) \cdots f_n(x_n) \\
&= f(x_1) \cdot f(x_2) \cdots f(x_n) = \prod_{i=1}^{n} f(x_i)
\end{aligned}
$$

The first line uses the fact that the Xi are independently distributed, so that their joint distribution equals the product of the marginal distributions. The second line uses the fact that the Xi are all identically distributed, with their distribution identical to the distribution they are drawn from.

Example

The exponential parametric family is a one-parameter family of distributions. A random variable X follows an exponential distribution with parameter λ if its pdf is given by:

$$
f(x) =
\begin{cases}
\lambda e^{-\lambda x} & \text{for } x > 0 \\
0 & \text{otherwise}
\end{cases}
$$

Then a random sample of size n from this distribution has joint probability density function given by:

$$
\begin{aligned}
g_n(\mathbf{x}) &= f(x_1) \cdot f(x_2) \cdots f(x_n) = \lambda e^{-\lambda x_1} \cdot \lambda e^{-\lambda x_2} \cdots \lambda e^{-\lambda x_n} \\
&= \lambda^n e^{-\lambda x_1 - \lambda x_2 - \dots - \lambda x_n} = \lambda^n e^{-\lambda (x_1 + x_2 + \dots + x_n)} = \lambda^n e^{-\lambda \sum_i x_i}
\end{aligned}
$$

for $\mathbf{x}$ with all xi positive, and $g_n(\mathbf{x}) = 0$ otherwise.
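The closed form above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the sample values and the value of λ are all hypothetical, chosen only to compare the product of the marginal pdfs with the closed-form expression.

```python
import math

def exp_pdf(x, lam):
    """Exponential pdf with parameter lam: lam * exp(-lam * x) for x > 0, else 0."""
    return lam * math.exp(-lam * x) if x > 0 else 0.0

def joint_pdf_product(xs, lam):
    """Joint density of an i.i.d. sample, computed as the product of the marginals."""
    return math.prod(exp_pdf(x, lam) for x in xs)

def joint_pdf_closed_form(xs, lam):
    """Closed form lam**n * exp(-lam * sum(xs)); zero unless every x is positive."""
    if any(x <= 0 for x in xs):
        return 0.0
    return lam ** len(xs) * math.exp(-lam * sum(xs))

sample = [0.5, 1.2, 0.1, 2.0]  # hypothetical realisation of the random sample
lam = 1.5                      # hypothetical parameter value

# The two numbers printed should agree up to floating-point rounding.
print(joint_pdf_product(sample, lam), joint_pdf_closed_form(sample, lam))
```

Both functions return 0 as soon as one of the xi is not positive, which matches the "otherwise" branch of the joint density.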

Sample statistics

Let T = h(X1, X2, X3, …, Xn) = h($\mathbf{X}$). That is, the values of T are determined by a function h(·) of the random sample. Then T is called a sample statistic.

Because sample statistics are functions of random vectors, they are random variables themselves. Their values are determined by the n independent draws of a random sample.

Examples of sample statistics

Sample Mean

The sample mean is defined as the arithmetic average of a random sample. That is:

$$
\bar{X} = (X_1 + X_2 + X_3 + \dots + X_n) / n = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

Sample Variance

The sample variance is defined as:

$$
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2
$$

Examples of other sample statistics are:

Sample maximum

Xmax = max(X1, X2, X3, …, Xn)

Sample minimum

Xmin = min(X1, X2, X3, …, Xn)

Sample range

Xmax − Xmin

Sample midrange

(Xmax+Xmin)/2
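As an illustration, the statistics listed above can be computed for one realised sample. This is a small sketch using Python's standard `statistics` module; the function name `sample_statistics` and the data values are hypothetical and not taken from the notes.

```python
import statistics

def sample_statistics(xs):
    """Return the sample statistics defined above for one realised sample xs."""
    x_max, x_min = max(xs), min(xs)
    return {
        "mean": statistics.mean(xs),
        "variance": statistics.variance(xs),  # sample variance, divides by n - 1
        "max": x_max,
        "min": x_min,
        "range": x_max - x_min,
        "midrange": (x_max + x_min) / 2,
    }

stats = sample_statistics([2.0, 4.0, 4.0, 5.0, 10.0])  # hypothetical data
print(stats)
```

Each value in the result is one realisation of the corresponding sample statistic; a different sample would give different values.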

Sampling distributions

The distribution of a sample statistic is called its sampling distribution. In general, a sampling distribution depends on:

1. The function h(·, ·, …, ·) that determines the values of the sample statistic.

2. The distribution the sample is drawn from, i.e. the pmf or the pdf f(x).

3. The size n of the random sample.

The derivation of sampling distributions varies in difficulty. Certain features of sampling distributions, however, can be derived quite easily.
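One case where the whole sampling distribution is easy to derive is the sample maximum of a Uniform(0, 1) population, where P(Xmax ≤ t) = tⁿ because all n independent draws must fall below t. The uniform population is an assumption made here for illustration and does not appear in the notes. The sketch below compares a simulated estimate with this exact value; the function name and the choices n = 5, t = 0.8 are hypothetical.

```python
import random

def simulate_max_cdf(n, t, reps=20000, seed=0):
    """Estimate P(Xmax <= t) for random samples of size n from Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = sum(max(rng.random() for _ in range(n)) <= t for _ in range(reps))
    return hits / reps

n, t = 5, 0.8
estimate = simulate_max_cdf(n, t)
exact = t ** n  # P(X1 <= t) * ... * P(Xn <= t) by independence
print(estimate, exact)
```

Changing the statistic (for example to the minimum), the population, or n changes the result, which is the three-way dependence listed above.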

The sampling distribution of the sample mean

The sample mean theorem

Given a random sample of size n, {X1, X2, X3, …, Xn}, from a population with E(X) = μ and V(X) = σ², the sampling distribution of the sample mean has:

$$
E(\bar{X}) = \mu
$$

$$
V(\bar{X}) = \sigma^2 / n
$$

That is, the expected value of the sample mean is equal to the expected value of the

population from which the sample was drawn, and the variance of the sample mean is

equal to the variance of the population divided by the sample size.
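Both results follow in a few steps. The derivation below is a sketch of the standard argument rather than one quoted from these notes: the first line uses only linearity of expectation, and the second uses the independence of the $X_i$, so that the variance of the sum equals the sum of the variances.

$$
\begin{aligned}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n\mu = \mu \\
V(\bar{X}) &= V\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n}
\end{aligned}
$$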

Note that the mean of the sampling distribution of the sample mean does not vary

with the sample size, but the variance of the sample mean does. As sample size

increases, the distribution of the sample mean becomes more concentrated about its

mean.
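This concentration can be seen in a simulation. The sketch below assumes an exponential population with a hypothetical rate of 2 (so μ = 0.5 and σ² = 0.25); neither the population nor the function name comes from the notes. For each n it draws many samples and compares the mean and variance of the simulated sample means with μ and σ²/n.

```python
import random
import statistics

def simulate_sample_means(n, reps=5000, lam=2.0, seed=1):
    """Draw `reps` random samples of size n from Exponential(lam); return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(lam) for _ in range(n))
        for _ in range(reps)
    ]

lam = 2.0                            # hypothetical rate parameter
mu, sigma2 = 1 / lam, 1 / lam ** 2   # population mean and variance of Exponential(lam)

for n in (5, 20, 80):
    means = simulate_sample_means(n, lam=lam)
    # Columns: n, simulated mean of the sample mean, simulated variance, theoretical sigma^2 / n
    print(n, statistics.fmean(means), statistics.variance(means), sigma2 / n)
```

The second column should stay near 0.5 for every n, while the third column should shrink roughly in proportion to 1/n.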

When sampling from some distributions we can have more precise results on the

sampling distribution of the sample mean.

Sampling from the normal distribution

When sampling, with sample size n, on the r.v. X with X ~ N(μ, σ²), then:

$$
\bar{X} \sim N(\mu, \sigma^2 / n)
$$

The fact that the sample mean has mean μ and variance σ²/n follows directly from the sample mean theorem. The fact that the sample mean is also a normal random variable follows from the property that linear combinations of independent normal random variables are also normally distributed.

Example

In a British city the (natural) logarithm of annual family income before taxes follows a normal distribution with mean 9.680 and variance 9.105. We draw a random sample of 10 families from this population and ask for their family income. What is the probability that the mean of the log incomes in this sample is greater than the log of £26,000?

Let Y = log(annual family income before taxes)

Then it is given that Y ~ N(9.680, 9.105)

We have that log(26,000)=10.166

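Because the population the sample was drawn from is normal, the sample mean is distributed as:

$$
\bar{Y} \sim N(9.680,\ 9.105/10) \;\Rightarrow\; \bar{Y} \sim N(9.680,\ 0.9105)
$$

Then, using the standard deviation of the sample mean, $\sqrt{0.9105} = 0.9542$:

$$
\begin{aligned}
P(\bar{Y} > 10.166) &= P\!\left(\frac{\bar{Y} - 9.680}{0.9542} > \frac{10.166 - 9.680}{0.9542}\right) = P(Z > 0.5093) \\
&= 1 - P(Z \le 0.5093) = 1 - 0.695 = 0.305
\end{aligned}
$$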
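The same calculation can be reproduced without normal tables. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative, and the numbers are the ones given in the example.

```python
import math
from statistics import NormalDist

mu, var, n = 9.680, 9.105, 10      # population mean, population variance, sample size
threshold = math.log(26_000)       # natural log of 26,000, about 10.166

# Sampling distribution of the sample mean: N(mu, var / n)
ybar = NormalDist(mu, math.sqrt(var / n))
prob = 1 - ybar.cdf(threshold)     # P(sample mean of Y > threshold)
print(round(threshold, 3), round(prob, 3))
```

The result should agree with the table-based answer of about 0.305, up to rounding.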

Clearly it is necessary to know the exact distribution of a sample mean in order to calculate probabilities like the one in this example. As suggested earlier, although the sample mean is a fairly simple function of the random sample, its distribution will vary with the population sampled. The distributions of sample means from some well-known parametric families are known, but in general one would have to derive the distribution case by case. This is true if we want to evaluate probabilities of a sample mean exactly. If, however, we are prepared to accept a small margin of error, then there is a remarkable result that identifies the distribution of any sample mean, regardless of the population sampled, as long as we have a large enough sample. This is the Central Limit Theorem.

Central Limit Theorem

In random sampling of size n on any random variable X with E(X) = μ and V(X) = σ² (both finite), as the size of the random sample increases, the sampling distribution of the sample mean approaches (in some sense we have not defined) a normal distribution. In particular, the standardised sample mean

$$
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
$$

approaches the standard normal distribution N(0,1).

The use of the expression “approaches in some sense” is vague, but it refers to something well defined mathematically. The sense in which “approaches” is used here is called “convergence in distribution”, but we do not need to define this concept here.
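A simulation gives a feel for this convergence. The sketch below assumes an exponential population with rate 1, a deliberately skewed choice that is not taken from the notes, and estimates P(Z ≤ 0) for the standardised sample mean at several sample sizes. Under N(0,1) this probability is exactly 0.5. The function name and the sample sizes are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def standardised_means(n, reps=5000, lam=1.0, seed=2):
    """Standardised sample means (Xbar - mu) / (sigma / sqrt(n)) for Exponential(lam) samples."""
    rng = random.Random(seed)
    mu = sigma = 1 / lam  # for the exponential, mean and standard deviation are both 1 / lam
    return [
        (fmean(rng.expovariate(lam) for _ in range(n)) - mu) / (sigma / math.sqrt(n))
        for _ in range(reps)
    ]

target = NormalDist().cdf(0.0)  # P(Z <= 0) under N(0, 1), i.e. 0.5
for n in (2, 10, 50):
    zs = standardised_means(n)
    share = sum(z <= 0.0 for z in zs) / len(zs)
    print(n, round(share, 3), target)
```

For small n the simulated share should sit noticeably above 0.5, reflecting the skewness of the population, and it should move towards 0.5 as n grows.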

Example

Suppose it is known that the average distance a student travels to the University of

Bath is 2 miles, with standard deviation 1.2 miles. You survey a random sample of 50

students on the distance they travelled to get to the University. What is the probability

that the sample mean will be at most 1.75 miles?

Here we do not know the exact population we are sampling from. The only things we

know are that it has a mean of 2 miles and a standard deviation of 1.2 miles. But

because the sample is somewhat large, we can appeal to the CLT. So:

Let X = distance travelled to the University of Bath.

Then:

$$
\begin{aligned}
P(\bar{X} \le 1.75) &= P\!\left(\frac{\bar{X} - 2.0}{1.2/\sqrt{50}} \le \frac{1.75 - 2.0}{1.2/\sqrt{50}}\right) = P(Z \le -1.473) \approx \Phi(-1.473) \\
&= 1 - 0.929 = 0.071
\end{aligned}
$$
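The approximation can also be evaluated directly. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module, with the values given in the example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2.0, 1.2, 50                # population mean, standard deviation, sample size
z = (1.75 - mu) / (sigma / math.sqrt(n))   # standardised value of 1.75
prob = NormalDist().cdf(z)                 # CLT approximation to P(sample mean <= 1.75)
print(round(z, 3), round(prob, 4))
```

The computed probability is close to 0.070; the small difference from the 0.071 obtained above comes from rounding in the table value 0.929.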

Common questions

In the context of random variables, explain how the joint distribution of a sample can be derived and why independence is crucial for this derivation.

The joint distribution of a random sample can be derived by using the fact that each observation is drawn independently from the population distribution of the random variable X. This independence implies that the likelihood of any particular set of outcomes for the sample is the product of their individual probabilities. Therefore, if X1, X2, ..., Xn are independently drawn from the distribution f(x) of X, their joint distribution gn(x) is the product of these marginal distributions. Independence is crucial because it ensures that the probabilistic behavior of each Xi is unaffected by the others.

Evaluate the impact of increasing the sample size on the precision of estimations derived from sampling distributions.

Increasing the sample size generally leads to a decrease in the standard error of the sample statistics, thereby increasing the precision of estimations derived from the sampling distributions. As the sample size grows, the sampling distribution of sample statistics, like the sample mean, becomes narrower and more concentrated around the population parameter. This effect arises because the variance of the sample mean decreases with larger sample sizes, the variance of the sample mean being σ^2/n. Therefore, larger samples provide more reliable estimates and allow for more accurate hypothesis testing and confidence interval construction.

Explain the significance of the Central Limit Theorem (CLT) in understanding the distribution of the sample mean, especially when the population distribution is unknown.

The Central Limit Theorem (CLT) is significant because it states that, regardless of the population distribution, the distribution of the sample mean will approach a normal distribution as the sample size increases. This holds true as long as the observations are independent and identically distributed with finite variance, even when the population distribution is not known or is not normal. The theorem provides a foundational result that justifies using the normal distribution for inference about the sample mean in many practical situations, which simplifies the evaluation of probabilities and the calculation of confidence intervals. The standardised sample mean is approximately standard normal in large samples, which makes it a key element in hypothesis testing and statistical inference.

How can the variance of the sample mean be influenced by the sample size, and what implications does this have for statistical analysis?

The variance of the sample mean, calculated from a random sample of n observations, is equal to the variance of the population (σ^2) divided by the sample size (n). This reduction in variance with increasing sample size implies that larger samples result in more precise estimates of the population mean. This characteristic is critical for statistical analysis as it guides researchers in determining the appropriate sample size needed to achieve a desired level of accuracy. More specifically, as the sample size increases, the sample mean's distribution becomes more concentrated around the population mean, thereby reducing the standard error and improving estimation reliability.

Explain why it is critical to consider the underlying distribution of populations when calculating probabilities for sample means.

Consideration of the underlying population distribution is critical when calculating probabilities for sample means because the exact distribution of the sample mean depends on the distribution of the population from which the sample is drawn. While the Central Limit Theorem suggests that the sample mean distribution approaches normality for large samples, it does not specify how large the sample size must be for the approximation to be adequate. Misconceptions about the population distribution can lead to inaccurate estimates of probabilities. Therefore, knowledge of the population distribution aids in more precise calculations, especially in small samples where deviations from normality might significantly impact results.

How does the concept of random sampling contribute to statistical inference and what role does independence play in this context?

Random sampling is integral to statistical inference because it allows us to draw conclusions about the population from which the sample is taken. Each observation in a random sample is an independent drawing from the population, meaning the outcome of one draw does not affect the outcome of another. This independence ensures that the observations {X1, X2, ..., Xn} are independently and identically distributed (i.i.d.), so that the joint distribution of the sample is the product of the individual marginal distributions. This relationship is critical for forming accurate statistical inferences about population parameters based on sample statistics.

How can sample statistics, such as sample minimum, maximum, and range, provide different insights about the data compared to the sample mean?

Sample statistics such as the sample minimum, maximum, and range offer insights about the distribution of data that the sample mean alone cannot provide. The sample minimum and maximum identify the extremes of the data set, revealing variability and potential anomalies. The sample range, calculated as the difference between the maximum and minimum, measures the spread of the data, giving an indication of the dispersion within the sample. These statistics complement the sample mean by providing a fuller picture of the dataset's distribution, highlighting aspects such as the presence of outliers, which are not captured by the mean alone.

How does understanding the concept of convergence in distribution enhance the interpretation of results derived from large samples?

Understanding convergence in distribution, as articulated by the Central Limit Theorem, enhances the interpretation of results from large samples by providing a theoretical foundation for approximating the sampling distribution with a normal distribution. This understanding facilitates the use of standard normal probabilities for inference, offering practical benefits such as simplifying the calculation of confidence intervals and hypothesis tests. It assures researchers that as sample size increases, even non-normally distributed populations (with finite variance) will yield sample means that approximate normality, which supports the application of tests that assume normality of the sample mean. This convergence significantly broadens the scope and applicability of inferential statistics.

How do sample statistics differ from population parameters, and what role do sample statistics play in statistical inference?

Sample statistics, such as the sample mean or sample variance, are computed from a random sample of observations and are used to infer the characteristics of the entire population. Population parameters, such as the mean (μ) or variance (σ^2), are fixed values that describe the entire population. Sample statistics are random variables because their values vary from sample to sample. They play a crucial role in statistical inference, allowing for the estimation of population parameters, testing of hypotheses, and making predictions based on sampled data. Statistical inference techniques use sample statistics to draw conclusions about population parameters with known degrees of uncertainty.

Discuss the characteristics of the exponential distribution and describe how random sampling from this distribution is represented mathematically.

The exponential distribution is characterized by the parameter λ, and its probability density function (pdf) is f(x) = λe^{-λx} for x > 0, and 0 otherwise. When a random sample of size n is drawn from this distribution, the joint probability density function of the sample is the product of the individual pdfs of the observations: g(x) = λ^n e^{-λ(x1+x2+...+xn)}, where x denotes the vector of sample values. The simplification uses the rule that a product of exponentials equals the exponential of the sum of the exponents.