Understanding Statistical Inference

Uploaded by

Akhil Murali

Statistical Inference

Statistical inference is the process of using data analysis to infer properties of the
underlying distribution of a population. It is the branch of statistics that deals with
drawing conclusions about a population based on data from a sample.

Statistical inference is based on probability theory and probability
distributions. It involves making assumptions about the population and the
sample, and using statistical models to analyze the data.

It uses probability theory to estimate parameters, test hypotheses, and
make predictions while accounting for uncertainty.

There are two main branches of statistical inference:

1. Test of Hypothesis

2. Theory of Estimation (or Parameter Estimation)

Parameter Estimation

Parameters are quantified traits or properties of the population you are studying;
they can be inferred from data.
Examples include the population mean, the population variance, and so on.
Imagine measuring every person in a town to find the population mean. This is a
daunting, if not impossible, task. Thus, most of the time, we use estimates.
There are two broad methods of parameter estimation:
● Point Estimation
● Interval Estimation

Hypothesis Testing

Hypothesis testing is used to make decisions or draw conclusions about a

population based on sample data.
It involves formulating a hypothesis about the population parameter, collecting
sample data, and then using statistical methods to determine whether the data
provide enough evidence to reject or fail to reject the hypothesis.

Statistical Inference Methods

There are various methods of statistical inference. Some of these methods are:
● Parametric Methods
● Non-parametric Methods
● Bayesian Methods

Parametric Methods

Parametric statistical methods assume that the data are drawn from a population
characterized by a probability distribution. Most commonly the data are assumed to
follow a normal distribution, which allows one to make inferences about the
population in question.
For example, t-tests and ANOVA are parametric tests that give accurate results
under the assumption that the data are approximately normally distributed.
● Example: A psychologist may ask whether there is a measurable
difference, on average, between the IQ scores of women and men. To test
this, the psychologist draws samples from each group and assumes both are
normally distributed. A parametric test such as a t-test can then be used to
assess whether the mean difference is statistically significant.

Non-Parametric Methods

These methods make fewer assumptions and are more flexible when the data do not
follow a normal distribution. They are also used when one is uncertain whether the
assumptions of parametric methods are met, or when only a small amount of data
is available.
Non-parametric tests include the Wilcoxon signed-rank test and the
Kruskal-Wallis test, among others.
● Example: A biologist has collected data on plant health as an ordinal
variable. Because the sample is small and the normality assumption is not
met, the biologist can use the Kruskal-Wallis test to compare groups.

Bayesian Methods

Bayesian statistics is distinct from conventional methods in that it incorporates prior
knowledge and beliefs. It determines the probability of a hypothesis being true in the
light of current and previous knowledge.
Thus, it allows beliefs to be updated as new data arrive.
● Example: A doctor investigating a new treatment has a prior belief about the
success rate of the treatment. After conducting a new clinical trial, the doctor
uses a Bayesian method to update this prior belief with the data from the trial
and estimate the true success rate of the treatment.
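
The doctor's update can be sketched with a Beta prior and binomial trial data, one common way to model a success rate. This is only an illustration: the class name BetaBelief, the prior counts, and the trial outcome are all hypothetical and do not come from the notes.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Belief about a success rate, modelled as a Beta(alpha, beta) distribution."""
    alpha: float
    beta: float

    def mean(self) -> float:
        # Expected success rate under the current belief.
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "BetaBelief":
        # Conjugate update: add the observed counts to the prior counts.
        return BetaBelief(self.alpha + successes, self.beta + failures)


# Hypothetical prior: belief worth about 6 successes in 10 patients.
prior = BetaBelief(alpha=6, beta=4)
# Hypothetical new trial: 24 successes and 6 failures among 30 patients.
posterior = prior.update(successes=24, failures=6)

print(prior.mean())      # 0.6
print(posterior.mean())  # 0.75
```

The posterior mean sits between the prior mean and the trial's observed rate (24/30 = 0.8), weighted by how much evidence each carries.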

Applications of Statistical Inference

Statistical inference has a wide range of applications across various fields. Here are
some common applications:
● Clinical Trials: In medical research, statistical inference is used to analyze
clinical trial data to determine the effectiveness of new treatments or
interventions. Researchers use statistical methods to compare treatment
groups, assess the significance of results, and make inferences about the
broader population of patients.
● Quality Control: In manufacturing and industrial settings, statistical
inference is used to monitor and improve product quality. Techniques such
as hypothesis testing and control charts are employed to make inferences
about the consistency and reliability of production processes based on
sample data.
● Market Research: In business and marketing, statistical inference is used
to analyze consumer behavior, conduct surveys, and make predictions
about market trends. Businesses use techniques such as regression analysis
and hypothesis testing to draw conclusions about customer preferences,
demand for products, and effectiveness of marketing strategies.
● Economics and Finance: In economics and finance, statistical inference is
used to analyze economic data, forecast trends, and make decisions about
investments and financial markets. Techniques such as time series analysis,
regression modeling, and Monte Carlo simulations are commonly used to
make inferences about economic indicators, asset prices, and risk
management.

Statistics and Parameters: Understanding the Difference

A parameter is a numerical value that describes a characteristic of a
population.

It is usually unknown because it's difficult or impossible to measure an entire
population.

Examples of parameters include:

● Population mean (μ)

● Population standard deviation (σ)
● Population proportion (p)

If we want to know the average height of all adults in a country, the true mean
height (μ) is a parameter, but it is unknown unless we measure every single adult.

A statistic is a numerical value that describes a characteristic of a sample (a
subset of the population).

It is known because we calculate it from collected data.

Examples of statistics include:

● Sample mean (x̄)
● Sample standard deviation (s)
● Sample proportion (p̂)

If we take a random sample of 1,000 adults and find their average height, that
sample mean (x̄) is a statistic, which we use to estimate the population mean (μ).

● Parameters describe the true characteristics of a population but are often
unknown.
● Statistics help us make informed guesses about parameters using sampling
and estimation methods.
● The accuracy of a statistic in estimating a parameter depends on sample
size, randomness, and variability.

Estimation

Estimation has two aims:

● To determine the value of parameters (point estimation)
● To determine the interval in which parameters may lie (interval estimation)

Point Estimation: Provides a single best guess for a population parameter (e.g.,
using the sample mean to estimate the population mean).

Interval Estimation: Provides a range of values within which the population
parameter is likely to lie (e.g., confidence intervals).

A good estimator has the following properties:

Unbiasedness, Efficiency, Consistency, Sufficiency

Point Estimation

Point estimation is the process of using sample data to estimate an unknown
population parameter (such as mean, variance, or proportion).

A point estimator is a statistic (e.g., sample mean or sample proportion) used to
estimate the corresponding population parameter.

For example, to estimate the population mean and population standard deviation,
the corresponding sample statistics, the sample mean x̄ and the sample standard
deviation s, are calculated.

We refer to the sample mean x̄ as the point estimator of the population mean and
the sample standard deviation s as the point estimator of the population standard
deviation. The numerical value obtained for x̄ or s is called the point estimate.

Example of Point Estimation

Consider a factory that produces light bulbs, and the manager wants to estimate the
average lifespan of all light bulbs. Since testing all bulbs is impractical, the
manager randomly selects 50 bulbs, records their lifespans, and calculates the
sample mean.

● Population parameter to estimate: The true mean lifespan (μ) of all bulbs.
● Point estimator used: The sample mean (x̄).

If the average lifespan of the 50 sampled bulbs is 1,200 hours, the point estimate
of the true average lifespan of all bulbs is also 1,200 hours.

Problem

The director of personnel for Electronics Associates, Inc. (EAI), has been assigned
the task of developing a profile of the company’s 2500 managers. The
characteristics to be identified include the mean annual salary for the managers and
the proportion of managers having completed the company’s management training
program.

Using the 2500 managers as the population for this study, we can find the annual
salary and the training program status for each individual by referring to the firm’s
personnel records. The data set containing this information for all 2500 managers
in the population is in the file named EAI.

Interval Estimation

Interval estimation is a statistical technique used to estimate a population
parameter (such as a mean or proportion) using a range of values rather than a
single point estimate.

This range is constructed so that it is likely to contain the true population parameter
with a specified level of confidence. (We find an interval rather than a single point.)

Confidence Interval Estimation

A confidence interval (CI) is a specific type of interval estimate that provides a
range of values within which the true population parameter is expected to lie, based
on sample data.

Instead of saying: “The average height is 165 cm.”

We can say: “We are 95% confident the average height is between 160 cm and
170 cm.”
It is expressed as:

Estimate ± Margin of Error

where the margin of error depends on the sample variability and confidence level
(e.g., 90%, 95%, 99%).

Let's say we take a sample of 50 students and calculate a 95% confidence interval
for the average height, which turns out to be 160–170 cm.

This means that if we repeatedly took similar samples and computed an interval from
each, about 95% of those intervals would contain the true average height of all
students in the population.

The confidence level tells us how sure we are that the procedure captures the true
value. If we repeat the sampling process many times, we expect that a certain
percentage of the resulting intervals will include the true value.

Confidence Level    Meaning
90%                 About 90 out of 100 intervals will include the true value
95%                 About 95 out of 100 intervals will include the true value (most commonly used)
99%                 About 99 out of 100 intervals will include the true value (more conservative)
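
The "out of 100 intervals" reading can be checked with a small simulation. This is a rough sketch only: the population values (mean 165 cm, standard deviation 8 cm), the function name coverage, and the use of a normal critical value are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage(true_mean=165.0, true_sd=8.0, n=50, level=0.95, trials=2000, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    hits = 0
    for _ in range(trials):
        # Draw one sample from the (hypothetical) population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = mean(sample)
        margin = z * stdev(sample) / n ** 0.5
        # Count the interval if it captures the true mean.
        if centre - margin <= true_mean <= centre + margin:
            hits += 1
    return hits / trials


print(coverage())  # expected to be close to 0.95
```

The true mean never changes between trials; only the interval moves from sample to sample, which is why the confidence level describes the procedure rather than any single interval.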

Advantages of Confidence Interval Estimation

More Informative – Provides a range of plausible values instead of a single point
estimate.

Reflects Uncertainty – Accounts for sample variability and provides a measure of
estimation accuracy.

Supports Decision Making – Helps assess the reliability of estimates for policy,
business, and research decisions.

Applicable to Various Parameters – Can estimate means, proportions, and variances.

Confidence Interval Table
Suppose a sample of 50 students has an average test score of 75 with a standard
deviation of 10. A 95% confidence interval for the population mean is calculated
as:
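
The calculation for this example can be sketched as Estimate ± Margin of Error, with margin z × s / √n. The sketch assumes the normal critical value (about 1.96 for 95%) is acceptable for a sample of 50; the function name mean_ci is illustrative and not part of the notes.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(sample_mean, sample_sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # critical value, about 1.96 for 95%
    margin = z * sample_sd / sqrt(n)            # margin of error
    return sample_mean - margin, sample_mean + margin


low, high = mean_ci(sample_mean=75, sample_sd=10, n=50)
print(round(low, 2), round(high, 2))  # 72.23 77.77
```

With a t critical value instead of z the interval would be slightly wider, since the population standard deviation is estimated from the sample.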
Confidence Interval for a Proportion
If a survey finds that 60% of 200 people prefer coffee over tea, the 95% confidence
interval for the true proportion is:
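
For a proportion, the same Estimate ± Margin of Error pattern applies with margin z × √(p̂(1 − p̂)/n). The sketch below uses the standard normal-approximation interval; the function name proportion_ci is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # standard error times z
    return p_hat - margin, p_hat + margin


low_p, high_p = proportion_ci(p_hat=0.60, n=200)
print(round(low_p, 3), round(high_p, 3))  # 0.532 0.668
```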

A researcher takes a sample of 40 students and finds an average IQ of 110 with a
standard deviation of 15. Find the 95% confidence interval for the population mean
IQ.

A survey finds that 300 out of 500 people prefer brand A. Find the 95% confidence
interval for the true proportion of people who prefer brand A.

Q. What does a 90% confidence interval of (0.4, 0.6) mean for a proportion?

We are 90% confident that the true population proportion lies between 40%
and 60%. This means that if we repeated the sampling many times, about 90 out of
100 intervals built this way would contain the true proportion.

Q. If a 95% confidence interval for the mean income of workers is (45,000, 55,000),
does this mean that 95% of workers earn within this range?

No. The confidence interval applies to the population mean, not individual
values. It means we are 95% confident that the true average income falls within
this range, but individual salaries could vary widely.

Q. A 95% confidence interval for the mean weight of apples is (150g, 170g). If we
take a new sample, will the CI be exactly the same?

No. A new sample might give a slightly different mean and CI because of sample
variability, but if we sample repeatedly, about 95% of the computed intervals should
contain the true mean.
Test of Hypothesis

Hypothesis testing is a statistical method used to make decisions or inferences
about a population based on sample data.

Hypothesis testing compares two opposite ideas about a group of people or things
and uses data from a small part of that group (a sample) to decide which idea is
more likely true.

It helps determine whether an assumption (hypothesis) about a population
parameter is likely to be true.

It is used to determine if there is enough evidence to reject a certain claim
(hypothesis) about a population based on sample data.

A statistical hypothesis is a statement about a population parameter (e.g., mean μ,
proportion p, variance σ²).

Classification of Hypothesis Testing

Hypothesis testing can be classified based on different criteria:

Simple vs Composite

(a) Simple Hypothesis

● A hypothesis is called simple when it makes a specific statement about a
population parameter.
● It defines an exact value of the parameter.
● Normally, both the form of the distribution and its parameters are given.
● Example:
○ H0: μ = 50 (The population mean is exactly 50.)
○ H1: μ ≠ 50 (The population mean is not 50; note that this alternative is itself
composite, since it covers many values.)

(b) Composite Hypothesis

● A hypothesis is composite when it does not specify the exact value of the
population parameter but instead provides a range.
● Only the form or only some of the parameters are given.
● Example:
○ H0: μ ≥ 50 (The population mean is at least 50.)
○ H1: μ < 50 (The population mean is less than 50.)

Null vs Alternative

(a) Null Hypothesis (H0)

● Represents the status quo or assumption of no effect.
● It assumes that no significant effect, difference, or relationship exists.
● We do not prove H0; we either reject or fail to reject it.
● Example:
○ H0: μ = 100 (The population mean is 100.)
○ H0: p ≥ 0.5 (The proportion is at least 50%.)

(b) Alternative Hypothesis (H1 or Ha)

● Represents what we want to prove or detect.
● It states that a significant effect, difference, or relationship exists.
● If H0 is rejected, we conclude in favor of H1.
● Example:
○ H1: μ ≠ 100 (The population mean is different from 100.)
○ H1: p < 0.5 (The proportion is less than 50%.)

Test statistics

A test statistic is the quantity on which a hypothesis test is based.

It is computed from sample data and used to make inferences about a population.

Hypothesis Testing – The process of making decisions about a population based
on sample data. It includes:

● Null Hypothesis (H₀): Assumes no effect or no difference.
● Alternative Hypothesis (H₁ or Ha): Represents what you want to prove.

Test Statistic – A numerical value calculated from sample data that helps
determine whether to reject the null hypothesis. Common tests include:

● Z-test: Used when population variance is known and the sample size is
large.
● T-test: Used when the population variance is unknown and the sample size
is small.
● Chi-square test (χ²): Used for categorical data and independence testing.
● F-test: Used to compare variances between groups.

P-value – The probability of obtaining a test statistic at least as extreme as the one
observed, assuming the null hypothesis is true. A small p-value (e.g., <0.05)
suggests rejecting H₀.

In simpler words, it is the chance of seeing data at least this extreme if the null
hypothesis is true. If this is less than α, we reject the null hypothesis; otherwise
we fail to reject it.

Significance Level (α) – The threshold for rejecting H₀, commonly set at 0.05 or
0.01.

Confidence Intervals – A range of values within which the true population
parameter is likely to fall, often used alongside hypothesis testing.

Level of Significance (α)

The level of significance (α) is the probability of rejecting a true null hypothesis
(H₀).

It represents the risk of making a Type I error: incorrectly concluding that there
is an effect when there isn't one.

Common values of α:

● 0.05 (5%): Most commonly used. Means there's a 5% chance of wrongly
rejecting H₀ when it is true.
● 0.01 (1%): Stricter, used in cases requiring high accuracy (e.g., medical
trials).
● 0.10 (10%): Less strict, used in exploratory research.

Example 1: Coin Toss Fairness

● H₀: The coin is fair (50% heads, 50% tails).

● H₁: The coin is biased.
○ If we set α = 0.05, we accept a 5% chance that we wrongly conclude
the coin is biased when it’s actually fair.
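
A fairness test of this kind might be carried out with an exact binomial calculation, as sketched below. The outcome used (60 heads in 100 tosses) and the function name binomial_two_sided_p are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_two_sided_p(heads, tosses):
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5)."""
    # Probability of each possible number of heads under a fair coin.
    probs = [comb(tosses, k) * 0.5 ** tosses for k in range(tosses + 1)]
    observed = probs[heads]
    # Add up every outcome that is no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


alpha = 0.05
p_value = binomial_two_sided_p(heads=60, tosses=100)  # hypothetical data
print(round(p_value, 3))  # roughly 0.057
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

With these made-up numbers the p-value lands just above 0.05, so at α = 0.05 the data are not quite enough to call the coin biased.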
Example 2: New Drug Effectiveness

● H₀: The new drug has no effect.

● H₁: The new drug is effective.
○ If α = 0.01, we only allow a 1% chance of mistakenly concluding that
the drug works when it actually doesn’t.

Degrees of Freedom (df) in Statistics

The degrees of freedom (df) refer to the number of independent values that can
vary in a statistical calculation.

They help determine the critical values of test statistics (e.g., t-test, chi-square test).

Formula for Degrees of Freedom

● For a single sample: df = n − 1
● For two samples (independent t-test): df = (n1 − 1) + (n2 − 1)
● For Chi-square test: df = (rows − 1) × (columns − 1)

Example 1: Average of 5 Numbers

Imagine you have 5 numbers, and their average is 10. If you know 4 of them (e.g.,
8, 12, 9, 11), the 5th number is automatically determined to keep the average at
10.

● You can freely choose 4 numbers, but the 5th is not independent.
● So, df = 5 - 1 = 4.

Example 2: t-test for a Sample Mean

Suppose a teacher collects test scores from 20 students to compare against the
expected average. Since the sample mean is estimated from the same data, one degree
of freedom is used up.

● The degrees of freedom = n − 1 = 20 − 1 = 19

Example 3: Chi-square Test for a 3×2 Table

If we conduct a chi-square test on a table with 3 rows and 2 columns, the degrees
of freedom are:

● df=(3−1)×(2−1)=2×1=2
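
The three formulas and the worked examples above can be reproduced with a few lines of Python. The function names are illustrative only.

```python
def df_one_sample(n):
    """Degrees of freedom for a single sample."""
    return n - 1


def df_two_sample(n1, n2):
    """Degrees of freedom for an independent two-sample t-test."""
    return (n1 - 1) + (n2 - 1)


def df_chi_square(rows, columns):
    """Degrees of freedom for a chi-square test on a contingency table."""
    return (rows - 1) * (columns - 1)


# Example 1: four known numbers and a fixed average of 10 force the fifth value.
known = [8, 12, 9, 11]
fifth = 10 * 5 - sum(known)
print(fifth)                # 10

print(df_one_sample(5))     # 4
print(df_one_sample(20))    # 19
print(df_chi_square(3, 2))  # 2
```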

Procedure for Hypothesis Testing (6 Steps)

Step 5: We compare the test statistic to a critical value from a statistical table or
use the p-value:

1. Using Critical Value (right-tailed test; for a two-tailed test, compare the absolute
value of the test statistic with the critical value):
● If test statistic > critical value → reject H0.
● If test statistic ≤ critical value → fail to reject H0.

2. Using P-value:
● If p-value ≤ α → reject H0.
● If p-value > α → fail to reject H0.

Example: If p-value is 0.03 and α is 0.05, we reject the null hypothesis because
0.03 < 0.05.
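
Both decision rules are simple comparisons, as the sketch below shows. The function names are hypothetical, and the two_tailed flag (which compares the absolute value of the statistic) is an added convenience for two-sided tests.

```python
def decide_by_p_value(p_value, alpha=0.05):
    """Reject H0 when the p-value does not exceed the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


def decide_by_critical_value(test_statistic, critical_value, two_tailed=False):
    """Reject H0 when the test statistic lies beyond the critical value."""
    # For a two-tailed test the sign of the statistic does not matter.
    stat = abs(test_statistic) if two_tailed else test_statistic
    return "reject H0" if stat > critical_value else "fail to reject H0"


print(decide_by_p_value(0.03, alpha=0.05))                     # reject H0
print(decide_by_critical_value(1.20, 1.645))                   # fail to reject H0
print(decide_by_critical_value(-2.30, 1.96, two_tailed=True))  # reject H0
```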

Errors in Hypothesis Testing

A test can end in one of four outcomes:

● Fail to reject H0 when H0 is true (correct decision)
● Reject H0 when H0 is false (correct decision)
● Reject H0 when H0 is true (Type I Error)
● Fail to reject H0 when H0 is false (Type II Error)

Type I Error (False Positive)

● Occurs when we reject a true null hypothesis (H₀).

● We conclude that there is an effect when, in reality, there is none.
● The probability of making a Type I error is α (significance level).

Example:

● H₀: A person is innocent.

● H₁: A person is guilty.
● Type I Error: Convicting an innocent person.

Type II Error (False Negative)

● Occurs when we fail to reject a false null hypothesis (H₀).
● We conclude that there is no effect when, in reality, there is one.
● The probability of making a Type II error is β; the power of the test is 1 − β.

Example:

● H₀: A person is innocent.

● H₁: A person is guilty.
● Type II Error: Letting a guilty person go free.

Decision                      H0 is True                      H0 is False
Fail to reject H0 (Accept)    Correct Decision                Type II Error (False Negative)
Reject H0                     Type I Error (False Positive)   Correct Decision

Example 1: Medical Testing

● H₀: A patient does not have cancer.

● H₁: A patient has cancer.
● Type I Error: Diagnosing a healthy person with cancer (false alarm).
● Type II Error: Failing to detect cancer in a sick patient (missed diagnosis).

Example 2: Fire Alarm System

● H₀: No fire.
● H₁: Fire is present.
● Type I Error: Alarm goes off when there's no fire (false alarm).
● Type II Error: No alarm when there is a fire (dangerous mistake).

A factory produces light bulbs, and the average lifespan of a bulb is known to be
1,000 hours. A new manufacturing process is introduced, and the factory claims
that the average lifespan has increased. We take a sample of 30 bulbs, which have
an average lifespan of 1,050 hours with a standard deviation of 120 hours.

Test this claim at a 5% significance level (α = 0.05) using a one-tailed t-test.
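
One way to work this problem is sketched below. It takes H₀: μ = 1,000 against H₁: μ > 1,000, and reads the critical value 1.699 (one-tailed, α = 0.05, df = 29) from a standard t-table rather than computing it; the variable names are illustrative.

```python
from math import sqrt

mu_0 = 1000    # mean lifespan under H0 (hours)
x_bar = 1050   # sample mean (hours)
s = 120        # sample standard deviation (hours)
n = 30         # sample size

# t statistic for a one-sample test of the mean.
t_stat = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1

# Critical value taken from a standard t-table (one-tailed, alpha = 0.05, df = 29).
t_critical = 1.699

print(round(t_stat, 2), df)  # 2.28 29
print("reject H0" if t_stat > t_critical else "fail to reject H0")  # reject H0
```

Since 2.28 exceeds 1.699, the sample supports the claim that the mean lifespan has increased at the 5% level.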

Regions in Hypothesis Testing

In hypothesis testing, we divide the range of possible test statistic values into two
regions:

1. Acceptance Region – The range where we fail to reject the null hypothesis
(H₀).
2. Rejection (Critical) Region – The range where we reject the null
hypothesis (H₀) in favor of the alternative hypothesis (H₁).
Critical Region (Rejection Region)

● This is the range of values where, if the test statistic falls within it, we reject
H₀.
● Defined based on the significance level (α).
● It represents extreme values that are unlikely if H₀ is true.

Acceptance Region

● This is the range of values where we fail to reject H₀.

● If the test statistic falls in this region, we do not have enough evidence to
support H₁.
● Typically, this is the central part of the probability distribution.

One-Tailed and Two-Tailed Test Regions in Hypothesis Testing

In hypothesis testing, the type of test (one-tailed or two-tailed) determines how we
set up the rejection (critical) region.

One-Tailed Test

Used when we expect a change in only one direction, either up or down, but not
both. For example, if testing whether a new algorithm improves accuracy, we only
check if accuracy increases. The whole rejection region lies in one tail of the
distribution.

Two-Tailed Test

Used when we want to see if there is a difference in either direction, higher or
lower. The rejection region is split between both tails.
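
The effect of the tail choice on the critical region can be seen from the critical values themselves. The sketch below assumes a z-test (standard normal distribution) at α = 0.05, a choice made here for illustration.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed (right-tailed): all of alpha sits in the upper tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)

# Two-tailed: alpha is split equally, alpha / 2 in each tail.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3))  # 1.645 -> reject H0 if z > 1.645
print(round(z_two_tailed, 3))  # 1.96  -> reject H0 if |z| > 1.96
```

The two-tailed cutoff is further out because each tail only gets half of α.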

Chi-Square Test

A Chi-Square test is a statistical test used to determine if there is a significant
association between two categorical variables.

It compares the observed frequencies in a dataset with the expected frequencies to
see whether the differences could be due to chance.

Types of Chi-Square Tests

Chi-Square Goodness of Fit Test – Determines if a sample distribution fits an
expected distribution.

Chi-Square Test for Independence – Checks if two categorical variables are
independent.

Chi-Square Test of Independence

The Chi-Square Test of Independence is a statistical test used to determine
whether there is a significant association between two categorical variables.

It helps to assess whether the distribution of sample data differs from what we
would expect if the variables were independent.
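
As a rough sketch of the computation, expected counts under independence are (row total × column total) / grand total, and the statistic sums (O − E)² / E over all cells. The 2×2 counts below are invented for illustration and are not data from the notes; the critical value 3.841 (α = 0.05, df = 1) is taken from a standard chi-square table.

```python
def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = smoker / non-smoker, columns = disease / no disease.
table = [[30, 70],
         [10, 90]]
chi2_ind, df_ind = chi_square_independence(table)
critical_ind = 3.841  # chi-square table value for alpha = 0.05, df = 1

print(chi2_ind, df_ind)  # 12.5 1
print("reject H0" if chi2_ind > critical_ind else "fail to reject H0")  # reject H0
```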

Testing Procedure for Chi-Square Test of Independence

Smoking and Lung Disease

Chi-Square Test for Goodness of Fit

A sample of the examination results of 200 students was taken. It was found that 46
students had failed, 68 secured third class, 62 second class, and the rest were placed in
the first division. Are these figures commensurate with the general examination
results, which are in the ratio 2 : 3 : 3 : 2 for the respective categories? Test at the 5%
significance level.

Step 2: Compute Expected Frequencies (E)

Using the given ratio:

● Failed: E = 2 / 10 × 200 = 40
● III Class: E = 3 / 10 × 200 = 60
● II Class: E = 3 / 10 × 200 = 60
● I Class: E = 2 / 10 × 200 = 40

Step 3: Compute Chi-Square Value

Using χ² = Σ (O − E)² / E with observed counts 46, 68, 62 and 24:

χ² = 0.9 + 1.07 + 0.07 + 6.4 = 8.44
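
The arithmetic can be checked, and the test completed, with the sketch below. The first-division count is obtained as 200 − 46 − 68 − 62 = 24, and the critical value 7.815 (α = 0.05, df = 3) is read from a standard chi-square table; variable names are illustrative. Summing unrounded terms gives about 8.43, which differs slightly from the 8.44 obtained from rounded terms.

```python
observed = {"Failed": 46, "III Class": 68, "II Class": 62, "I Class": 200 - 46 - 68 - 62}
ratio = {"Failed": 2, "III Class": 3, "II Class": 3, "I Class": 2}

total = sum(observed.values())
ratio_sum = sum(ratio.values())

# Expected frequency for each category under the 2 : 3 : 3 : 2 ratio.
expected = {k: r / ratio_sum * total for k, r in ratio.items()}

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
chi2_gof = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df_gof = len(observed) - 1

critical_gof = 7.815  # chi-square table value for alpha = 0.05, df = 3

print(round(chi2_gof, 2), df_gof)  # 8.43 3
print("reject H0" if chi2_gof > critical_gof else "fail to reject H0")  # reject H0
```

Because the statistic exceeds 7.815, the sample figures do not agree with the 2 : 3 : 3 : 2 ratio at the 5% level.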

F Test

An F-test is a statistical test that compares the variances of two or more groups to
determine if they are significantly different.

It is commonly used in ANOVA (Analysis of Variance) and regression analysis.

If F is significantly large or small, it suggests a difference in variances.

A teacher wants to compare the variance in test scores between two different
classes to see if they have equal variability.
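
A minimal sketch of the teacher's comparison follows. The scores are hypothetical, and the larger sample variance is placed in the numerator so that F ≥ 1, a common textbook convention. The resulting F would then be compared with an F-table critical value for the two degrees of freedom, which is not computed here.

```python
from statistics import variance

# Hypothetical test scores for the two classes.
class_a = [72, 85, 90, 66, 78, 95, 60, 88]
class_b = [75, 80, 78, 82, 77, 79, 81, 76]

var_a = variance(class_a)  # sample variance (divides by n - 1)
var_b = variance(class_b)

# F statistic: ratio of the larger sample variance to the smaller one.
if var_a >= var_b:
    f_stat, df_num, df_den = var_a / var_b, len(class_a) - 1, len(class_b) - 1
else:
    f_stat, df_num, df_den = var_b / var_a, len(class_b) - 1, len(class_a) - 1

print(round(f_stat, 2), df_num, df_den)
```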

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) is a statistical method used to compare the means
of three or more groups to determine if there are significant differences among
them.

It helps to test the hypothesis that all group means are equal versus at least one
mean being different.

Types of ANOVA

One-Way ANOVA (Single Factor ANOVA)

● Compares means of one independent variable across multiple groups.

● Example: Comparing students' test scores in three different teaching
methods.

Two-Way ANOVA

● Compares means with two independent variables (factors).

● Example: Studying the effect of diet type and exercise level on weight loss.

Steps to Solve ANOVA
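
The usual one-way ANOVA computation can be sketched as follows: split the variation into between-group and within-group sums of squares, divide each by its degrees of freedom, and take the ratio as F. The scores for the three teaching methods are hypothetical, and the critical value 5.14 (α = 0.05, df = 2 and 6) is taken from a standard F-table.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical test scores under three teaching methods.
method_a = [80, 85, 90]
method_b = [70, 75, 80]
method_c = [60, 65, 70]

f_anova, df_b, df_w = one_way_anova([method_a, method_b, method_c])
f_critical = 5.14  # F-table value for alpha = 0.05, df = (2, 6)

print(f_anova, df_b, df_w)  # 12.0 2 6
print("reject H0" if f_anova > f_critical else "fail to reject H0")  # reject H0
```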

Bayesian Methods in Clinical Trials Guide
29 pages
Kelley & Rausch, 2006, AIPE For STD Mean Diff
No ratings yet
Kelley & Rausch, 2006, AIPE For STD Mean Diff
23 pages

