Statistical Inference for Two Populations

Uploaded by

Hannah Lorraine Santos

Statistical inference concerning two populations
Inferences concerning the difference between two means
Independent random samples

- Samples that are completely unrelated to one another
- The samples are clearly delineated
- The selection of one sample is in no way influenced by the selection of the other
- Example: battery lives of Brand A versus Brand B
- Example: difference between male and female salaries

Sample statistics

- Used to estimate a population parameter
- $\bar{X}$ is the point estimator for the population mean $\mu$
- $\bar{X}$ has to be normally distributed or approximately normally distributed (the sample data come from a normal population, or the sample size is large, $n > 30$)

Two means sample stats

Population variances

Confidence interval

$d_0$ = hypothesized difference

Inference concerning mean differences

- A crucial assumption for inferences about $\mu_1 - \mu_2$ is that the samples are drawn independently.
- A common case of dependent sampling is matched pairs (a comparison of apples with apples).
- Matched pairs arise in before-and-after studies
- They also arise from the pairing of observations
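
For matched pairs, the analysis works on the differences within each pair rather than on two separate samples. A minimal sketch, assuming the usual paired t statistic $(\bar{d} - d_0)/(s_d/\sqrt{n})$ with $n - 1$ degrees of freedom (the notes do not show the formula, so this is an assumption), and using hypothetical before/after data:

```python
import math
import statistics

def paired_t_statistic(before, after, d0=0.0):
    """Return (mean difference, t statistic, degrees of freedom) for matched pairs."""
    if len(before) != len(after):
        raise ValueError("matched pairs need two samples of equal length")
    # One difference per pair: this is the single sample that gets analysed
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    t_stat = (d_bar - d0) / (s_d / math.sqrt(n))
    return d_bar, t_stat, n - 1

# Hypothetical before-and-after measurements on the same five subjects
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [8.0, 11.0, 9.0, 9.0, 10.0]
d_bar, t_stat, df = paired_t_statistic(before, after)
print(d_bar, round(t_stat, 3), df)
```

The t statistic would then be compared with a critical value from the $t_{df}$ distribution.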
Inference concerning the difference between two proportions

- As with inferences for $\mu_1 - \mu_2$, we consider inferences for $p_1 - p_2$ under independent sampling.
- The estimator for $p_1 - p_2$ is $\bar{P}_1 - \bar{P}_2$.
- If $n_1$ and $n_2$ are sufficiently large, the sampling distribution of $\bar{P}_1 - \bar{P}_2$ can be approximated by the normal distribution. (The general guideline is that $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, and $n_2(1 - p_2)$ must all be greater than or equal to 5.)
- For independent samples, the standard error is $se(\bar{P}_1 - \bar{P}_2) = \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$.
- Since the population proportions are not known, use the sample proportions to estimate $se(\bar{P}_1 - \bar{P}_2)$ in the margin of error.
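
As an illustration of the points above, the following sketch builds a confidence interval for $p_1 - p_2$ from sample proportions. The function name, the counts, and the use of a z critical value from the standard library are assumptions made for the example; the large-sample guideline is checked with the sample proportions because the population proportions are unknown.

```python
import math
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation confidence interval for p1 - p2 (independent samples)."""
    p1, p2 = x1 / n1, x2 / n2  # sample proportions
    # Guideline from the notes: each of these four counts should be at least 5
    counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    if min(counts) < 5:
        raise ValueError("samples too small for the normal approximation")
    # Standard error estimated with the sample proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical data: 60 of 100 successes in sample 1, 45 of 100 in sample 2
low, high = two_proportion_ci(60, 100, 45, 100)
print(round(low, 4), round(high, 4))
```

If the interval excludes zero, the data suggest that the two population proportions differ at the chosen confidence level.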
Statistical inference concerning variance
Introduction

- We use the sample variance $S^2$ as an estimator of the population variance $\sigma^2$.
- To make inferences regarding $\sigma^2$, we first need the sampling distribution of $S^2$.
- This requires a new distribution.

Statistical inferences regarding $\sigma^2$ are based on the $\chi^2$ or chi-square distribution.

The $\chi^2$ distribution is a family of distributions.

- Like $t_{df}$, $\chi^2_{df}$ depends on $df$.
- Probability distribution for the sum of several independent squared standard normal random variables
- $df$ is the number of squared standard normal random variables in the summation

$S^2$ is based on the squared differences between the sample values and the sample mean.

Skewness

- We sometimes need values in the left tail of the distribution.
- $\chi^2_{df}$ is not symmetric (unlike $z$ or $t_{df}$), so values in the lower tail are not the negative of values in the upper tail.
- Let $\chi^2_{1-\alpha,df}$ represent a value such that the area to the right is $1 - \alpha$ and the area to the left is $\alpha$.
- $P(\chi^2_{df} \ge \chi^2_{1-\alpha,df}) = 1 - \alpha$
- $P(\chi^2_{df} < \chi^2_{1-\alpha,df}) = \alpha$
- Unlike in past chapters, we use the notation $\chi^2_{df}$ to represent a random variable as well as its value.
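
Because the distribution is not symmetric, upper-tail and lower-tail values have to be looked up separately. A small sketch using `scipy.stats.chi2` (SciPy is an assumption here; the notes presumably rely on printed tables), with the subscript read as the area to the right:

```python
from scipy.stats import chi2

def chi2_value(area_right, df):
    """Value with the given area to its right under the chi-square(df) curve."""
    return chi2.isf(area_right, df)  # inverse survival function

df = 10
upper = chi2_value(0.05, df)  # chi^2_{0.05, 10}: 5% of the area lies to the right
lower = chi2_value(0.95, df)  # chi^2_{0.95, 10}: 95% to the right, 5% to the left
print(round(upper, 3), round(lower, 3))
```

Both values are positive, which shows that the lower-tail value is not simply the negative of the upper-tail value.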
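Confidence interval for population variance

- Take a sample of size $n$ from a normal population with finite variance.
- The statistic $\chi^2_{df} = \dfrac{(n-1)S^2}{\sigma^2}$ has a $\chi^2_{df}$ distribution with $df = n - 1$.
- Since $\chi^2_{df}$ is not symmetric, the confidence interval does not follow the form of point estimate ± margin of error.
- Start with $P(\chi^2_{1-\alpha/2,df} \le \chi^2_{df} \le \chi^2_{\alpha/2,df}) = 1 - \alpha$.
- Substituting $\chi^2_{df} = (n-1)S^2/\sigma^2$ and solving for $\sigma^2$ gives the interval $\left[\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,df}},\ \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,df}}\right]$.
- For the standard deviation (SD), take the square root of each limit.

Hypothesis testing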
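
A minimal sketch of both procedures for $\sigma^2$, assuming a normal population. The interval follows the probability statement above; the test statistic $(n-1)s^2/\sigma_0^2$ for a hypothesized value $\sigma_0^2$ is the standard choice but is not shown in the notes, so treat it as an assumption. The data and the value `sigma0_sq=0.1` are hypothetical.

```python
import math
import statistics
from scipy.stats import chi2

def variance_ci(sample, confidence=0.95):
    """Confidence interval for sigma^2 based on the chi-square distribution."""
    df = len(sample) - 1
    s2 = statistics.variance(sample)          # sample variance S^2
    alpha = 1 - confidence
    upper_crit = chi2.isf(alpha / 2, df)      # area alpha/2 to the right
    lower_crit = chi2.isf(1 - alpha / 2, df)  # area 1 - alpha/2 to the right
    # Dividing by the larger critical value gives the lower limit
    return df * s2 / upper_crit, df * s2 / lower_crit

def variance_test(sample, sigma0_sq):
    """Test statistic and right-tail p-value for H0: sigma^2 <= sigma0_sq."""
    df = len(sample) - 1
    stat = df * statistics.variance(sample) / sigma0_sq
    return stat, chi2.sf(stat, df)

sample = [4.1, 5.2, 3.9, 4.8, 5.5, 4.4, 5.0, 4.6]
var_low, var_high = variance_ci(sample)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)  # interval for the SD
stat, p_right = variance_test(sample, sigma0_sq=0.1)
print(var_low, var_high, stat, p_right)
```

The interval is not centered on $s^2$, which reflects the asymmetry of the chi-square distribution.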
Inference concerning the ratio of two population variances

- We compare two population variances, $\sigma_1^2$ and $\sigma_2^2$, through the ratio $\sigma_1^2/\sigma_2^2$.
- If $\sigma_1^2 = \sigma_2^2$, then $\sigma_1^2/\sigma_2^2 = 1$.

Sampling distribution

When the two populations are normal and $\sigma_1^2 = \sigma_2^2$, the sampling distribution of $S_1^2/S_2^2$ is the $F$ distribution.

The $F$ distribution is characterized by a family of distributions.

- Like $t_{df}$ and $\chi^2_{df}$, $F$ depends on degrees of freedom, but it has two of them
- $df_1$ is the numerator degrees of freedom
- $df_2$ is the denominator degrees of freedom
- It is the probability distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom.
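
The ratio of sample variances can be compared with the $F_{(df_1, df_2)}$ distribution as sketched here. The two-tailed p-value rule (double the smaller tail area) and the sample data are assumptions for illustration, and the test presumes normal populations.

```python
import statistics
from scipy.stats import f

def variance_ratio_test(sample1, sample2):
    """F statistic and two-tailed p-value for H0: sigma1^2 / sigma2^2 = 1."""
    s1_sq = statistics.variance(sample1)
    s2_sq = statistics.variance(sample2)
    df1, df2 = len(sample1) - 1, len(sample2) - 1  # numerator, denominator df
    f_stat = s1_sq / s2_sq
    # Double the smaller tail area to obtain a two-tailed p-value
    tail = min(f.cdf(f_stat, df1, df2), f.sf(f_stat, df1, df2))
    return f_stat, df1, df2, min(1.0, 2 * tail)

# Hypothetical battery lives (hours) for two brands
brand_a = [10, 12, 14, 16, 18]
brand_b = [11, 12, 13, 14, 15]
f_stat, df1, df2, p_value = variance_ratio_test(brand_a, brand_b)
print(f_stat, df1, df2, round(p_value, 3))
```

A ratio far from 1 in either direction counts as evidence against equal variances, which is why both tails are considered.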
Chi-square tests
Goodness of fit test for multinomial experiment

Bernoulli process

- Also known as binomial experiment

- There is a series of n independent and
identical trials of an experiment.
- Each trial has only two outcomes:
success and failure.
- The probability of success is denoted
as 𝑝 and the probability of failure is
denoted as 1 − 𝑝.
- Let 𝑝1 and 𝑝2 represent these
probabilities, 𝑝1 + 𝑝2 = 1.

Goodness of fit

- About the probabilities/proportions of the multinomial experiment

Choices for competing hypotheses

- Set all the population proportions equal to the same specific value, so that they are equal to one another.
- Set each population proportion equal to a different predetermined (hypothesized) value.

Ex. Suppose you have four different candidates.

- Is the proportion of voters who favor each candidate not the same?
- Contest specific values for the proportions
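
For the four-candidate example, a goodness-of-fit test can be sketched as follows. The notes do not show the test statistic, so the Pearson form $\sum (o_i - e_i)^2 / e_i$ with $k - 1$ degrees of freedom is an assumption here, and the vote counts are hypothetical.

```python
from scipy.stats import chi2

def goodness_of_fit(observed, hypothesized_props):
    """Chi-square goodness-of-fit statistic, degrees of freedom, and p-value."""
    n = sum(observed)
    # Expected frequency in each category under the null hypothesis
    expected = [n * p for p in hypothesized_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2.sf(stat, df)  # right-tailed test

# H0: all four candidates are favored equally (each proportion is 0.25)
votes = [30, 20, 25, 25]
stat, df, p_value = goodness_of_fit(votes, [0.25, 0.25, 0.25, 0.25])
print(stat, df, round(p_value, 4))
```

Passing a different list of proportions covers the second choice of hypotheses, where each category has its own hypothesized value.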

Regression analysis

Sample covariance

- Measures the direction of the linear relationship between two variables $x$ and $y$
- Negative: negative linear relationship
- Positive: positive linear relationship
- Zero: no linear relationship
- Further interpretation is difficult because it is sensitive to units.
- Cannot be used to determine the strength of the linear relationship
- The sample correlation coefficient $r_{xy}$ is easier to interpret.

Sample correlation coefficient $r_{xy}$ (R)

- Describes both the direction and strength of the linear relationship between $x$ and $y$
- The correlation is unit-free
- Negative: negative linear relationship
- Positive: positive linear relationship
- Zero: no linear relationship
- The correlation is between -1 and 1
- Correlation is -1: perfect negative linear relationship
- Correlation is 0: not linearly related
- Correlation is 1: perfect positive linear relationship

Hypothesis testing

Hypothesis test for the correlation coefficient

- To determine whether the linear relationship implied by the sample correlation coefficient is real or due to chance
- Let $\rho_{xy}$ denote the population correlation coefficient.

Test statistic: $t_{df} = \dfrac{r_{xy}\sqrt{n-2}}{\sqrt{1 - r_{xy}^2}}$ with $df = n - 2$

- The correlation coefficient captures only a linear relationship.
- The correlation coefficient may not be a reliable measure in the presence of outliers.
- Correlation does not imply causation.
- Even if two variables are highly correlated, one does not necessarily cause the other.
- The correlation cannot be used for prediction
- To predict values, we need a model.

The linear regression model

- Regression analysis is one of the most widely used methodologies in business.
- One variable, called the response variable, is influenced by other variables, called the explanatory variables.
- The input or predictor variables are denoted as $x_1, x_2, \cdots, x_k$

- Use information on the explanatory variables to predict or describe changes in the response variable.
- Allows us to make predictions regarding the response variable.
- Just like correlation, regression has limitations.
- A regression model appears to search for causality when it basically detects correlation.
- We cannot use standard regression analysis to establish cause-and-effect relationships.
- Causality can only be established through randomized experiments and/or advanced statistical models, which are outside the scope of this text.
- No matter the response, we cannot expect to predict its exact value.
- If the value of the response variable is uniquely determined by the values of the explanatory variables, we say that the relationship is deterministic.
- In most fields, we find that the relationship between the explanatory variables and the response is stochastic due to the omission of relevant factors (sometimes not measurable) that influence the response variable.
- Develop a mathematical model that captures the relationship between the response variable and explanatory variables
- The target or response is denoted as $y$
- $\hat{y}$ (read as y-hat) is the predicted value of the response variable given a value of $x$.
- The difference between the observed and the predicted values is the residual: $e = y - \hat{y}$.

Components of a linear regression model

- Deterministic: approximates the relationship we want to model
- Stochastic: random error term

Simple linear regression model

- Uses only one predictor/explanatory variable
- A multiple linear regression model is used for many explanatory variables
- Has the familiar straight-line form ($y = mx + b$) plus a random error term: $y = \beta_0 + \beta_1 x + \varepsilon$
- The expected value of $y$ for a given value of $x$ lies on a straight line: $E(y) = \beta_0 + \beta_1 x$.
- The slope parameter $\beta_1$ determines whether the linear relationship is positive or negative.
- The population parameters $\beta_0$ and $\beta_1$ are unknown and must be estimated.
- The sample regression equation is $\hat{y} = b_0 + b_1 x$.
- $b_1$ is the change in $\hat{y}$ when $x$ increases by one unit
- $b_0$ is the predicted value when $x$ has a value of zero; it is not always meaningful
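
To connect the covariance, the correlation coefficient, and the fitted line, here is a minimal sketch on hypothetical data. The estimates use the standard least-squares formulas $b_1 = s_{xy}/s_x^2$ and $b_0 = \bar{y} - b_1\bar{x}$, and the correlation test uses $t = r_{xy}\sqrt{n-2}/\sqrt{1 - r_{xy}^2}$; the notes do not spell out the estimation formulas, so treat them as assumptions.

```python
import math

def describe_linear_relationship(x, y):
    """Sample covariance, correlation, its t statistic, and the fitted line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    r_xy = s_xy / math.sqrt(var_x * var_y)  # unit-free, between -1 and 1
    # t statistic for H0: rho = 0 (undefined when r_xy is exactly +1 or -1)
    t_stat = r_xy * math.sqrt(n - 2) / math.sqrt(1 - r_xy ** 2)
    b1 = s_xy / var_x          # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]  # e = y - y_hat
    return {"s_xy": s_xy, "r_xy": r_xy, "t": t_stat,
            "b0": b0, "b1": b1, "residuals": residuals}

# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = describe_linear_relationship(x, y)
print(result["s_xy"], round(result["r_xy"], 4), result["b0"], result["b1"])
```

The covariance and the slope share the same sign, while only the correlation is free of the units of $x$ and $y$.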

Method of least squares

- Also referred to as ordinary least squares (OLS)
- Common approach to fitting a line to a scatterplot
- Used to obtain estimates of $\beta_0$ and $\beta_1$
- OLS chooses the line ($b_0$ and $b_1$) to minimize the error sum of squares, $SSE = \sum (y - \hat{y})^2 = \sum e^2$.
- $SSE$ is the sum of the squared differences between the observed values and their predicted values, that is, the sum of squared differences from the regression equation.
- OLS estimators have desirable properties if certain assumptions hold
- Gives an equation “closest” to the data

Multiple linear regression model

- Having only one explanatory variable might reduce the usefulness of the model
- A multiple linear regression model allows us to examine how the response is influenced by two or more explanatory variables.
- The choices of the explanatory variables are based on economic theory, intuition, and/or prior research.
- Has at least two predictor/explanatory variables
- $b_j$ measures the change in the predicted value of the response given a unit increase in $x_j$, holding all other predictor variables constant.
- The coefficients now represent the partial influence of $x_j$ on $\hat{y}$.

Goodness of fit measures

- By simply observing the sample regression equation, we cannot assess how well the explanatory variables explain the variation in the response variable.
- We rely on several objective “goodness-of-fit” measures that summarize how well the sample regression equation fits the data.
- If each predicted value is equal to its observed value, then we have a perfect fit.
- Since that almost never happens, we evaluate the models on a relative basis, that is, on the basis of the relative magnitude of the residuals.
- The sample regression equation provides a good fit when the dispersion of the residuals is relatively small.

The standard error of the estimate measures the standard deviation of the residuals: $s_e = \sqrt{\dfrac{SSE}{n - k - 1}}$

- $s_e^2$ can be interpreted as the “average” squared residual.
- $s_e$ can take any value between 0 and infinity.
- The less dispersion, the smaller the $s_e$ and the better the model fits the data.
- We use the standard error of the estimate in conjunction with other measures to judge the overall usefulness of a model.
- For a given sample size $n$, increasing the number of explanatory variables $k$ reduces both $SSE$ and the denominator $(n - k - 1)$.
- The net effect, shown by the value of $s_e$, allows us to determine if the added explanatory variables improve the fit of the model.

Coefficient of determination ($R^2$)

- The coefficient of determination, denoted by $R^2$, quantifies the proportion of the sample variation in the response variable that is explained by the sample regression equation.
- How well a statistical model predicts an outcome
- For example, if $R^2 = 0.72$, then 72% of the variation in the response is explained by the sample regression equation, and other factors not included in the model explain the remaining 28%.

Use ANOVA in the context of the linear regression model to derive $R^2$.

- Take the total sum of squares ($SST$) and break it into two parts: $SST = SSR + SSE$.
- Explained variation (model): $SSR$
- Unexplained variation (error): $SSE$
- It follows that $R^2 = SSR/SST = 1 - SSE/SST$.
- $R^2$ cannot be used for comparing models that do not include the same number of explanatory variables.
- $R^2$ never decreases as you add more predictors.
- It is possible to increase $R^2$ by including a group of explanatory variables that have no foundation in the model.
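
Both measures can be computed from the observed and predicted values. The sketch below assumes $s_e = \sqrt{SSE/(n-k-1)}$ and $R^2 = 1 - SSE/SST$ for a least-squares fit with an intercept; the fitted values are hypothetical and come from the line $\hat{y} = 2.2 + 0.6x$ for $x = 1, \dots, 5$.

```python
import math

def fit_measures(y, y_hat, k):
    """Return (SSE, SST, standard error of the estimate, R^2) for k predictors."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    s_e = math.sqrt(sse / (n - k - 1))
    r2 = 1 - sse / sst                                     # explained share of SST
    return sse, sst, s_e, r2

y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # hypothetical fitted values
sse, sst, s_e, r2 = fit_measures(y, y_hat, k=1)
print(round(sse, 2), sst, round(s_e, 4), round(r2, 2))
```

Here 60% of the variation in the response is explained by the fitted line, and the remaining 40% is left in the residuals.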

Adjusted $R^2$

- Accounts for the number of explanatory variables in the model.
- Adjusted $R^2 = 1 - (1 - R^2)\left(\dfrac{n - 1}{n - k - 1}\right)$
- Penalizes for adding additional explanatory variables in the model
- Is used to compare linear regressions with different numbers of explanatory variables.
- The higher the adjusted $R^2$, the better the model.
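
The penalty is easy to see numerically. This sketch applies the formula above to two hypothetical models fitted to the same 30 observations; the $R^2$ values are made up for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Model B has a slightly higher R^2 but uses three more explanatory variables
model_a = adjusted_r2(0.72, n=30, k=2)
model_b = adjusted_r2(0.73, n=30, k=5)
print(round(model_a, 4), round(model_b, 4))
```

Although model B has the higher $R^2$, model A has the higher adjusted $R^2$, so the extra variables do not appear to earn their place.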
Inference with regression models
Test of significance

- We can conduct hypothesis tests about the unknown parameters
- Joint test about all of the parameters
- Individual tests about a single parameter
- For the tests to be valid, certain conditions about the model must be met.
- If all of the coefficients equal zero, then all of the explanatory variables drop out.
- If at least one coefficient does not equal zero, then at least one explanatory variable has a linear relationship with the response.
- A test of joint significance is regarded as a test of the overall usefulness of a regression.
We use the ANOVA Significance F as the p-value for the test of joint significance.

A test for a nonzero slope coefficient

- In a test of individual significance, the hypothesized value is typically $\beta_{j0} = 0$, but it could be a nonzero value.

Confidence interval

- If the confidence interval for the slope coefficient contains zero, then the explanatory variable is not significant.
- If the confidence interval does not contain zero, then the explanatory variable is statistically significant.

General test of linear restrictions

- The significance tests can also be referred to as tests of linear restrictions.
- The two-tailed t-test is a test of one linear restriction about a single slope coefficient.
- The F test is a test of $k$ linear restrictions about all the slope coefficients.
- The partial F test is a general test of linear restrictions.
- We can apply the partial F test to any subset of the regression coefficients.
- The test statistic measures how well the regression equation explains the variability in the response variable.
Interval estimates for the response variable

- A prediction interval is always wider than the corresponding confidence interval.

Cheat sheet

Fit of the model

Multiple R (correlation coefficient) $r$

- Measures the strength of the linear relationship between the predictor variables and the response variable
- A multiple R of 1 indicates a perfect linear relationship, while a multiple R of 0 indicates no linear relationship whatsoever.

R-squared (coefficient of determination) $r^2$

- It is the proportion of the variance in the response variable that can be explained by the predictor variables.
- A value of 1 indicates that the response variable can be perfectly explained without error by the predictor variables.
- Ranges from 0 to 1 and can be expressed as a percentage

Adjusted R-squared

- Modified version of R-squared that has been adjusted for the number of predictors in the model
- Never higher than the R-squared
- Useful for comparing regression models with different numbers of variables

Standard Error of the Regression

- The average distance that the observed values fall from the regression line
- The lower, the better
- Ranges from 0 to infinity

Testing

F-statistic

- Indicates whether the regression model provides a better fit to the data than a model that contains no independent variables.
- Tests if the regression model as a whole is useful
- Generally, if none of the predictor variables in the model are statistically significant, the overall F statistic is also not statistically significant.

Significance of F (P-value)

- To see if the overall regression model is significant, you can compare the p-value to a significance level
- If the p-value is less than the significance level, there is sufficient evidence to conclude that the regression model fits the data better than the model with no predictor variables
- In this example, the p-value is 0.033, which is less than the common significance level of 0.05. This indicates that the regression model as a whole is statistically significant.
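
The overall F statistic and its Significance F can be reproduced from $R^2$, $n$, and $k$. The sketch assumes the standard identity $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$, which is not stated in the notes, and uses hypothetical inputs rather than the example with p-value 0.033.

```python
from scipy.stats import f

def overall_f_test(r2, n, k):
    """F statistic, degrees of freedom, and p-value for the joint significance test."""
    df1, df2 = k, n - k - 1
    f_stat = (r2 / df1) / ((1 - r2) / df2)
    p_value = f.sf(f_stat, df1, df2)  # right-tail area, reported as Significance F
    return f_stat, df1, df2, p_value

# Hypothetical model: R^2 = 0.6 with one predictor and five observations
f_stat, df1, df2, p_value = overall_f_test(0.6, n=5, k=1)
significant = p_value < 0.05
print(round(f_stat, 2), df1, df2, round(p_value, 3), significant)
```

With so few observations the p-value stays above 0.05, so this hypothetical model would not be judged significant as a whole even though $R^2$ is fairly high.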
