Correlation and Regression Analysis Guide

The document provides an overview of correlation and regression analysis, focusing on their definitions, applications, and interpretations in marketing strategy. It outlines the objectives of understanding correlation, regression, scatter plots, and the significance of dummy variables and multicollinearity. Additionally, it includes examples of simple and multiple regression models, their statistical outputs, and the importance of testing hypotheses regarding the effects of independent variables.

KU | Correlation and Regression Analysis | Prof. Taewan Kim

Marketing Strategy
3960, 3961

Correlation and Regression Analysis

Introduction
What if we are interested in the relationship between two variables?
If we change one variable (x), does another variable (y) also change?
We could also use one variable (x) to predict another variable (y).
These two types of analysis, correlation and regression, are useful
and powerful tools for using your data to make decisions.

Objectives
1. Define correlation.
2. Define regression.
3. Interpret scatter plots.
4. Explain simple regression.
5. Explain multiple regression.
6. Interpret a simple regression output.
7. Interpret a multiple regression output.
8. Define dummy variable.
9. Define multicollinearity.

Resources
Primary
• Churchill and Brown, Basic Marketing Research, Chapter 20


Correlation and Regression

Correlation: closeness of the relationship between two variables.

“Is x correlated to y?” is the same as “Is y correlated to x?”

Corr(X,Y) = Corr(Y,X)

*Correlation does not mean causation!

Regression: derive an equation that relates the criterion variable to one or more predictor variables.

“Is x useful in predicting y?” (This is not the same as “Is y useful in predicting x?”)

1. Variable of interest: Y

2. Why is Y fluctuating?

a. Scatter plots

b. Pick one Xi from theory or model

c. Fitted model (Ordinary Least Squares): choose the line that minimizes the sum of squared errors

$$\min \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

3. Is the model any good?

How close does the fitted line come to the actual scatter plot?

[Figure labels: TSS, RSS, ESS]

$$0 \le R^2 = \frac{RSS}{TSS} \le 1$$


Bivariate (two-variable) Scatter Plots


Pearson’s Correlation Coefficient

Association between two variables: how one variable changes (from its average) with the change (from its average) in another variable.

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}} = \frac{\mathrm{Cov}(x, y)}{s_x s_y}$$

$$-1 \le r \le +1$$
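
The formula for r can be checked numerically. The following is a minimal sketch in plain Python, assuming made-up advertising and sales figures; the data and the function name `pearson_r` are illustrative and not part of the notes. It computes r from the deviation form and confirms that Corr(X,Y) = Corr(Y,X).

```python
import math

def pearson_r(x, y):
    """Pearson correlation from deviations around the two means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: product of the root sums of squared deviations
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

# Hypothetical data (assumption): advertising spend and sales
adv = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = pearson_r(adv, sales)
r_yx = pearson_r(sales, adv)   # swapping the arguments gives the same value
print(round(r_xy, 4), round(r_yx, 4))
```

Because the numerator and denominator are both symmetric in x and y, the two calls return the same number, which is the symmetry property stated earlier.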


Regression: Assessing the Effects of Marketing Mix Variables

Simple Regression Model

Simple (or bivariate) regression examines the effect of one independent
variable (X) on a dependent variable (Y).

The Population Model:

$$Y = \alpha + \beta X + \varepsilon$$

Where,
Y = Dependent variable (or outcome)
X = Independent variable (input)
α = Intercept; the expected value of Y when X = 0
β = Slope; the change in Y due to a one-unit change in X
ε = Error term; the part of Y not explained by X

β is a measure of the effect of X on Y.

When β = 0, X has no effect on Y.

When β > 0, X has a positive effect on Y.
(e.g., effects of advertising or number of salespersons on sales)
When β < 0, X has a negative effect on Y.
(e.g., effect of a product's own price on its own sales)

The Fitted Model:

Based on data on X and Y, we fit the regression line as:

$$\hat{Y} = a + bX$$

Where,
Ŷ = Predicted (or fitted) value of the dependent variable for a given value of X. The fitted value Ŷ may not match the actual value of Y. The difference is the error, e = Y − Ŷ.

a = estimate of the intercept α, based on the sample data on X and Y

b = estimate of the slope β, based on the sample data on X and Y

a and b are also referred to as the estimated regression parameters (coefficients).
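
As an illustration of how a and b come out of the data, here is a minimal sketch in plain Python. It uses the standard closed-form OLS solution for one predictor (b = Sxy / Sxx, a = Ȳ − b·X̄), which the notes do not state explicitly; the sales-visit data and function names are hypothetical.

```python
def fit_simple_ols(x, y):
    """Return (a, b) that minimize sum((y_i - (a + b*x_i))**2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope estimate
    a = y_bar - b * x_bar    # intercept estimate
    return a, b

def sum_squared_errors(a, b, x, y):
    """Sum of squared errors for the line a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data (assumption): sales visits and sales generated
visits = [2.0, 4.0, 5.0, 7.0, 9.0]
sales = [11.0, 14.0, 18.0, 21.0, 27.0]

a, b = fit_simple_ols(visits, sales)
best = sum_squared_errors(a, b, visits, sales)

# Nudging either estimate away from the OLS solution cannot lower the error
worse_a = sum_squared_errors(a + 0.1, b, visits, sales)
worse_b = sum_squared_errors(a, b + 0.1, visits, sales)
print(best <= worse_a, best <= worse_b)
```

The last two lines show the "least squares" property directly: any other line through the same data has a sum of squared errors at least as large.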


Simple Regression (Continued)

Is the Fitted Regression Model any good?

Note that the analyst picked the independent variable (X) based on his/her conjecture. S/he assumed that this X (e.g., number of sales visits made by a salesperson) might have an effect on Y (e.g., actual sales generated). But s/he might be wrong in this assumption.

The problem is that any data on X and Y will produce a fitted regression line (i.e., OLS estimates a and b): $\hat{Y} = a + bX$

Therefore, two very important questions to ask are:

1. How well does the fitted regression line “fit” (i.e., come close to) the actual data points (i.e., the scatter plot)?

2. Is the estimated slope parameter b statistically different from zero?

Note that when the slope is zero, the fitted line is flat, suggesting that ‘X has no effect on Y’.

1. Fit of a Regression
If the analyst has no explanation for why the dependent variable Y (e.g., sales) varies from observation to observation (e.g., salesperson to salesperson), s/he can only look at each data point $Y_i$ and examine how much it deviates from the average $\bar{Y}$.

The deviation is $(Y_i - \bar{Y})$. Squaring this and adding it up over all observations, we get what we call the Total Sum of Squares (TSS). Thus,

$$TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

Note that $TSS = s^2 (n - 1)$, where $s^2$ is the sample variance of Y.


Simple Regression (Continued)

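2. Partitioning of TSS

A regression partitions TSS into two parts: TSS = RSS + ESS
(1) Regression (or explained) Sum of Squares (RSS) and
(2) Error (or unexplained) Sum of Squares (ESS)

• $RSS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ : the difference between the regression and the mean
• $ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ : the difference between the data and the regression

Note that in these notes RSS denotes the regression sum of squares, not the residual sum of squares; the Excel output labels the ESS row “Residual”.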

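R² is a measure of “fit”: it gives the proportion of the total variation in Y (TSS) that is explained by the regression (RSS), i.e.,

$$R^2 = \frac{RSS}{TSS} = 1 - \frac{ESS}{TSS}, \qquad 0 \le R^2 \le 1$$

In simple regression, the correlation between X and Y satisfies $|r| = \sqrt{R^2}$, with r taking the sign of the slope b.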
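
The partition TSS = RSS + ESS can be verified on any data set. The sketch below, in plain Python with hypothetical data, fits the line with the standard closed-form OLS solution (an assumption, since the notes leave the fitting to Excel) and then computes the three sums of squares using the notes' naming (RSS = regression, ESS = error).

```python
def sums_of_squares(x, y):
    """Return (TSS, RSS, ESS) for a simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]                       # fitted values
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total
    rss = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression (explained)
    ess = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error (unexplained)
    return tss, rss, ess

# Hypothetical data (assumption)
visits = [2.0, 4.0, 5.0, 7.0, 9.0]
sales = [11.0, 14.0, 18.0, 21.0, 27.0]

tss, rss, ess = sums_of_squares(visits, sales)
r_squared = rss / tss
print(abs(tss - (rss + ess)) < 1e-9, round(r_squared, 3))
```

Both expressions for R², RSS/TSS and 1 − ESS/TSS, give the same value precisely because the partition holds.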

3. Test for the Slope Parameter (b)

Since we are using a sample to generate data for X and Y, the estimates a and b are not exact values of α and β; each has a confidence interval around it, whose width depends on its standard error (SE).

Therefore, we need to test whether the slope parameter is statistically different from zero.

$$H_0: \beta = 0$$
$$H_a: \beta \ne 0$$

Just as in the case of a one-population mean test, we compute the test statistic as follows:

$$t = \frac{b - \beta}{SE(b)} = \frac{b}{SE(b)} \quad \text{under } H_0: \beta = 0$$

The computer output prints the t statistic and the corresponding p-value.

If p < 0.05 (i.e., at the 95% confidence level), then reject $H_0$ and conclude that β ≠ 0, which means that X has a statistically significant effect on Y.
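
To make the test statistic concrete, the following sketch computes b, SE(b), and t from raw data in plain Python. The standard error formula used here, SE(b) = sqrt[(ESS / (n − 2)) / Σ(x_i − x̄)²], is the usual textbook expression for simple regression and is an assumption in the sense that the notes do not spell it out; the data are made up.

```python
import math

def slope_t_statistic(x, y):
    """Return (b, se_b, t) for H0: beta = 0 in a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Error sum of squares around the fitted line
    ess = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = ess / (n - 2)            # error variance, n - 2 degrees of freedom
    se_b = math.sqrt(s2 / sxx)    # standard error of the slope
    return b, se_b, b / se_b

# Hypothetical data (assumption)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b, se_b, t = slope_t_statistic(x, y)
print(round(b, 3), round(se_b, 3), round(t, 3))
```

Turning t into a p-value requires the t distribution with n − 2 degrees of freedom, which Excel or any statistical package supplies; this sketch stops at the statistic itself.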


Simple Regression: Excel Output

SUMMARY OUTPUT

Regression Statistics
Multiple R             0.880
R Square               0.775
Adjusted R Square      0.769
Standard Error        59.560
Observations          40.000

ANOVA
              df       SS            MS            F         Significance F
Regression     1.000   463451.009    463451.009    130.644   0.000
Residual      38.000   134802.015      3547.421
Total         39.000   598253.024

              Coefficients   Standard Error   t Stat    P-value
Intercept     135.434        25.907            5.228    0.000
ADV            25.308         2.214           11.430    0.000
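
The pieces of this output are linked by the formulas in the preceding sections. As a hedged cross-check, the sketch below recomputes several printed statistics from the ANOVA sums of squares and the coefficient table. The input numbers are copied from the output above; the variable names are illustrative.

```python
import math

# Values copied from the Excel output above
ss_regression = 463451.009   # RSS in the notes' naming
ss_residual = 134802.015     # ESS in the notes' naming
ss_total = 598253.024        # TSS
n, k = 40, 1                 # observations, independent variables

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

ms_residual = ss_residual / (n - k - 1)
standard_error = math.sqrt(ms_residual)      # "Standard Error" of the regression
f_stat = (ss_regression / k) / ms_residual   # F in the ANOVA table

t_adv = 25.308 / 2.214                       # coefficient / standard error

print(round(r_squared, 3), round(multiple_r, 3))
print(round(standard_error, 2), round(f_stat, 2), round(t_adv, 2))
```

With a single predictor, the F statistic equals the square of the slope's t statistic, up to the rounding in the printed coefficient and standard error.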


Multiple Regression

Multiple (or multivariate) regression:

One dependent variable and more than one independent variable

The Population Model (with k independent variables):

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$$

Using data on Y and $X_1, X_2, \dots, X_k$, one can fit a model:

The Fitted Model:

$$\hat{Y} = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$

1. Fit of the Model:

As in simple regression, the fit of a multiple regression model is indicated by R² (i.e., the proportion of the variation in the dependent variable (Y) that can be explained by the independent variables (all the X’s)). Again: $0 \le R^2 \le 1$

Unlike simple regression, there is a second issue here. If an analyst keeps on adding more and more independent variables (X’s) to explain the variation in the dependent variable (Y), then s/he can keep on increasing R², because R² never decreases when a variable is added. The question is: at what cost?

The more variables you include in the model, the more data you need on these variables. There is a cost for additional information. Also, it is not fair to compare the R²’s of two models, one with fewer (say 2) independent variables and another with a greater number (say 3 or 4) of independent variables. When you encounter this type of situation, you should look at the adjusted R², which penalizes the model with more independent variables.

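$$\text{Adjusted } R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1} = R^2\,\frac{n-1}{n-k-1} - \frac{k}{n-k-1}$$

where n is the number of observations and k is the number of independent variables.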
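
The adjusted R² formula is easy to evaluate by hand or in code. The sketch below, in plain Python, implements both forms of the expression and applies them to the rounded R² values reported in the two Excel outputs in these notes (0.775 with k = 1 and 0.881 with k = 3, both with n = 40). The function names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def adjusted_r2_expanded(r2, n, k):
    """Algebraically equivalent form: R^2 * (n-1)/(n-k-1) - k/(n-k-1)."""
    return r2 * (n - 1) / (n - k - 1) - k / (n - k - 1)

# R-squared values as printed in the notes' Excel outputs (n = 40)
simple_model = adjusted_r2(0.775, n=40, k=1)
multiple_model = adjusted_r2(0.881, n=40, k=3)
print(round(simple_model, 3), round(multiple_model, 3))

# Penalty in action: same R-squared, one extra variable, lower adjusted value
print(adjusted_r2(0.775, 40, 2) < adjusted_r2(0.775, 40, 1))
```

The last line shows why adjusted R² is the fairer yardstick: an added variable must raise R² by enough to offset the lost degree of freedom.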


Multiple Regression (continued)

2. Effects of the Independent Variables

A second issue that multiple regression models have to contend with is the relative effects of the independent (X) variables. That is, which X’s have a stronger impact on Y? Which X’s have a weaker impact or no impact on Y? To address this issue, we follow two steps.

Step 1: F-test
We start with a very modest question:
Does any one of the X’s have an impact on Y?
Stated differently: is at least one of the β’s ≠ 0? Thus,

$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$$
$$H_a: \text{At least one } \beta_i \ne 0$$

The test of this hypothesis is carried out through an F-test:

$$F = \frac{RSS / k}{ESS / (n - k - 1)}$$

Look up the ANOVA table in the regression output. If the value of F is “large” and the corresponding p-value is small (< 0.05), then one can reject $H_0$ and conclude that at least one of the X’s has a statistically significant impact on Y. But the F-test does not tell us which variable(s). For this, we have to go to the next step.

Step 2: t-test
In this step, we answer the question:
Which of the independent (X) variables has/have a statistically significant effect on the dependent (Y) variable? i.e., which β’s ≠ 0? One has to test a separate hypothesis for each β associated with each independent variable.

$$H_0: \beta_1 = 0, \quad H_0: \beta_2 = 0, \quad \dots, \quad H_0: \beta_k = 0$$
$$H_a: \beta_1 \ne 0, \quad H_a: \beta_2 \ne 0, \quad \dots, \quad H_a: \beta_k \ne 0$$

To test these hypotheses, look at the lower part of the regression output. For each independent variable (X), there is a corresponding estimate b of the parameter β, its standard error (SE) reflecting the variation around it, the corresponding t-value, and finally, the p-value.

The t is computed as follows:

$$t_i = \frac{b_i - \beta_i}{SE(b_i)} = \frac{b_i}{SE(b_i)} \quad \text{under } H_0: \beta_i = 0$$

If the corresponding p-value is < 0.05 (i.e., at the 95% confidence level), then reject $H_0$ and conclude that $X_i$ has a significant effect on Y. That effect can be either positive or negative.


Multiple Regression: Excel Output

SUMMARY OUTPUT

Regression Statistics
Multiple R             0.939
R Square               0.881
Adjusted R Square      0.871
Standard Error        44.423
Observations          40.000

ANOVA
              df       SS            MS            F        Significance F
Regression     3.000   527209.081    175736.360    89.051   0.000
Residual      36.000    71043.943      1973.443
Total         39.000   598253.024

              Coefficients   Standard Error   t Stat   P-value
Intercept      31.150        34.175           0.911    0.368
ADV            12.968         2.737           4.738    0.000
SalesRep       41.246         7.280           5.666    0.000
Wholesale      11.524         7.691           1.498    0.143
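
The two-step procedure (F-test, then t-tests) can be traced through this output. The sketch below, in plain Python, recomputes F from the ANOVA sums of squares and each t statistic from the coefficient table, and then applies the p < 0.05 rule using the p-values as printed. All numbers are copied from the output above; the variable names are illustrative.

```python
# Step 1: F-test, using the ANOVA rows above
rss = 527209.081      # Regression SS
ess = 71043.943       # Residual SS
n, k = 40, 3
f_stat = (rss / k) / (ess / (n - k - 1))

# Step 2: t-tests, (coefficient, standard error, printed p-value) per variable
rows = {
    "ADV": (12.968, 2.737, 0.000),
    "SalesRep": (41.246, 7.280, 0.000),
    "Wholesale": (11.524, 7.691, 0.143),
}
t_stats = {name: coef / se for name, (coef, se, _) in rows.items()}

# Decision rule from the notes: reject H0 when p < 0.05
significant = [name for name, (_, _, p) in rows.items() if p < 0.05]

print(round(f_stat, 2))
print({name: round(t, 2) for name, t in t_stats.items()})
print(significant)
```

Reading the result: the F-test says at least one variable matters, and the t-tests indicate that ADV and SalesRep are significant at the 5% level while Wholesale is not.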


Dummy Variables

Let’s consider the role of qualitative independent variables in regression analysis.

In regression analysis the dependent variable is frequently influenced not only by variables that can be readily quantified on some well-defined scale (e.g., income, sales, prices, promotions), but also by variables that are essentially qualitative in nature (e.g., gender, race, class, residence).

As an example, consider the following model:

$$Y = \alpha + \beta D + \varepsilon$$

Where Y = annual salary of a college professor

D = 1 if male college professor
  = 0 otherwise (i.e., female professor)

From the model we obtain:

• Mean salary of female college professor: $E(Y \mid D = 0) = \alpha$
• Mean salary of male college professor: $E(Y \mid D = 1) = \alpha + \beta$

A test of the null hypothesis that there is no sex discrimination ($H_0: \beta = 0$) can easily be made by running the regression in the usual manner and checking, on the basis of the t-test, whether the estimated β is statistically significant. See the section “Dummy Variables: Excel Output”.

Let us modify the above model by adding one quantitative variable:

$$Y = \alpha_1 + \alpha_2 D + \beta X + \varepsilon$$

Where Y = annual salary of a college professor

X = years of teaching experience
D = 1 if male college professor
  = 0 otherwise (i.e., female professor)

From the model we obtain:

• Mean salary of female college professor: $E(Y \mid X, D = 0) = \alpha_1 + \beta X$
• Mean salary of male college professor: $E(Y \mid X, D = 1) = (\alpha_1 + \alpha_2) + \beta X$

The two groups share the same slope β but have different intercepts.


Dummy Variables: Excel Output

Salary   Sex (1=male, 0=female)
22.0     1
19.0     0
18.0     0
21.7     1
18.5     0
21.0     1
20.5     1
17.0     0
17.5     0
21.2     1

[Chart: salary for observations 1 to 10; vertical axis from 0 to 25]

SUMMARY OUTPUT

Regression Statistics
Multiple R             0.935
R Square               0.874
Adjusted R Square      0.858
Standard Error         0.697
Observations          10.000

ANOVA
              df      SS       MS       F        Significance F
Regression    1.000   26.896   26.896   55.342   0.000
Residual      8.000    3.888    0.486
Total         9.000   30.784

              Coefficients   Standard Error   t Stat    P-value
Intercept     18.000         0.312            57.735    0.000
sex            3.280         0.441             7.439    0.000
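
The interpretation E(Y|D=0) = α and E(Y|D=1) = α + β can be confirmed with the ten observations listed in this section. The sketch below, in plain Python, computes the two group means directly and then fits the regression of salary on the dummy using the standard closed-form OLS solution (an assumption about how the fit is computed; the notes use Excel). The data are the ten salary and sex values from this section.

```python
# Data from the notes: salary and sex (1 = male, 0 = female)
salary = [22.0, 19.0, 18.0, 21.7, 18.5, 21.0, 20.5, 17.0, 17.5, 21.2]
male = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# Group means computed directly
mean_female = sum(s for s, d in zip(salary, male) if d == 0) / male.count(0)
mean_male = sum(s for s, d in zip(salary, male) if d == 1) / male.count(1)

# Simple OLS of salary on the dummy variable
n = len(salary)
d_bar = sum(male) / n
y_bar = sum(salary) / n
sxy = sum((d - d_bar) * (y - y_bar) for d, y in zip(male, salary))
sxx = sum((d - d_bar) ** 2 for d in male)
b = sxy / sxx          # estimated difference between the group means
a = y_bar - b * d_bar  # estimated mean of the D = 0 group

print(round(a, 3), round(b, 3))
print(round(mean_female, 3), round(mean_male, 3))
```

The intercept reproduces the female mean and the dummy coefficient reproduces the male-minus-female difference, matching the Intercept and sex rows of the Excel output.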


Regression with Interaction Term

Let’s consider a model with an interaction term. The following model has, as independent variables, an interval-scaled variable and the product of a dummy variable and an interval-scaled variable. Consider:

$$Y = \alpha_1 + \beta_1 X + \beta_2 (D \cdot X) + \varepsilon$$

Where Y = annual salary of a college professor

X = years of teaching experience
D = 1 if male college professor
  = 0 otherwise (i.e., female professor)

From the model we obtain:

• D = 0: $E(Y \mid X) = \alpha_1 + \beta_1 X$
• D = 1: $E(Y \mid X) = \alpha_1 + (\beta_1 + \beta_2) X$

The two groups share the same intercept but have different slopes.

Now consider a model that includes both the dummy variable and the interaction term. It has, as independent variables, a dummy variable, an interval-scaled variable, and the product of the dummy variable and the interval-scaled variable. Consider:

$$Y = \alpha_1 + \alpha_2 D + \beta_1 X + \beta_2 (D \cdot X) + \varepsilon$$

Where Y = annual salary of a college professor

X = years of teaching experience
D = 1 if male college professor
  = 0 otherwise (i.e., female professor)

From the model we obtain:

• D = 0: $E(Y \mid X) = \alpha_1 + \beta_1 X$
• D = 1: $E(Y \mid X) = (\alpha_1 + \alpha_2) + (\beta_1 + \beta_2) X$
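
A short sketch can make the two group-specific lines explicit. The coefficient values below are purely hypothetical (the notes give no estimates for the interaction model), and `group_line` and `predict` are illustrative names. The code returns the intercept and slope implied for each group and checks them against a direct evaluation of the full model.

```python
def group_line(alpha1, alpha2, beta1, beta2, d):
    """Return (intercept, slope) of the salary-experience line for group d."""
    return alpha1 + alpha2 * d, beta1 + beta2 * d

def predict(alpha1, alpha2, beta1, beta2, d, x):
    """Expected Y from the full model with dummy and interaction terms."""
    return alpha1 + alpha2 * d + beta1 * x + beta2 * (d * x)

# Hypothetical coefficients (assumption, not estimates from the notes)
coefs = dict(alpha1=17.0, alpha2=2.0, beta1=0.5, beta2=0.25)

female_line = group_line(d=0, **coefs)   # (alpha1, beta1)
male_line = group_line(d=1, **coefs)     # (alpha1 + alpha2, beta1 + beta2)
print(female_line, male_line)

# The group line and the full model agree at any experience level x
x = 10
print(predict(d=1, x=x, **coefs) == male_line[0] + male_line[1] * x)
```

Setting `alpha2` to zero in this sketch reproduces the first interaction model, in which the groups differ only in slope.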


Multicollinearity

Definition: A condition said to be present in a multiple regression analysis when the independent variables are highly correlated among themselves.

Interpretation of the multiple regression equation depends implicitly on the assumption that the predictor variables are not strongly interrelated. It is usual to interpret a regression coefficient as measuring the change in the dependent variable when the independent variable is increased by one unit and all other predictor variables are held constant. At higher degrees of multicollinearity, the coefficients for individual independent variables become unstable and, as a result, cannot be interpreted effectively:

• The estimators (b’s) have larger variances (and standard errors).
• Confidence intervals get wider.
• Because the variance gets larger, the t-statistics tend to be insignificant.
• Even though the t-statistics are low, R² can be very high.
• The OLS estimates (b’s) and their standard errors, s.e.(b), can be sensitive to small changes in the data.

How to detect this problem

• Variance Inflation Factor:

$$VIF_j = \frac{1}{1 - R_j^2}$$

where j = 1, 2, …, k (k is the number of independent variables) and $R_j^2$ denotes the R² from regressing $X_j$ against all the other X’s. For example, when k = 3, $R_1^2$ is the R² from the regression of $X_1$ against $X_2$ and $X_3$.
As a rule of thumb, if VIF > 10, then there is a multicollinearity problem.
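
As a rough illustration of the VIF rule, the sketch below works in plain Python with the simplest case of two predictors, where the R² from regressing X1 on X2 equals the squared correlation between them. The data are hypothetical, chosen so that X1 is close to 0.3·X2, and the function names are illustrative.

```python
def vif(r2_j):
    """Variance Inflation Factor from the auxiliary R-squared of X_j."""
    return 1.0 / (1.0 - r2_j)

def squared_correlation(u, v):
    """Squared Pearson correlation; with two predictors this is R_j^2."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    suu = sum((ui - u_bar) ** 2 for ui in u)
    svv = sum((vi - v_bar) ** 2 for vi in v)
    return suv ** 2 / (suu * svv)

# Hypothetical predictors (assumption): x1 is roughly 0.3 * x2
x2 = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
x1 = [3.1, 5.9, 9.2, 11.8, 15.1, 18.0]

r2_1 = squared_correlation(x1, x2)
vif_1 = vif(r2_1)
print(vif_1 > 10)   # flags a multicollinearity problem under the VIF > 10 rule
```

The threshold VIF > 10 corresponds to an auxiliary R² above 0.9, and a predictor that is unrelated to the others has a VIF of 1. With more than two predictors, each auxiliary R² would come from a multiple regression, which this sketch does not implement.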

How to solve this problem

• Using a priori information. For example, with k = 3:

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$$

Suppose there is a historical relationship between $X_1$ and $X_2$ that is approximated as $X_1 = 0.3 X_2$. Replacing $X_1$ in the original model,

$$Y = \alpha + (0.3\beta_1 + \beta_2) X_2 + \beta_3 X_3 + \varepsilon$$

By dropping the variable $X_1$ from the equation, we are able to obtain accurate estimates of $(0.3\beta_1 + \beta_2)$ and $\beta_3$. Multicollinearity is no longer present.
