Module 4

Module IV covers correlation and regression analysis, defining correlation as the degree and direction of the relationship between two variables, with types categorized by direction, degree, and form. It discusses methods for studying correlation, including scatter diagrams and Pearson's coefficient, and highlights the limitations of correlation, such as not implying causation. The module also explains regression analysis, its types, equations, and the relationship between correlation and regression, emphasizing their applications in business forecasting and analysis.

MODULE 4

Module IV: Correlation & Regression Analysis

1. Correlation

Definition:

Correlation measures the degree and direction of relationship between two variables.

Example:

• Income ↑ → Expenditure ↑ (positive relation)

• Price ↑ → Demand ↓ (negative relation)

2. Types of Correlation

(A) Based on Direction

1. Positive Correlation

• Both variables move in same direction

• Example: Income & savings

2. Negative Correlation

• Variables move in opposite direction

• Example: Price & demand

(B) Based on Degree

• Perfect correlation (r = +1 or −1)

• High correlation

• Low correlation

• Zero correlation (r = 0)

(C) Based on Form


• Linear correlation (straight line)

• Non-linear correlation (curved)

3. Methods of Studying Correlation

1. Scatter Diagram

• Graphical method

• Points plotted on graph

Pattern shows type of correlation

2. Karl Pearson’s Coefficient of Correlation

Measures strength of linear relationship

Formula:

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$

Interpretation of r:

Value of r | Meaning
+1 | Perfect positive
−1 | Perfect negative
0 | No correlation
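
The deviation formula above can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the sample data are assumptions made for this example, not part of the module.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # X - X-bar
    dy = [y - mean_y for y in ys]          # Y - Y-bar
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when either series is constant")
    return numerator / denominator

# Hypothetical data: Y is exactly twice X, so the points lie on a rising line.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

A result close to +1 or −1 indicates a strong linear relationship; a result near 0 indicates little or no linear relationship.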

3. Spearman’s Rank Correlation

Used when data is in ranks

Formula:
$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Where:

• d = difference in ranks

• n = number of observations
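
As a minimal sketch, the rank formula can be coded as follows. The function name `spearman_rank_r` is hypothetical, ranks are assigned in ascending order (smallest value gets rank 1), and the sketch assumes no tied values, since the formula above does not include a tie correction.

```python
def spearman_rank_r(xs, ys):
    """Spearman's rank correlation, assuming no ties in either series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    if len(set(xs)) != n or len(set(ys)) != n:
        raise ValueError("tied values need a tie-corrected formula")

    def ranks(values):
        ordered = sorted(values)
        # Rank 1 goes to the smallest value (an assumption of this sketch).
        return [ordered.index(v) + 1 for v in values]

    # d = difference between the two ranks of each observation
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical data in which both series rise together.
print(spearman_rank_r([10, 20, 30, 40], [15, 25, 35, 45]))
```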

4. Properties of Correlation

• Value lies between −1 and +1

• Independent of origin and scale

• Shows direction and strength

5. Limitations of Correlation

• Does NOT imply causation

• Only measures linear relationship

• Affected by extreme values

6. Regression Analysis

Definition:

Regression analysis studies the functional relationship between variables and helps in
prediction.

Used to estimate one variable based on another

7. Types of Regression

1. Simple Regression

• One independent variable


2. Multiple Regression

• More than one independent variable

8. Regression Lines

There are two regression lines:

1. Regression of Y on X

2. Regression of X on Y

Equations:

Regression of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

Regression of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

9. Regression Coefficients

$$b_{yx} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X^2}$$

$$b_{xy} = \frac{\operatorname{Cov}(X, Y)}{\sigma_Y^2}$$

10. Relationship Between Correlation and Regression

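$$r = \pm\sqrt{b_{xy} \cdot b_{yx}}$$

r takes the common sign of the two regression coefficients: positive if both are positive, negative if both are negative.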
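
The link between the covariance-based coefficients and r can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and population (divide-by-N) variance and covariance are used throughout.

```python
from math import sqrt, copysign

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) using Cov(X, Y) divided by the relevant variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / var_x, cov / var_y

def r_from_coefficients(b_yx, b_xy):
    """r is the square root of the product, carrying the coefficients' sign."""
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    return copysign(sqrt(product), b_yx)

# Hypothetical data set.
b_yx, b_xy = regression_coefficients([1, 2, 3], [2, 3, 5])
print(b_yx, b_xy, r_from_coefficients(b_yx, b_xy))
```

Passing two negative coefficients, for example `r_from_coefficients(-0.5, -0.8)`, returns a negative r.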

11. Properties of Regression

• Two regression lines intersect at mean point

• Regression coefficients are independent of origin but not scale

• Useful for prediction

12. Difference Between Correlation and Regression

Correlation | Regression
Measures relationship | Predicts relationship
Symmetrical | Asymmetrical
No cause-effect | Shows dependency
Single value (r) | Two equations

13. Applications in Business

• Sales forecasting

• Demand estimation

• Cost analysis

• Market research

Module IV – Additional Theory (Advanced Concepts)

1. Scatter Diagram (Detailed Understanding)

A scatter diagram is the simplest way to study correlation.

Interpretation:
• Points close to a straight line → High correlation

• Points widely scattered → Low correlation

• Upward trend → Positive correlation

• Downward trend → Negative correlation

No formula required — purely visual method

2. Probable Error (P.E.) of Correlation

Used to test reliability of correlation coefficient

Formula:

$$P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$

Interpretation:

• If |r| < P.E. → Not significant

• If |r| > 6 × P.E. → Highly significant

Frequently asked theory/numerical question
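
A small sketch of the probable error test follows. The function names are hypothetical, the absolute value of r is compared so that negative correlations are handled, and the label "inconclusive" for values between P.E. and 6 × P.E. is an assumption of this example rather than something stated in the notes.

```python
from math import sqrt

def probable_error(r, n):
    """P.E. = 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

def interpret(r, n):
    """Apply the two rules of thumb for judging r against its probable error."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "highly significant"
    return "inconclusive"  # assumed label for the in-between case

# Hypothetical inputs: r = 0.8 from 16 observations.
print(probable_error(0.8, 16), interpret(0.8, 16))
```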

3. Coefficient of Determination (r²)

r² = (correlation coefficient)²

Meaning:

• Shows the proportion of variation in one variable that is explained by the other

Example:

• If r = 0.8, then r² = 0.64, so 64% of the variation is explained


4. Coefficient of Non-Determination

1 − r²

• Shows the proportion of variation left unexplained

5. Concurrent Deviations Method

• Used when exact data not available

• Based on direction of change (+/−)

Simple but less accurate method
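
The notes do not give a formula for this method. The sketch below uses the standard textbook coefficient of concurrent deviations, r_c = ±√(±(2c − m)/m), where m is the number of deviation pairs (one fewer than the number of observations) and c is the number of pairs in which both series change in the same direction. The function name is hypothetical, and counting "no change in both series" as concurrent is an assumption of this example.

```python
from math import sqrt, copysign

def concurrent_deviation_r(xs, ys):
    """Coefficient of concurrent deviations from the direction of change only."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    def directions(values):
        # +1 for a rise, -1 for a fall, 0 for no change from the previous value
        return [(b > a) - (b < a) for a, b in zip(values, values[1:])]

    dir_x, dir_y = directions(xs), directions(ys)
    m = len(dir_x)                                   # number of deviation pairs
    c = sum(1 for a, b in zip(dir_x, dir_y) if a == b)  # concurrent deviations
    value = (2 * c - m) / m
    # The same sign is applied inside and outside the square root.
    return copysign(sqrt(abs(value)), value)

# Hypothetical series: three of the four changes move in the same direction.
print(concurrent_deviation_r([10, 12, 11, 15, 14], [5, 7, 6, 5, 4]))
```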

6. Assumptions of Correlation

• Relationship is linear

• Variables are quantitative

• Data is homogeneous

• No extreme outliers

7. Limitations of Correlation (Advanced)

• Cannot establish cause-effect relationship

• Misleading in case of spurious correlation

• Only measures linear relationships

• Affected by extreme values

8. Spurious Correlation

• False or meaningless relationship

Example:
• Ice cream sales & drowning cases
(Both increase in summer but not directly related)

9. Regression (Deep Concept)

Regression shows:

• Dependence of one variable on another

• Used for forecasting and estimation

10. Regression Lines (Important Theory)

• Two lines:

o Y on X

o X on Y

Key Points:

• Both lines intersect at the mean point (X̄, Ȳ)

• If correlation is perfect (r = ±1) → both lines coincide

• If correlation is zero (r = 0) → lines are perpendicular, each parallel to one axis

11. Regression Coefficients (Properties)

• Both coefficients have same sign

• If one exceeds 1 in magnitude → the other is below 1 in magnitude (their product is r², which cannot exceed 1)

• Independent of origin but not scale

12. Angle Between Regression Lines

• Smaller angle → higher correlation

• Larger angle → lower correlation

Concept-based question
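
As a rough guide to why the angle tracks the strength of correlation: plotted on the same axes, the line of Y on X has slope r·σ_Y/σ_X and the line of X on Y has slope σ_Y/(r·σ_X). Applying the standard formula for the angle θ between two lines (an addition here, not part of the notes above) gives, for r ≠ 0:

$$\tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_X \sigma_Y}{\sigma_X^2 + \sigma_Y^2}$$

When r = ±1 the numerator vanishes, so θ = 0 and the lines coincide. As r approaches 0 the right-hand side grows without bound, so θ approaches 90°.
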
13. Difference Between r and r²

r | r²
Measures direction & strength | Measures proportion explained
Can be negative | Never negative
Range: −1 to +1 | Range: 0 to 1

14. Uses of Correlation & Regression

• Business forecasting

• Demand estimation

• Price analysis

• Financial planning

• Economic research

15. Practical Interpretation

High correlation:

• Strong relationship

Low correlation:

• Weak relationship

Zero correlation:

• No relationship

Module IV – Formula Sheet

1. Karl Pearson’s Correlation Coefficient


Direct Formula

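$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \cdot \sum (Y - \bar{Y})^2}}$$

Shortcut Formula

$$r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$$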

2. Spearman’s Rank Correlation

$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

Where:

• d = difference in ranks

• n = number of observations

3. Probable Error (P.E.)

$$P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$

4. Coefficient of Determination

$$r^2$$

5. Coefficient of Non-Determination

$$1 - r^2$$

6. Regression Equations

Regression of Y on X

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

Regression of X on Y

$$X - \bar{X} = b_{xy}(Y - \bar{Y})$$

7. Regression Coefficients

$$b_{yx} = r \cdot \frac{\sigma_Y}{\sigma_X}$$

$$b_{xy} = r \cdot \frac{\sigma_X}{\sigma_Y}$$

8. Relationship Between r and Regression Coefficients

$$r = \pm\sqrt{b_{xy} \cdot b_{yx}}$$

(r takes the sign of the regression coefficients)

9. Covariance (Basic Idea)

$$\operatorname{Cov}(X, Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N}$$

Solved Sums – Module IV

1. Karl Pearson’s Correlation

Q1. Find correlation coefficient

Data:

X | 1 | 2 | 3 | 4 | 5
Y | 2 | 4 | 6 | 8 | 10

Step 1: Observe relationship

Y = 2X → perfectly linear

So,

𝑟 = +1

Answer: +1 (Perfect Positive Correlation)

2. Shortcut Formula (Numerical)

Q2. Find r

X | 1 | 2 | 3
Y | 2 | 3 | 5

Step 1: Calculate values

ΣX = 6, ΣY = 10
ΣXY = (1 × 2) + (2 × 3) + (3 × 5) = 2 + 6 + 15 = 23
ΣX² = 14, ΣY² = 38

Step 2: Apply formula

$$r = \frac{3(23) - (6)(10)}{\sqrt{[3(14) - 6^2][3(38) - 10^2]}} = \frac{69 - 60}{\sqrt{[42 - 36][114 - 100]}} = \frac{9}{\sqrt{6 \times 14}} = \frac{9}{\sqrt{84}} \approx 0.98$$

Answer: 0.98 (High Positive Correlation)
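
The arithmetic in Q2 can be double-checked with a short script that applies the shortcut formula to raw sums. The function name `pearson_r_shortcut` is an assumption made for this example.

```python
from math import sqrt

def pearson_r_shortcut(xs, ys):
    """Pearson's r from raw sums (shortcut formula), no deviations needed."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from Q2: X = 1, 2, 3 and Y = 2, 3, 5.
r = pearson_r_shortcut([1, 2, 3], [2, 3, 5])
print(round(r, 2))  # 0.98
```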

3. Spearman’s Rank Correlation


Q3. Find rank correlation

X 10 20 30 40

Y 15 25 35 45

Step 1: Assign ranks

Both increasing → same ranks

X | Rank of X | Y | Rank of Y | d
10 | 1 | 15 | 1 | 0
20 | 2 | 25 | 2 | 0
30 | 3 | 35 | 3 | 0
40 | 4 | 45 | 4 | 0

Σd² = 0

Step 2: Formula

$$r = 1 - \frac{6(0)}{4(16 - 1)} = 1$$

Answer: 1 (Perfect Positive Correlation)

4. Probable Error

Q4. Find P.E.

Given: r = 0.8, n = 16

$$P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - 0.64}{4} = 0.6745 \times \frac{0.36}{4} = 0.6745 \times 0.09 \approx 0.061$$

Answer: 0.061

5. Regression Coefficient

Q5. Find b_yx

Given:
r = 0.6, σ_X = 2, σ_Y = 3

$$b_{yx} = r \times \frac{\sigma_Y}{\sigma_X} = 0.6 \times \frac{3}{2} = 0.6 \times 1.5 = 0.9$$

Answer: 0.9

6. Regression Equation

Q6. Find regression line of Y on X

Given:
X̄ = 10, Ȳ = 20, b_yx = 2

Step 1: Formula

$$Y - \bar{Y} = b_{yx}(X - \bar{X})$$

Step 2: Substitute

Y − 20 = 2(X − 10)
Y − 20 = 2X − 20
Y = 2X

Answer: 𝑌 = 2𝑋
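
Once the line is known it can be used for prediction. The sketch below rearranges the Y-on-X equation into slope and intercept form; the function name and the value X = 15 used for the forecast are assumptions chosen for illustration.

```python
def regression_line_y_on_x(mean_x, mean_y, b_yx):
    """Rewrite Y - mean_y = b_yx * (X - mean_x) as Y = slope * X + intercept."""
    slope = b_yx
    intercept = mean_y - b_yx * mean_x
    return slope, intercept

# Values from Q6: mean of X = 10, mean of Y = 20, b_yx = 2.
slope, intercept = regression_line_y_on_x(10, 20, 2)

def predict(x):
    """Estimate Y for a given X using the fitted line."""
    return slope * x + intercept

print(slope, intercept, predict(15))  # 2 0 30
```

The line always passes through the mean point, so `predict(10)` returns 20 here.
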
7. Find Correlation from Regression Coefficients

Q7. Given:

b_xy = 0.5, b_yx = 0.8

$$r = \sqrt{0.5 \times 0.8} = \sqrt{0.4} \approx 0.63$$

(r is positive because both regression coefficients are positive)

Answer: 0.63

8. Coefficient of Determination

Q8. If r = 0.7, find r²

r² = (0.7)² = 0.49

49% variation explained

Answer: 0.49

MCQs – Module IV (with Answers)

Basics of Correlation

1. Correlation measures:
A. Average
B. Relationship between variables
C. Frequency
D. Dispersion
Answer: B

2. Correlation coefficient (r) lies between:


A. 0 and 1
B. −1 and +1
C. 0 and ∞
D. −∞ and +∞
Answer: B

3. If r = +1, it indicates:
A. No correlation
B. Perfect positive correlation
C. Perfect negative correlation
D. Weak correlation
Answer: B

4. If r = 0, it means:
A. Perfect correlation
B. No correlation
C. Negative correlation
D. High correlation
Answer: B

Types of Correlation

5. When variables move in opposite direction:


A. Positive correlation
B. Negative correlation
C. Zero correlation
D. Linear correlation
Answer: B

6. Income and expenditure usually have:


A. Negative correlation
B. Positive correlation
C. No correlation
D. Zero correlation
Answer: B
Methods of Correlation

7. Scatter diagram is a:
A. Mathematical method
B. Graphical method
C. Statistical test
D. Algebraic method
Answer: B

8. Spearman’s rank correlation is used for:


A. Raw data
B. Grouped data
C. Ranked data
D. Continuous data
Answer: C

Karl Pearson

9. Karl Pearson’s coefficient measures:


A. Non-linear relationship
B. Linear relationship
C. Random data
D. Frequency
Answer: B

10. Correlation does NOT imply:


A. Relationship
B. Association
C. Causation
D. Direction
Answer: C

Regression
11. Regression is used for:
A. Description
B. Prediction
C. Classification
D. Tabulation
Answer: B

12. Number of regression lines:


A. 1
B. 2
C. 3
D. 4
Answer: B

13. Regression lines intersect at:


A. Origin
B. Median
C. Mean point
D. Mode
Answer: C

Regression Coefficients

14. Regression coefficients are:


A. Always negative
B. Always positive
C. Same sign
D. Opposite sign
Answer: C

15. If one regression coefficient is greater than 1:


A. Other is also >1
B. Other is <1
C. Both are equal
D. None
Answer: B

Relationship

16. Relationship between r and regression coefficients:


A. r = b₁ + b₂
B. r = √(b₁ × b₂)
C. r = b₁ − b₂
D. r = b₁ / b₂
Answer: B

Probable Error

17. Probable error is used to:


A. Calculate mean
B. Test reliability of r
C. Find median
D. Find mode
Answer: B

Determination

18. r² represents:
A. Mean
B. Variance
C. Explained variation
D. Total frequency
Answer: C

Concepts

19. Spurious correlation means:


A. Strong relation
B. False relation
C. Negative relation
D. Perfect relation
Answer: B

20. Correlation is independent of:


A. Scale and origin
B. Mean
C. Variables
D. Data
Answer: A

Numerical-Type

21. If r = 0.6, then r² =


A. 0.12
B. 0.36
C. 0.6
D. 1
Answer: B

22. If r = −1, relationship is:


A. Perfect positive
B. Perfect negative
C. Zero
D. Weak
Answer: B

Application-Based

23. Regression helps in:


A. Drawing graphs
B. Forecasting
C. Classification
D. Collection
Answer: B

24. Which is graphical method of correlation?


A. Regression
B. Scatter diagram
C. Mean
D. Median
Answer: B

25. High correlation means:


A. Weak relationship
B. Strong relationship
C. No relation
D. Zero relation
Answer: B
