Remedial Measures for Model Inadequacy

Chapter 5 discusses remedial measures for model inadequacy in linear regression, focusing on issues like multicollinearity, variance-stabilizing transformations, and the treatment of influential observations. It outlines methods for detecting and addressing multicollinearity, including the use of Variance Inflation Factor (VIF), and introduces various transformations to stabilize variance and linearize relationships. Additionally, the chapter covers Generalized Least Squares (GLS) and Weighted Least Squares (WLS) as techniques to handle violations of OLS assumptions, along with strategies for identifying and managing influential observations.

Uploaded by

bisratengda613

Chapter 5

REMEDIAL MEASURES OF
MODEL INADEQUACY
Introduction
• Chapter 4 presented several techniques for checking the adequacy of the
linear regression model.
• Recall that regression model fitting has several implicit assumptions,
including the following:
  - The model errors have constant variance and are uncorrelated.
  - The model errors have a normal distribution; this assumption is needed to
    conduct hypothesis tests and construct confidence intervals (CIs). Under
    normality, uncorrelated errors are also independent.
  - The model is linear, at least in the parameters.
• Plots of residuals are very powerful methods for detecting violations of
these basic regression assumptions.
• This form of model adequacy checking should be conducted for every
regression model that is under serious consideration for use in practice.
• In this chapter, we focus on methods and procedures for building
regression models when some of the above assumptions are violated.
5.1 Multicollinearity
• Multicollinearity occurs in statistical modelling, specifically in multiple
regression analysis, when two or more independent variables are highly correlated
with one another.
• This high correlation undermines the statistical significance of the independent
variables, making it difficult to isolate their individual effects on the dependent
variable.
Key Characteristics of Multicollinearity:
• High Correlation Between Variables: Independent variables have strong
linear relationships with each other.
• Unstable Coefficients: Regression coefficients become unstable, meaning small
changes in the data can lead to large changes in the coefficients.
• Difficulty in Interpretation: It becomes challenging to determine the true effect
of each independent variable on the dependent variable.
Indicators of Multicollinearity:
1. Variance Inflation Factor (VIF): A common metric; a VIF value > 10 suggests
high multicollinearity.
2. High R-squared with Few Significant Variables: The model has a high
overall R-squared, but individual variables are not statistically significant.
3. Correlation Matrix: Pairwise correlations between independent variables are
close to 1 or -1.
Why Multicollinearity is Problematic:
• It inflates the standard errors of the coefficients, reducing their statistical
significance.
• Makes the model less robust to changes in the dataset.
• Complicates interpretation, as the model cannot distinguish the individual effects
of correlated variables.
How to Detect Multicollinearity:
1. Calculate VIF for all predictors.
2. Examine the condition number (a large value indicates multicollinearity).
3. Analyze the correlation matrix for strong relationships.
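The correlation-matrix and condition-number checks above can be sketched with NumPy. This is a minimal illustration on simulated data, not a prescribed procedure: the variable names (x1, x2, x3) and the noise level are assumptions made for the example, and the condition number is computed on standardized predictors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly a copy of x1 (simulated collinearity)
x3 = rng.normal(size=n)               # unrelated predictor
X = np.column_stack([x1, x2, x3])

# Pairwise correlations between the predictors (columns of X)
corr = np.corrcoef(X, rowvar=False)

# Condition number of the standardized predictor matrix:
# ratio of the largest to the smallest singular value
Z = (X - X.mean(axis=0)) / X.std(axis=0)
cond = np.linalg.cond(Z)

print(np.round(corr, 3))
print("condition number:", round(float(cond), 1))
```

The off-diagonal entry for x1 and x2 is close to 1, and the condition number is large compared with what the same code gives when x2 is replaced by an independent variable.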
How to Handle Multicollinearity:
1. Drop Variables: Remove one or more highly correlated variables.
2. Combine Variables: Use dimensionality reduction techniques like Principal
Component Analysis (PCA).
3. Regularization: Apply techniques like Ridge Regression or Lasso to mitigate
multicollinearity.
4. Transform Variables: Re-express variables (e.g., mean-center them before
forming polynomial or interaction terms) to reduce correlation.
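As a rough illustration of the regularization option, the sketch below compares ordinary least squares with ridge regression on two nearly collinear simulated predictors. The closed form (Z'Z + kI)^{-1} Z'y is the standard ridge estimator for standardized predictors and a centered response; the penalty k = 10 and all data values are arbitrary assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)          # nearly collinear with x1
y = 1.0 + x1 + x2 + rng.normal(size=n)

X = np.column_stack([x1, x2])
Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardized predictors
yc = y - y.mean()                            # centered response, so no intercept is needed

def ridge(Z, yc, k):
    """Ridge estimate (Z'Z + k I)^{-1} Z'y; k = 0 gives ordinary least squares."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)

beta_ols = ridge(Z, yc, 0.0)
beta_ridge = ridge(Z, yc, 10.0)              # k = 10 is an arbitrary illustrative penalty

print("OLS:  ", np.round(beta_ols, 3))
print("ridge:", np.round(beta_ridge, 3))
```

The ridge coefficients have a smaller norm and lie closer to each other than the OLS coefficients, which is the stabilizing effect the penalty is meant to provide. In practice k would be chosen by cross-validation or a ridge trace rather than fixed in advance.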
Variance Inflation Factor (VIF)
• The Variance Inflation Factor (VIF) quantifies how much the variance of a
regression coefficient is inflated due to multicollinearity in the model.
• It helps identify multicollinearity by measuring how strongly an independent
variable is correlated with the other independent variables in the model.
For a predictor $X_i$, the VIF is given by: $\mathrm{VIF}_i = \dfrac{1}{1 - R_i^2}$
Where:
• $R_i^2$ is the coefficient of determination when $X_i$ is regressed on all other predictors.
• $R_i^2$ indicates how well $X_i$ is explained by the other predictors. A high $R_i^2$ implies
that $X_i$ is highly correlated with the other variables, leading to high
multicollinearity.
Interpreting VIF
• VIF = 1: No multicollinearity.
• 1 < VIF ≤ 5: Moderate correlation; acceptable.
• VIF > 5: High correlation; potentially problematic.
• VIF > 10: Severe multicollinearity; action needed
Steps to Calculate VIF
1. Fit a Regression Model: Fit a regression model with the dependent variable (Y)
and independent variables ($X_1, X_2, \ldots, X_p$).
2. Calculate $R_i^2$ for Each Predictor: For each independent variable $X_i$, regress it
on all other predictors ($X_j$, $j \neq i$).
3. Compute VIF: Use the formula to calculate VIF for each variable.
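The steps above can be sketched directly in NumPy. The helper name vif and the simulated predictors are assumptions made for this example; each auxiliary regression includes an intercept, which is the usual convention.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (n x p, without an intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])        # auxiliary regression with intercept
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)                         # VIF_i = 1 / (1 - R_i^2)
    return out

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)               # unrelated predictor
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
print(np.round(vifs, 2))
```

As a cross-check, the same values appear on the diagonal of the inverse of the predictors' correlation matrix, which avoids the loop when only the VIFs are needed.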
How to Reduce VIF
• Drop Variables: Remove one or more highly correlated predictors.
• Combine Variables: Create a composite variable, e.g., sum or average.
• Regularization: Use Ridge or Lasso regression.
• Center Variables: Mean-center predictors to reduce multicollinearity caused by
polynomial terms or interactions.
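The effect of mean-centering on a polynomial term can be checked numerically. In this small sketch the predictor range (10 to 20) is an arbitrary assumption; with only two predictors (x and its square), the VIF reduces to 1 / (1 - r^2), where r is their correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(10, 20, size=300)      # predictor whose values lie far from zero

# Correlation between x and x^2 before and after mean-centering
raw_corr = np.corrcoef(x, x ** 2)[0, 1]
xc = x - x.mean()
centered_corr = np.corrcoef(xc, xc ** 2)[0, 1]

# With two predictors, VIF = 1 / (1 - r^2)
vif_raw = 1.0 / (1.0 - raw_corr ** 2)
vif_centered = 1.0 / (1.0 - centered_corr ** 2)

print("raw:      r =", round(float(raw_corr), 4), " VIF =", round(float(vif_raw), 1))
print("centered: r =", round(float(centered_corr), 4), " VIF =", round(float(vif_centered), 2))
```

Centering helps here because the predictor is roughly symmetric about its mean; it does not remove collinearity between genuinely different predictors.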
Variance-Stabilizing transformations
• Variance-stabilizing transformations (VSTs) are mathematical
transformations applied to data to stabilize the variance across the range
of a dataset.
• These transformations are commonly used in statistical modelling and
analysis when the variance of the dependent variable is not constant,
violating the assumption of homoscedasticity in regression and other
parametric methods.
Why Use Variance-Stabilizing Transformations?
• Stabilize Variance: Ensure that the variance remains consistent across
different levels of the data.
• Improve Model Fit: Help meet assumptions of linear regression and ANOVA,
such as constant variance.
• Normalize Data: In some cases, VSTs also help make data more symmetric and
closer to a normal distribution.
• Interpretability: Transformations can simplify relationships between variables,
making trends clearer.
Common Variance-Stabilizing Transformations
1. Logarithmic Transformation: $Y' = \log(Y)$
• Use When: The standard deviation is proportional to the mean, i.e., the variance
increases with the square of the mean (e.g., exponential or multiplicative data).
• Example: Count data with large ranges, like population sizes.
2. Square Root Transformation: $Y' = \sqrt{Y}$
• Use When: Variance is proportional to the mean.
• Example: Data based on counts, such as number of occurrences.
3. Reciprocal Transformation: $Y' = 1/Y$
• Use When: The standard deviation is proportional to the square of the mean, so
the variance grows very rapidly with the mean.
• Example: When high values dominate and need reduction.
4. Box-Cox Transformation (requires Y > 0):
$Y' = \begin{cases} \dfrac{Y^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0 \\ \log Y, & \text{if } \lambda = 0 \end{cases}$
• Use When: A family of transformations is needed to find the best stabilizing
power.
• Example: Data with non-constant variance, such as skewed data.
5. Arcsine Transformation (for proportions or percentages): $Y' = \arcsin(\sqrt{Y})$
• Use When: Data represent proportions or percentages.
• Example: Proportion of successes in binary outcomes.
6. Logit Transformation (for proportions): $Y' = \log\left(\dfrac{Y}{1 - Y}\right)$
• Use When: Proportional data bounded between 0 and 1.
• Example: Data representing probabilities.
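To see variance stabilization at work for the count-data case, the sketch below simulates Poisson counts (variance equal to the mean) at several mean levels and compares the variance before and after a square-root transformation. The mean levels and sample size are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
means = [5, 20, 80, 320]          # illustrative mean levels

raw_var = []
sqrt_var = []
for m in means:
    y = rng.poisson(m, size=20000)        # counts whose variance equals the mean
    raw_var.append(y.var())
    sqrt_var.append(np.sqrt(y).var())     # variance after the square-root transform

for m, rv, sv in zip(means, raw_var, sqrt_var):
    print(f"mean={m:4d}  var(Y)={rv:8.2f}  var(sqrt(Y))={sv:.3f}")
```

The raw variance grows in step with the mean, while the variance of the square-rooted counts stays roughly constant across the mean levels.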
Choosing a Transformation
1. Plot the Data: Create scatterplots or residual plots to identify patterns in
variance.
2. Check for Normality: Use histograms or Q-Q plots to examine if a
transformation is needed.
3. Apply Candidate Transformations: Test different transformations to see
which stabilizes variance best.
4. Box-Cox or Yeo-Johnson: Use these systematic approaches to select an optimal
transformation automatically.
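One common way to choose the Box-Cox power is to maximize the profile log-likelihood over a grid of λ values. The sketch below does this with NumPy on simulated data in which log(Y) is linear in x, so a value of λ near 0 is expected. The function names, the grid, and the data-generating values are all assumptions made for this example.

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform of a positive array y for a given power lam."""
    return np.log(y) if abs(lam) < 1e-12 else (y ** lam - 1.0) / lam

def profile_loglik(y, X, lam):
    """Profile log-likelihood of lam (up to a constant) for a linear model on the transformed y."""
    z = boxcox(y, lam)
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    rss = np.sum((z - X @ coef) ** 2)
    n = len(y)
    # The second term is the log-Jacobian of the transformation
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(0, 2, size=n)
y = np.exp(1.0 + 0.8 * x + 0.3 * rng.normal(size=n))   # log(y) is linear in x
X = np.column_stack([np.ones(n), x])

grid = np.linspace(-2, 2, 81)
ll = np.array([profile_loglik(y, X, lam) for lam in grid])
best_lam = grid[np.argmax(ll)]
print("lambda with highest profile log-likelihood:", round(float(best_lam), 2))
```

In practice the selected λ is usually rounded to a nearby interpretable value (for example 0 for the log or 0.5 for the square root) rather than used to many decimals.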
Transformations to Linearize the Model
• Transformations are often applied to non-linear relationships to linearize
them, enabling the use of linear regression or simplifying the analysis.
• By applying appropriate mathematical transformations to the dependent
(Y) and/or independent variables (X), a non-linear model can often be
transformed into a linear one.
Common Non-Linear Relationships and Their Transformations
Here are examples of common non-linear relationships and how they can be
transformed:
1. Exponential Relationship ($Y = A e^{bX}$)
Example: Population growth, radioactive decay.
Linear Form: $\log(Y) = \log(A) + bX$
Transformation: Apply the logarithm to Y.
Steps to Apply Transformations
1. Understand the Relationship: Plot the data to identify non-linear patterns.
Use scatterplots or pairplots to visualize relationships.
2. Choose the Appropriate Transformation: Based on the observed pattern,
select the transformation (e.g., log, square root).
3. Apply the Transformation: Transform Y, X, or both as needed.
4. Fit the Linear Model: Use the transformed variables in a linear regression
model.
5. Evaluate the Model: Check residual plots and $R^2$ values to ensure the model
is well-fit.
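The steps above can be sketched for the exponential relationship $Y = A e^{bX}$. The data are simulated with multiplicative noise, and the true values A = 2 and b = 0.5 are hypothetical choices for the example. Taking logs of Y gives a straight-line model whose intercept estimates log(A) and whose slope estimates b.

```python
import numpy as np

rng = np.random.default_rng(4)
A_true, b_true = 2.0, 0.5                 # hypothetical values for the simulation
x = np.linspace(0, 5, 60)
y = A_true * np.exp(b_true * x) * np.exp(0.1 * rng.normal(size=x.size))

# Transform Y, then fit the linear model log(Y) = log(A) + b X
log_y = np.log(y)
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
log_A_hat, b_hat = coef
A_hat = np.exp(log_A_hat)                 # back-transform the intercept

# Evaluate the fit on the transformed scale
resid = log_y - X @ coef
r2 = 1.0 - resid @ resid / np.sum((log_y - log_y.mean()) ** 2)

print("A_hat =", round(float(A_hat), 3), " b_hat =", round(float(b_hat), 3), " R^2 =", round(float(r2), 3))
```

Note that the $R^2$ here refers to the log scale, so it is not directly comparable with the $R^2$ of a model fitted to the untransformed Y.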
Generalized and weighted least-squares
• Both Generalized Least Squares (GLS) and Weighted Least Squares
(WLS) are extensions of Ordinary Least Squares (OLS) regression
designed to handle violations of the standard OLS assumptions,
particularly when:
• The variance of the residuals is not constant (heteroscedasticity).
• The residuals are correlated (autocorrelation or serial correlation).
Generalized Least Squares (GLS)
• GLS generalizes OLS to account for correlations and non-constant
variances in the residuals by transforming the data so that the
transformed residuals satisfy the assumptions of OLS (i.e.,
homoscedasticity and no correlation).
Model Assumptions
Residuals ($\epsilon$) have a covariance matrix Σ, which is not (a multiple of)
the identity matrix.
Σ describes the structure of heteroscedasticity or correlation among
residuals.
GLS Transformation
1. Estimate the covariance structure Σ (if not known).
2. Transform the data:
$Y^* = \Sigma^{-1/2} Y$
$X^* = \Sigma^{-1/2} X$
3. Perform OLS on the transformed data:
$\hat{\beta}_{GLS} = (X^{*\prime} X^*)^{-1} X^{*\prime} Y^* = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} Y$
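A minimal numerical sketch of the GLS transformation follows, assuming Σ is known and has an AR(1) correlation structure (an assumption chosen for the example). Instead of the symmetric square root $\Sigma^{-1/2}$, the code whitens with the inverse Cholesky factor, which is a different matrix but yields the same estimator because it also satisfies $L^{-1\prime} L^{-1} = \Sigma^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, rho = 100, 0.7
idx = np.arange(n)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])   # AR(1) correlation matrix, assumed known

X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])                     # hypothetical coefficients
L = np.linalg.cholesky(Sigma)                        # Sigma = L L'
y = X @ beta_true + L @ rng.normal(size=n)           # errors with covariance Sigma

# Whitening step: premultiply by L^{-1}, then run OLS on the transformed data
X_star = np.linalg.solve(L, X)
y_star = np.linalg.solve(L, y)
beta_whitened, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Direct formula (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y
Sigma_inv = np.linalg.inv(Sigma)
beta_direct = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)

print(np.round(beta_whitened, 4), np.round(beta_direct, 4))
```

Both routes give the same coefficients. When Σ must be estimated from the data, the same code applies with the estimate substituted, which is usually called feasible GLS.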
Advantages
Provides unbiased and efficient estimates of β when Σ is known or correctly specified.
Handles non-spherical error structures.
Challenges
Requires knowledge or estimation of Σ.
Estimation errors in Σ can affect results.

Weighted Least Squares (WLS)


Overview
WLS is a special case of GLS that assumes heteroscedasticity (non-constant
variance) but no correlation among residuals. It weights observations
differently based on the inverse of their variance to stabilize variance.
Model Assumptions
• Residuals have a diagonal covariance matrix: $\mathrm{Var}(\epsilon_i) = \sigma_i^2$, $i = 1, 2, \ldots, n$
• Larger weights are assigned to observations with smaller variance.
WLS Transformation
• The WLS estimator minimizes the following objective:
$S(\beta) = \sum_{i=1}^{n} w_i (Y_i - x_i'\beta)^2$
• Where: $w_i = 1/\sigma_i^2$ is the weight assigned to each observation and $x_i'$ is
the i-th row of X.
Steps to Apply WLS
1. Estimate the weights $w_i$ (often based on residuals from an initial OLS
model).
2. Multiply each observation (the response and every predictor, including the
intercept column) by $\sqrt{w_i}$ to transform the data.
3. Perform OLS on the weighted data.
$\hat{\beta}_{WLS} = (X'WX)^{-1} X'WY$, where W is a diagonal matrix of weights $w_i$.
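The WLS steps can be sketched as follows, under the simplifying assumption that the error standard deviations are known and proportional to x (in practice they would be estimated). The code shows that scaling each row by the square root of its weight and running OLS gives the same answer as the matrix formula.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x = rng.uniform(1, 10, size=n)
sigma = 0.5 * x                                   # error sd grows with x (assumed known here)
y = 3.0 + 1.5 * x + sigma * rng.normal(size=n)    # hypothetical true line: 3 + 1.5 x

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma ** 2                              # weights are inverse variances

# Route 1: scale every row of X and y by sqrt(w_i), then run OLS
sw = np.sqrt(w)
beta_scaled, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

# Route 2: matrix formula (X' W X)^{-1} X' W y
W = np.diag(w)
beta_matrix = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

print(np.round(beta_scaled, 4), np.round(beta_matrix, 4))
```

Route 1 is usually preferable for large n because it avoids building the n x n matrix W.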
Evaluate the Model
Check if heteroscedasticity has been corrected:
Residual diagnostics (residual plots).
Statistical tests for homoscedasticity.
Interpret Results
• Analyse coefficients and their statistical significance.
Note that WLS improves efficiency but does not change the interpretation of
coefficients compared to OLS.
Treatment of Influential Observations
• Influential observations are data points that have a disproportionate
impact on the results of a statistical model, such as regression.
• They may distort parameter estimates, predictions, and model
assumptions, making it essential to identify and address them.
Steps to Treat Influential Observations
1. Identification of Influential Observations
Common Metrics
Leverage (hii):
• Measures how far an observation is from the centre of X values.
• Rule of thumb: High leverage if $h_{ii} > 2p/n$, where p is the number of
predictors (including the intercept) and n is the number of observations.
Cook’s Distance (Di):
• Assesses how much a data point influences all regression coefficients.
• Rule of thumb: High influence if $D_i > 1$.
Studentized Residuals:
• Residuals divided by their estimated standard deviation.
• Rule of thumb: Observations are problematic if the absolute value of the
Studentized residual exceeds 2 or 3.
DFBETAs:
• Measures the impact of a data point on each regression coefficient.
• Rule of thumb: Large influence if $|\mathrm{DFBETA}| > 2/\sqrt{n}$.
DFFITS:
• Measures the influence of a data point on its fitted value.
• Rule of thumb: Influential if $|\mathrm{DFFITS}| > 2\sqrt{p/n}$.
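The metrics above can all be computed from a single OLS fit using the hat matrix. The sketch below is one possible NumPy implementation, with a hypothetical helper name (influence_measures) and simulated data in which one high-leverage point is deliberately planted far off the line. It uses externally studentized residuals (leave-one-out variance) for the studentized residuals, DFFITS and DFBETAS.

```python
import numpy as np

def influence_measures(X, y):
    """Leverage, studentized residuals, Cook's D, DFFITS and DFBETAS for OLS.

    X must already contain the intercept column.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.diag(X @ XtX_inv @ X.T)                     # leverages h_ii
    e = y - X @ (XtX_inv @ X.T @ y)                    # ordinary residuals
    mse = e @ e / (n - p)
    r = e / np.sqrt(mse * (1 - h))                     # internally studentized residuals
    cooks = r ** 2 * h / (p * (1 - h))                 # Cook's distance
    s2_i = ((n - p) * mse - e ** 2 / (1 - h)) / (n - p - 1)   # leave-one-out variance
    t = e / np.sqrt(s2_i * (1 - h))                    # externally studentized residuals
    dffits = t * np.sqrt(h / (1 - h))
    delta = (XtX_inv @ X.T) * (e / (1 - h))            # column i holds beta - beta_(i)
    dfbetas = (delta / np.sqrt(np.outer(np.diag(XtX_inv), s2_i))).T   # shape (n, p)
    return h, t, cooks, dffits, dfbetas

rng = np.random.default_rng(7)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * rng.normal(size=n)
x[-1], y[-1] = 8.0, 0.0                                # planted point: high leverage, far off the line
X = np.column_stack([np.ones(n), x])
p = X.shape[1]

h, t, cooks, dffits, dfbetas = influence_measures(X, y)
flag_leverage = h > 2 * p / n
flag_cooks = cooks > 1
flag_dffits = np.abs(dffits) > 2 * np.sqrt(p / n)
flag_dfbetas = np.abs(dfbetas) > 2 / np.sqrt(n)

print("flagged by Cook's distance:", np.where(flag_cooks)[0])
```

The planted observation (the last one) is flagged by every rule of thumb, while the thresholds themselves remain guidelines rather than formal tests.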
Treatment Options
Option 1: Investigate the Cause
Data Entry Errors: Correct any errors in data collection or entry.
Measurement Issues: Assess whether the data point is reliable or an artifact.
Contextual Relevance: Determine if the observation is representative of the
population or an outlier due to special conditions.
Option 2: Transformation
Variable Transformation: Apply transformations like logarithmic or square root
to reduce the impact of extreme values.
Robust Regression: Use methods less sensitive to influential points, such as robust
regression (e.g., Huber regression, M-estimators).
Option 3: Model Adjustment
Weighted Regression: Assign smaller weights to influential observations.
Nonlinear Models: Fit a model better suited to capture non-standard patterns.
Option 4: Exclusion
Exclude Influential Observations:
Remove points only if justified (e.g., they are genuine outliers or irrelevant to the
analysis).
Rerun the analysis and check the impact of exclusion.
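Checking the impact of exclusion amounts to fitting the model twice and comparing the coefficients. A minimal sketch on simulated data follows; the suspect observation and the true line 1 + 2x are hypothetical choices for the example.

```python
import numpy as np

def ols(x, y):
    """Intercept and slope of a simple linear regression of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(8)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * rng.normal(size=n)
x[-1], y[-1] = 8.0, 0.0                    # hypothetical suspect observation

beta_all = ols(x, y)                       # fit with every observation
keep = np.arange(n) != n - 1               # mask that drops the suspect point
beta_excl = ols(x[keep], y[keep])          # fit without it

print("with the point:   ", np.round(beta_all, 3))
print("without the point:", np.round(beta_excl, 3))
```

A large shift in the coefficients, as here, confirms that the point is influential; whether it should actually be removed still depends on the investigation described under Option 1.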
