
CEO Salary and Housing Price Analysis

The document outlines a series of statistical analyses using various datasets, including CEOSAL2, HPRICE1, WAGE2, and others, focusing on relationships between variables such as CEO salaries, housing prices, and wages. It includes tasks such as estimating regression models, interpreting coefficients, and evaluating the significance of variables. Additionally, it addresses issues like heteroscedasticity and serial correlation in regression analysis.

Uploaded by

vedicadhingra
Copyright
© All Rights Reserved

1. The data set in CEOSAL2 contains information on chief executive officers for U.S.
corporations. The variable salary is annual compensation, in thousands of dollars, and
ceoten is prior number of years as company CEO.
i. Find the average salary and the average tenure in the sample.
ii. How many CEOs are in their first year as CEO (that is, ceoten=0)? What is the
longest tenure as a CEO?
iii. Estimate the simple regression model
$\log(\text{salary}) = \beta_0 + \beta_1\,\text{ceoten} + u$
and report your results in the usual form. What is the (approximate) predicted
percentage increase in salary given one more year as a CEO?
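
For part (iii), the "approximate" percentage change in a log-level model is 100 times the slope, while the exact change uses the exponential. The following is a minimal Python sketch of that arithmetic. The function name and the slope value 0.01 are illustrative assumptions, not estimates from CEOSAL2.

```python
import math

def pct_change_from_log_level(b1, delta=1.0):
    """For log(y) = b0 + b1*x + u, return the (approximate, exact)
    percentage change in y when x increases by delta."""
    approx = 100.0 * b1 * delta                    # 100 * b1 * delta
    exact = 100.0 * (math.exp(b1 * delta) - 1.0)   # 100 * (exp(b1 * delta) - 1)
    return approx, exact

# Hypothetical slope chosen for illustration only (NOT a CEOSAL2 estimate).
approx, exact = pct_change_from_log_level(0.01)
print(f"approximate: {approx:.4f}%  exact: {exact:.4f}%")
```

The two figures are close when the slope is small, which is why the approximation is usually reported.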

2. Use the data in HPRICE1 to estimate the model:
$\text{price} = \beta_0 + \beta_1\,\text{sqrft} + \beta_2\,\text{bdrms} + u$
where price is the house price measured in thousands of dollars.
i. Write out the results in equation form.
ii. What is the estimated increase in price for a house with one more bedroom, holding
square footage constant?
iii. What is the estimated increase in price for a house with an additional bedroom that
is 140 square feet in size? Compare this to your answer in part (ii).
iv. What percentage of the variation in price is explained by square footage and number
of bedrooms?
v. The first house in the sample has sqrft = 2,438 and bdrms = 4. Find the predicted
selling price for this house from the OLS regression line.
vi. The actual selling price of the first house in the sample was $300,000 (so
price=300). Find the residual for this house. Does it suggest that the buyer underpaid
or overpaid for the house?
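
Parts (ii), (iii), (v), and (vi) are all arithmetic on the fitted equation, sketched below in Python. The coefficient values are placeholders chosen for illustration, not the HPRICE1 estimates, and the function name is hypothetical. Substitute your own OLS results.

```python
def predict_price(b0, b_sqrft, b_bdrms, sqrft, bdrms):
    """Fitted value from price = b0 + b_sqrft*sqrft + b_bdrms*bdrms (in $1000s)."""
    return b0 + b_sqrft * sqrft + b_bdrms * bdrms

# Placeholder coefficients (assumed for illustration, NOT estimated from HPRICE1).
b0, b_sqrft, b_bdrms = -20.0, 0.13, 15.0

# Part (ii): one more bedroom, square footage held constant.
effect_bedroom_only = b_bdrms

# Part (iii): one more bedroom that also adds 140 square feet.
effect_bedroom_plus_140 = b_bdrms + b_sqrft * 140

# Parts (v) and (vi): prediction and residual for the first house.
predicted = predict_price(b0, b_sqrft, b_bdrms, sqrft=2438, bdrms=4)
residual = 300.0 - predicted  # residual = actual - predicted

print(f"bedroom only: {effect_bedroom_only:.2f}")
print(f"bedroom plus 140 sqft: {effect_bedroom_plus_140:.2f}")
print(f"predicted: {predicted:.2f}, residual: {residual:.2f}")
```

A negative residual means the actual price lies below the regression line, so the buyer paid less than the model predicts for a house with those characteristics.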

3. The file CEOSAL2 contains data on 177 chief executive officers and can be used to
examine the effects of firm performance on CEO salary.
i. Estimate a model relating annual salary to firm sales and market value. Make the
model of the constant elasticity variety for both independent variables. Write the
results out in equation form.
ii. Add profits to the model from part (i). Why can this variable not be included in
logarithmic form? Would you say that these firm performance variables explain
most of the variation in CEO salaries?
iii. Add the variable ceoten to the model in part (ii). What is the estimated percentage
return for another year of CEO tenure, holding other factors fixed?
iv. Find the sample correlation coefficient between the variables log(mktval) and profits.
Are these variables highly correlated? What does this say about the OLS estimators?

4. Use the data set in WAGE2 for this problem. As usual, be sure all of the following
regressions contain an intercept.
i. Run a simple regression of IQ on educ to obtain the slope coefficient, say, $\tilde{\delta}_1$.
ii. Run the simple regression of log(wage) on educ, and obtain the slope coefficient, $\tilde{\beta}_1$.
iii. Run the multiple regression of log(wage) on educ and IQ, and obtain the slope
coefficients, $\hat{\beta}_1$ and $\hat{\beta}_2$, respectively.
iv. Verify that $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$.
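
The identity in part (iv) is an algebraic property of OLS, so it holds in any sample, not only in WAGE2. The Python sketch below checks it on synthetic data using only the standard library. The data-generating numbers and helper names are assumptions made for illustration.

```python
import random

def mean(v):
    return sum(v) / len(v)

def simple_slope(x, y):
    """Slope from an OLS regression of y on an intercept and x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_regressor_slopes(x1, x2, y):
    """Slopes from an OLS regression of y on an intercept, x1, and x2,
    obtained by solving the 2x2 normal equations in deviations from means."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [a - m1 for a in x1]
    d2 = [a - m2 for a in x2]
    dy = [a - my for a in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Synthetic stand-ins for educ, IQ, and log(wage); NOT the WAGE2 data.
random.seed(1)
educ = [random.randint(8, 18) for _ in range(200)]
iq = [70 + 2.5 * e + random.gauss(0, 10) for e in educ]
lwage = [5.0 + 0.04 * e + 0.005 * q + random.gauss(0, 0.3) for e, q in zip(educ, iq)]

delta_tilde = simple_slope(educ, iq)                       # part (i)
beta_tilde = simple_slope(educ, lwage)                     # part (ii)
beta_hat_1, beta_hat_2 = two_regressor_slopes(educ, iq, lwage)  # part (iii)

# Part (iv): the two sides agree up to floating-point rounding.
print(beta_tilde, beta_hat_1 + beta_hat_2 * delta_tilde)
```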

5. The following model can be used to study whether campaign expenditures affect election
outcomes:
$\text{voteA} = \beta_0 + \beta_1 \log(\text{expendA}) + \beta_2 \log(\text{expendB}) + \beta_3\,\text{prtystrA} + u$

where voteA is the percentage of the vote received by Candidate A, expendA and expendB
are campaign expenditures by Candidates A and B, and prtystrA is a measure of party
strength for Candidate A (the percentage of the most recent presidential vote that went to A’s
party).
i. What is the interpretation of $\beta_1$?
ii. In terms of the parameters, state the null hypothesis that a 1% increase in A’s
expenditures is offset by a 1% increase in B’s expenditures.
iii. Estimate the given model using the data in VOTE1 and report the results in the usual
form. Do A’s expenditures affect the outcome? What about B’s expenditures? Can
you use these results to test the hypothesis in part (ii)?
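
The hypothesis in part (ii) restricts a sum of two coefficients, which is why the individually reported standard errors are not enough to test it. A short derivation, with $\theta_1$ introduced here as a label for that sum, may help:

```latex
% Null hypothesis: a 1% rise in A's spending is offset by a 1% rise in B's
H_0:\ \beta_1 + \beta_2 = 0 \quad \text{(equivalently } \beta_2 = -\beta_1 \text{)}

% The t statistic needs the covariance between the two estimators
t = \frac{\hat{\beta}_1 + \hat{\beta}_2}{\operatorname{se}(\hat{\beta}_1 + \hat{\beta}_2)},
\qquad
\operatorname{se}(\hat{\beta}_1 + \hat{\beta}_2)
  = \sqrt{\widehat{\operatorname{Var}}(\hat{\beta}_1)
        + \widehat{\operatorname{Var}}(\hat{\beta}_2)
        + 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_1, \hat{\beta}_2)}

% Reparameterize with \theta_1 = \beta_1 + \beta_2, so \beta_1 = \theta_1 - \beta_2
\text{voteA} = \beta_0 + \theta_1 \log(\text{expendA})
  + \beta_2 \left[ \log(\text{expendB}) - \log(\text{expendA}) \right]
  + \beta_3\,\text{prtystrA} + u
```

Estimating the reparameterized equation gives $\hat{\theta}_1$ and its standard error directly, so the usual t statistic on $\log(\text{expendA})$ tests the hypothesis.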

6. Use the data in KIELMC, only for the year 1981, to answer the following questions. The
data are for houses that sold during 1981 in North Andover, Massachusetts; 1981 was the
year construction began on a local garbage incinerator.
i. To study the effects of the incinerator location on housing price, consider the simple
regression model:
$\log(\text{price}) = \beta_0 + \beta_1 \log(\text{dist}) + u$
where price is housing price in dollars and dist is distance from the house to the
incinerator measured in feet. Interpreting this equation causally, what sign do you expect
for $\beta_1$ if the presence of the incinerator depresses housing prices? Estimate this equation
and interpret the results.
ii. To the simple regression model in part (i), add the variables log(intst), log(area),
log(land), rooms, baths, and age, where intst is distance from the home to the interstate,
area is square footage of the house, land is the lot size in square feet, rooms is total
number of rooms, baths is number of bathrooms, and age is age of the house in years.
Now, what do you conclude about the effects of the incinerator? Explain why (i) and (ii)
give conflicting results.
iii. Add $[\log(\text{intst})]^2$ to the model from part (ii). Now what happens? What do you
conclude about the importance of functional form?
iv. Is the square of log(dist) significant when you add it to the model from part (iii)?

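7. Use the data in WAGE2 for this exercise.
i. Estimate the model:
$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\,\text{tenure} + \beta_4\,\text{married} + \beta_5\,\text{black} + \beta_6\,\text{south} + \beta_7\,\text{urban} + u$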
and report the results in the usual form. Holding other factors fixed, what is the
approximate difference in monthly salary between blacks and nonblacks? Is this
difference statistically significant?
ii. Add the variables $\text{exper}^2$ and $\text{tenure}^2$ to the equation and show that they are jointly
insignificant at even the 20% level.
iii. Extend the original model to allow the return to education to depend on race and test
whether the return to education does depend on race.
iv. Again, start with the original model, but now allow wages to differ across four groups of
people: married and black, married and nonblack, single and black, and single and
nonblack. What is the estimated wage differential between married blacks and married
nonblacks?

8. Use the data in WAGE1 and consider the following regressions:
$\text{wage}_i = \alpha_0 + \alpha_1\,\text{exper}_i + u_i$
$\ln(\text{wage}_i) = \beta_0 + \beta_1 \ln(\text{exper}_i) + u_i$
i. Estimate both regressions.
ii. Obtain the absolute and squared values of the residuals for each regression and plot them
against the explanatory variable. Do you detect any evidence of heteroscedasticity?
iii. Verify your qualitative conclusion in part (ii) with the Glejser, Park, and White tests.
iv. If there is evidence of heteroscedasticity, how would you transform the data to reduce its
severity? Show the necessary calculations.
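
Of the three tests named in part (iii), the Park test is the simplest to compute by hand: regress the log of the squared residuals on the log of the explanatory variable and look at the t statistic on the slope. The Python sketch below does this on synthetic data with the standard library only. The helper names and the simulated data are assumptions for illustration, and the explanatory variable must be strictly positive for the logs to exist.

```python
import math
import random

def ols_simple(x, y):
    """OLS of y on an intercept and x.
    Returns (intercept, slope, slope standard error, residuals)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)  # error variance estimate
    return intercept, slope, math.sqrt(sigma2 / sxx), resid

def park_test(x, y):
    """Park test: regress ln(residual^2) on ln(x).
    Returns (slope, t statistic); a large |t| points to heteroscedasticity."""
    _, _, _, resid = ols_simple(x, y)
    ln_x = [math.log(a) for a in x]
    ln_e2 = [math.log(e * e) for e in resid]
    _, slope, slope_se, _ = ols_simple(ln_x, ln_e2)
    return slope, slope / slope_se

# Synthetic data (NOT WAGE1): the error standard deviation grows with x.
random.seed(0)
x = [random.uniform(1.0, 20.0) for _ in range(500)]
y = [2.0 + 0.5 * a + random.gauss(0.0, 0.3 * a) for a in x]

slope, t_stat = park_test(x, y)
print(f"Park slope = {slope:.2f}, t = {t_stat:.2f}")
```

The Glejser test follows the same pattern with the absolute residuals as the dependent variable, and the White test regresses the squared residuals on the regressor and its square.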

9. Use the data in NYSE for this exercise.
i. Estimate the following regression:
$\text{return}_t = \alpha_0 + \alpha_1\,\text{price}_t + u_t$
ii. Test for the presence of first-order serial correlation using the Durbin-Watson test. (Hint: use
dwtest() from the lmtest package.)
iii. Now add a lagged value of return (return_1) to the above regression and test again for the
presence of first-order autocorrelation.
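
The statistic behind the test in part (ii) is simple to compute from the residuals, and seeing the formula makes its scale easier to read. The Python sketch below is an illustration on made-up residual sequences, not a replacement for the suggested dwtest() call, which also supplies a p-value.

```python
def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Made-up residual sequences for illustration (not from the NYSE regression).
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period
persistent = [1.0, 1.0, -1.0, -1.0]    # neighbors tend to share a sign

print(durbin_watson(alternating))  # 3.0
print(durbin_watson(persistent))   # 1.0
```

The statistic is approximately 2(1 - r), where r is the first-order autocorrelation of the residuals, so values near 2 suggest no first-order serial correlation, values toward 0 suggest positive correlation, and values toward 4 suggest negative correlation. Once a lagged dependent variable is among the regressors, as in part (iii), the Durbin-Watson statistic is biased toward 2, so its result there should be read with caution.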

Common questions


Heteroscedasticity means the error variance is not constant across observations. In the wage analysis, plotting the absolute or squared residuals against experience may show the spread changing with experience. The Glejser, Park, and White tests check for such patterns formally. Remedies include transforming the variables (for example, taking logarithms), reporting heteroscedasticity-robust standard errors, or using weighted (generalized) least squares.

The residual is the actual selling price minus the predicted price. As a hypothetical illustration, a predicted price of $320k against an actual price of $300k gives a residual of -$20k: the model overpredicts, and the buyer paid less than the regression line implies. A negative residual therefore suggests the buyer underpaid relative to the prediction, although the gap may also reflect features of the house that the model omits, such as its condition or neighborhood.

The presence of the local garbage incinerator is expected to have a negative effect on housing prices, all else being equal. This can be evaluated by estimating the model $\log(\text{price}) = \beta_0 + \beta_1 \log(\text{dist}) + u$, where dist is the distance from a house to the incinerator. A positive estimate of $\beta_1$ would imply that housing prices increase with distance, supporting the hypothesis that proximity to the incinerator depresses house prices.

If the added quadratic terms, including tenure squared, are jointly insignificant in the wage model, adding them does not improve the fit in a statistically meaningful way. The data then give no strong evidence against a linear relationship between log(wage) and these variables over the sample range, so the simpler specification is adequate.

Adding the square of log(intst) allows the effect of distance to the interstate on housing prices to be nonlinear in logs. If the term is significant, the linear-in-logs specification was too restrictive, and the estimated coefficient on log(dist) can change once the quadratic is included. This shows how strongly conclusions about the incinerator can depend on functional form.

A high sample correlation between log(mktval) and profits signals multicollinearity, which inflates the variances of the OLS estimators of their coefficients. The estimators remain unbiased and the usual inference remains valid, but the individual coefficients are estimated less precisely, so confidence intervals are wider and t statistics are smaller. Variance inflation factors can quantify the loss of precision.

If the return to education depends on race, the same additional year of schooling yields a different percentage wage gain for different racial groups. The wage model captures this by adding an interaction between educ and black and testing whether its coefficient is zero. A significant interaction would point to unequal returns to similar educational investments, which could motivate further investigation into their causes.

In the model $\log(\text{salary}) = \beta_0 + \beta_1\,\text{ceoten} + u$, the coefficient $\beta_1$ is the change in log(salary) for one more year of tenure, so $100 \cdot \beta_1$ is the approximate percentage change in salary. For example, an estimate of 0.05 would imply roughly a 5% salary increase per additional year as CEO. Because ceoten enters in levels rather than in logs, $\beta_1$ is a semi-elasticity, not an elasticity.

The model $\text{voteA} = \beta_0 + \beta_1 \log(\text{expendA}) + \beta_2 \log(\text{expendB}) + \beta_3\,\text{prtystrA} + u$ relates A's vote share to the logs of both candidates' expenditures. Because voteA is measured in percentage points and spending enters in logs, $\beta_1/100$ is the approximate percentage-point change in A's vote share from a 1% increase in A's expenditures, and $\beta_2/100$ is the corresponding effect of B's spending. The hypothesis that a 1% increase in A’s expenditures is offset by a 1% increase in B’s is $H_0: \beta_1 + \beta_2 = 0$. Testing it requires the standard error of $\hat{\beta}_1 + \hat{\beta}_2$, which the individual t statistics and p-values do not provide.

Adding profits lets the model account for profitability alongside firm sales and market value. Profits cannot enter in logarithmic form because profits are negative for some firms in the sample, and the log of a negative number is undefined. Profits therefore enter in levels, and the coefficient is a semi-elasticity rather than an elasticity. If the model still explains only a small share of the variation in CEO salaries, other unmeasured factors play a significant role.
