California Test Score Analysis Worksheet

The document describes a dataset from 420 California school districts that includes test scores, student-teacher ratios, and other variables. It provides a series of questions and answers about analyzing the data: 1) There is a weak negative relationship between student-teacher ratio and average test scores. Predicted scores decrease by around 0.19 points for each one-unit increase in the student-teacher ratio. 2) The regression model uses student-teacher ratio, expenditures per student, computers per student, percent on reduced lunch, percent on CalWorks, district income, and percent English learners as predictors of test scores; percent English learners, percent on reduced lunch, and district income are reported as statistically significant. 3) The model may not reliably predict scores outside the original student-teacher ratio range in the data.

Uploaded by

Rupok Chowdhury

ECO 515 (Summer 2020)

Worksheet -1

Name-Sarah Nuzhat Khan

ID-20175008

THE CALIFORNIA TEST SCORE DATA SET

The data used here are from all 420 K-6 and K-8 districts in California with data available for 1998 and
1999. Test scores are the average of the reading and math scores on the Stanford 9 standardized test
administered to 5th grade students. The student-teacher ratio used here is the number of students in
the district divided by the number of full-time equivalent teachers. All of these data were obtained
from the California Department of Education ([Link]).

Series in Data Set: DIST_CODE: DISTRICT CODE; READ_SCR: AVG READING SCORE; MATH_SCR: AVG
MATH SCORE; COUNTY : COUNTY; DISTRICT: DISTRICT; GR_SPAN: GRADE SPAN OF DISTRICT;
ENRL_TOT : TOTAL ENROLLMENT; TEACHERS: NUMBER OF TEACHERS; COMPUTER: NUMBER OF
COMPUTERS; TESTSCR: AVG TEST SCORE (= (READ_SCR+MATH_SCR)/2 ); COMP_STU: COMPUTERS PER
STUDENT ( = COMPUTER/ENRL_TOT); EXPN_STU: EXPENDITURES PER STUDENT ($’S); STR: STUDENT
TEACHER RATIO (ENRL_TOT/TEACHERS); EL_PCT: PERCENT OF ENGLISH LEARNERS; MEAL_PCT:
PERCENT QUALIFYING FOR REDUCED-PRICE LUNCH; CALW_PCT: PERCENT QUALIFYING FOR CALWORKS;
AVGINC: DISTRICT AVERAGE INCOME (IN $1000'S);

a) Download the data from the student companion website of Stock and Watson. Calculate summary
statistics of STR and TESTSCR. What can you tell from these statistics?

Ans:-

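The summary statistics of STR and TESTSCR show that:

• The student-teacher ratio ranges from a minimum of 14 to a maximum of about 25.8.

• The average test score ranges from a minimum of 605.55 to a maximum of 706.75.

• The standard deviation of the student-teacher ratio is 1.89.

• The standard deviation of the average test score is 19.05.

The minimum and maximum are computed separately for each variable, so these statistics describe the spread of each variable and not the relationship between them.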
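As a rough illustration of how such summary statistics are computed, the following sketch uses Python's standard library. The five ratios are hypothetical values chosen for illustration, not the actual California data, and the helper name `summarize` is an assumption of this example.

```python
import statistics


def summarize(values):
    """Return the min, max, mean and sample standard deviation of a list."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        # Sample standard deviation (divides by n - 1).
        "sd": statistics.stdev(values),
    }


# Hypothetical student-teacher ratios for five districts (NOT the real data).
hypothetical_str = [14.0, 18.5, 19.7, 21.2, 25.8]

stats = summarize(hypothetical_str)
print(stats)
```

Each statistic is computed from one variable at a time, which is why the minimum of STR and the minimum of TESTSCR need not belong to the same district.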

b) Draw a scatter plot of average test scores (testscr) and student-teacher ratio (str). What does the
scatter plot indicate regarding relationship between test scores and class size?

Ans:-

Data from the 420 California school districts show a weak negative relationship between the
student-teacher ratio and test scores: districts with larger classes tend to have slightly lower scores.

c) Run a regression of testscr on str, expn_stu, comp_stu, meal_pct, calw_pct, avginc, el_pct and copy
the STATA output below.

Ans:-
d) Write down the estimated regression line from the STATA output.

Ans:- testscore= 659.59 + ( – 0.189) str + ( 0.00152) expn_stu + (11.89) comp_stu + ( – 0.375) meal_pct
+ ( – 0.077) calw_pct + ( 0.62) avginc + ( – 0.198) el_pct
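The estimated line can be turned into a small prediction function. The sketch below copies the coefficients from the answer above; the district values passed to it are hypothetical and serve only to show the mechanics.

```python
# Coefficients as reported in the estimated regression line above.
COEF = {
    "str": -0.189,
    "expn_stu": 0.00152,
    "comp_stu": 11.89,
    "meal_pct": -0.375,
    "calw_pct": -0.077,
    "avginc": 0.62,
    "el_pct": -0.198,
}
INTERCEPT = 659.59


def predict_testscr(district):
    """Predicted average test score for a dict of regressor values."""
    return INTERCEPT + sum(COEF[name] * district[name] for name in COEF)


# Hypothetical district (illustrative values, not taken from the data set).
district = {
    "str": 20.0, "expn_stu": 5000.0, "comp_stu": 0.1, "meal_pct": 40.0,
    "calw_pct": 10.0, "avginc": 15.0, "el_pct": 10.0,
}
prediction = predict_testscr(district)
print(round(prediction, 2))
```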

e) Explain what the coefficient of str means.

Ans:- The coefficient of str is ( -.18991), which means that for a one-unit increase in the student-teacher
ratio (one more student per teacher), the average test score is expected to decrease by 0.18991 points,
keeping other variables constant.

f) Report the standard error of regression (SER). What are the units of measurement for the SER
(dollars? years? scores? or is it unit free)?

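Ans:- MSE = SS (residual) / df (residual) = 29011.1128 / 412 = 70.41

SER = √MSE = √70.41 = 8.39. The SER is measured in the units of the dependent variable, so here it is in
test score points; it is not unit free.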
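The arithmetic can be checked with a few lines of Python. This is a minimal sketch using only the residual sum of squares and degrees of freedom quoted in the answer; the variable names are chosen for this example.

```python
import math

ss_residual = 29011.1128  # residual sum of squares from the answer above
df_residual = 412         # 420 observations - 7 regressors - 1 intercept

mse = ss_residual / df_residual  # mean squared error
ser = math.sqrt(mse)             # standard error of the regression

print(round(mse, 2), round(ser, 2))
```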


g) Report the regression adjusted coefficient of determination. What are the units of measurement
for this coefficient (dollars? years? scores? or is it unit free)? What does it mean? Why should we use
this instead of simple coefficient of determination?

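Ans:- Adjusted R-squared = 1- {(ss(residual)/df(residual))/(ss(total)/df(total))} = 1- (70.41/363.03) = 0.8060.
It is unit free, because it is a ratio of two variances measured in the same units.

It means that about 80.6% of the sample variance of test scores is explained by the regressors, after
adjusting for degrees of freedom. Adjusted R-squared is a modified version of R-squared. The simple
R-squared never decreases when a regressor is added, even a useless one, so it can overstate the fit of a
model with many variables. Adjusted R-squared penalizes each additional regressor and falls when a
regressor that adds little explanatory power is included. Therefore, we use adjusted R-squared to
compare models with different numbers of regressors.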
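The adjusted R-squared figure can be reproduced from the numbers quoted in the answer. The sketch below assumes that 363.03 is the total mean square, that is, the total sum of squares divided by its 419 degrees of freedom.

```python
ss_residual = 29011.1128  # residual sum of squares
df_residual = 412         # residual degrees of freedom
ms_total = 363.03         # total mean square (assumed: SS total / 419)

ms_residual = ss_residual / df_residual
adj_r_squared = 1 - ms_residual / ms_total

print(round(adj_r_squared, 4))
```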

h) Last year a classroom had 19 students and this year it has 23 students. What is the regression’s
prediction for the change in the classroom average test score?

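Ans:- The coefficient of str is -0.189. Treating the classroom as having one teacher, the student-teacher
ratio rises by 23 - 19 = 4, so the predicted change in the average test score is (-0.189 * 4) = -0.756, that is,
a decrease of about 0.756 points, holding other variables constant.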
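The same calculation as a short sketch, assuming one teacher per classroom so that the change in class size equals the change in the student-teacher ratio:

```python
coef_str = -0.189       # estimated coefficient on str
students_last_year = 19
students_this_year = 23

# With one teacher per classroom, class size equals the student-teacher ratio.
change_in_str = students_this_year - students_last_year
predicted_change = coef_str * change_in_str

print(round(predicted_change, 3))
```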

i) The student-teacher ratio has a minimum value of 14 and a maximum value of 25.8 in the 420
classrooms. Will the regression give reliable predictions for a class with 35 students? Why or why
not?

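Ans:- The regression will not give reliable predictions for a class with 35 students. A value of 35 lies far
outside the range of student-teacher ratios observed in the sample (14 to 25.8), so the prediction would
be an extrapolation: the data contain no information on whether the estimated linear relationship still
holds for such large classes.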
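A simple guard against extrapolation is to compare the value of interest with the sample range before predicting. The sketch below uses the minimum and maximum stated in the question; the function name `is_extrapolation` is hypothetical.

```python
STR_MIN, STR_MAX = 14.0, 25.8  # sample range of str given in the question


def is_extrapolation(str_value, low=STR_MIN, high=STR_MAX):
    """Return True if str_value falls outside the observed sample range."""
    return not (low <= str_value <= high)


print(is_extrapolation(35))  # class size asked about in the question
print(is_extrapolation(19))  # a value inside the sample range
```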

j) Based on your results, can you argue that a smaller class size will increase student test scores on
average?

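Ans:-The estimated coefficient of str is negative (-0.189), so the point estimate is consistent with
smaller classes raising test scores. However, the effect is small, and str is not among the variables
identified as statistically significant in part (k), so on these results we cannot reject the hypothesis
that class size has no effect once the other variables are held constant. The data also contain no
information on how districts with extremely small classes perform, so they are not a reliable basis for
predicting the effect of a radical move to a very low student-teacher ratio. Hence, these results do not
allow us to argue confidently that a smaller class size will increase test scores on average.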
k) Which explanatory variables are statistically significant? Use the p-value approach and explain their
signs.

Ans:-Variables whose p-value is less than 5% are statistically significant.

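In this model they are: Percent of English learners (-ve), Percent qualifying for reduced-price lunch (-ve),
District average income (+ve). The signs follow the estimated coefficients: a higher share of English
learners (-0.198) or of students qualifying for reduced-price lunch (-0.375) is associated with lower test
scores, while a higher district average income (0.62) is associated with higher test scores.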

l) Is the model valid? Conduct the test of validity and comment.

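Ans:-The validity of the model as a whole is tested with the overall F-test. H0: all slope coefficients are
zero; H1: at least one slope coefficient is non-zero. If the p-value of the F-statistic (Prob > F in the
STATA output) is less than 5%, we reject H0 and conclude that the model is valid. Several coefficients are
individually significant and the adjusted R-squared is 0.806, which is consistent with rejecting H0.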
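Since the STATA output is not reproduced here, the overall F-statistic can only be approximated from the sums of squares quoted in the earlier answers. The sketch below assumes that 363.03 is the total mean square (total sum of squares divided by 419) and that the model has 7 regressors; because that input is rounded, the result is approximate.

```python
n = 420                    # number of districts
k = 7                      # number of regressors (excluding the intercept)
ss_residual = 29011.1128   # residual sum of squares from part (f)
ms_total = 363.03          # total mean square from part (g), rounded

df_residual = n - k - 1
ss_total = ms_total * (n - 1)
ss_model = ss_total - ss_residual

# Overall F-statistic for H0: all slope coefficients are zero.
f_stat = (ss_model / k) / (ss_residual / df_residual)

print(round(f_stat, 1))
```

Under these assumptions the statistic comes out on the order of 250, far above the 5% critical value of the F(7, 412) distribution, which is roughly 2, so H0 would be rejected.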

Common questions

The summary statistics describe each variable separately. The student-teacher ratio ranges from a minimum of 14 to a maximum of about 25.8, and the average test score ranges from 605.55 to 706.75. On their own these ranges do not show how the two variables move together; the weak negative relationship, in which larger classes are associated with slightly lower scores, comes from the scatter plot and the regression.

The model predicts that an increase in the classroom size from 19 to 23 students will result in a decrease in the average test score by approximately 0.756 points. This prediction is based on the student-teacher ratio coefficient of -0.189 for each additional student, multiplied by the 4 additional students, yielding a score reduction of 0.756 points, assuming other factors remain constant.

The adjusted R-squared is preferred over the simple R-squared because it adjusts for the number of predictors in the model, providing a more accurate measure of the model’s explanatory power. The adjusted R-squared is 0.8060, which implies that approximately 80.6% of the variance in the average test scores is explained by the independent variables in the model. This high value suggests a strong fit for the model, but it's also important because it does not automatically increase with the addition of new variables like R-squared does, hence avoiding overestimation of model effectiveness when irrelevant variables are added.

The predictive reliability of the regression model is low when applied to classrooms with a 35:1 student-teacher ratio because the model is based on data with a maximum ratio of 25.8. Extrapolating beyond this observed data range is unreliable since the model does not account for behavioral patterns in extreme class sizes, potentially leading to inaccurate predictions.

The statistically significant variables include the percent of English learners (with a negative sign), the percent qualifying for reduced-price lunch (with a negative sign), and district average income (with a positive sign). This suggests that higher percentages of English learners and of students qualifying for reduced-price lunch are associated with lower test scores, while higher district average income is associated with higher test scores, highlighting socio-economic factors' impact on educational performance.

The standard error of regression (SER) measures the typical distance that the observed values fall from the regression line. A smaller SER indicates higher model accuracy. In this model, the SER is 8.39 and is measured in the units of the dependent variable, so it is expressed in test score points and is not unit free. Compared with the standard deviation of test scores (19.05), this value suggests that the model's predictions are generally close to the actual data points, though interpretation should still be cautious and contextually aware.

The coefficient of the student-teacher ratio in the regression analysis is -0.189. This indicates that for each additional student per teacher (i.e., an increase in the student-teacher ratio), the average test score is expected to decrease by approximately 0.189 points, assuming all other factors remain constant. This coefficient quantifies the estimated negative association between class size and test performance.

The scatter plot demonstrates a weak negative relationship between student-teacher ratios and average test scores. As class sizes increase (higher student-teacher ratios), test scores tend to decrease slightly, indicating that larger class sizes might adversely affect student performance. However, the relationship is weak, suggesting that other factors also significantly influence test scores.

The regression results alone do not provide sufficient evidence to argue that smaller class sizes significantly increase average student test scores. While the model shows a weak negative correlation between class size and test scores, it lacks data from districts with extremely small classes and does not control for all potential confounding factors. Other influences might also be at play, making it inaccurate to directly infer causality from class size to test score improvements based solely on this dataset.

The validity of the model as a whole is assessed with the overall F-test, whose null hypothesis is that all slope coefficients are zero. The presence of individually significant coefficients, with p-values below 5%, and the high adjusted R-squared are consistent with rejecting that null hypothesis, so the chosen variables jointly explain variance in average test scores. It remains important to recognize limitations in extrapolating these findings to unobserved contexts or drawing causal inferences.