
Statistical Analysis of Experimental Data

The document presents various statistical experiments and tests, including chi-square tests for bean populations, die rolls, and sales effectiveness of an advertisement campaign. It also discusses correlation coefficients, rank correlation, and measures of central tendency, variation, skewness, and kurtosis. Additionally, it includes practical applications of these statistical concepts in analyzing data from different scenarios.

Uploaded by

Timepass

1. The theory predicts that the population of beans in the four groups A, B, C and D should be
in the ratio 9:3:3:1. In an experiment among 1600 beans, the numbers in the four groups were 882,
313, 287 and 118. Does the experimental result support the theory? (The table value
for 3 degrees of freedom at the 5% level of significance is 7.81.)
2. A die is rolled 100 times with the following distribution. Test whether the die is
unbiased. (The table value for 5 degrees of freedom at the 1% level of significance is 15.086.)
Number               1   2   3   4   5   6
Observed frequency   17  14  20  17  17  15
3. Tata Soaps manufacturing company was distributing a particular brand of soap through a
large number of retail shops. Before a heavy advertisement campaign, the mean sales
per week per shop was 140 dozens. After the campaign, a sample of 26 shops was
taken and the mean sales was found to be 147 dozens with a standard deviation of 16. Can
you consider the advertisement effective? (The one-tailed table value for 25 degrees of
freedom at the 5% level of significance is 1.708.)
4. A machine is designed to produce insulating washers for electrical devices of average
thickness 0.025 cm. A random sample of 10 washers was found to have an average
thickness of 0.024 cm with a standard deviation of 0.002 cm. Test the significance of the
deviation. (The value of t for 9 degrees of freedom at the 5% level of significance is 2.262.)

5. Find the least value of r, in a sample of 18 pairs of observations from a bivariate normal
population, that is significant at the 5% level of significance. (The table value for 16 degrees of
freedom at the 5% level of significance for a two-tailed test is 2.12.)
6. A random sample of 27 pairs of observations from a normal population gives a
correlation coefficient of 0.42. Is it likely that the variables in the population are
uncorrelated? (The table value for 25 degrees of freedom at the 5% level of
significance for a two-tailed test is 2.06.)
7. A random sample of 20 daily workers of State A was found to have an average daily
earning of Rs. 44 with sample variance 900. Another sample of 20 daily workers from
State B was found to earn on average Rs. 30 per day with sample variance 400. Test
whether the workers in State A are earning more than those in State B.
8. A school claimed that its students are more intelligent than the average
[Link]. The IQ scores of 50 students were calculated, giving a mean of 111. The mean of the
population IQ is 100 and the standard deviation is 15. State whether the principal's claim
is right or not at the 5% level of significance. (The Z-score at the 5% level of significance is 1.645.)
9. Calculate the coefficient of correlation for the following data.
x   9   8   7   6   5   4   3   2   1
y   15  16  14  13  11  12  10  8   9
10. The following data show the ages X and systolic B.P. Y of 12 women. Are
the two variables, age X and B.P. Y, correlated?
Age (X)    56   42   72   36   63   47   55   49   38   42   68   60
B.P. (Y)   147  125  160  118  149  128  150  145  115  140  152  155
11. Find the rank correlation coefficient from the following data.
x   10  12  18  18  15  40
y   12  18  25  25  50  25
12. Determine the rank correlation for the following data, which show the marks obtained in
two quizzes in mathematics.
Marks in the 1st quiz (X)   6   5   8   8   7   6   10  4   9   7
Marks in the 2nd quiz (Y)   8   7   7   10  5   8   10  6   8   6
13. Find the Bowley skewness for the following set of data:
No. of pets   No. of families   Cumulative frequency
0             60                60
1             60                120
2             50                170
3             20                190
4             25                215
5             10                225
6 or more     5                 230
14. Find the fallacy, if any, in the following statement: if X is a Poisson variate such
that P(X=2) = 9P(X=4) + 90P(X=6), then the mean of X = 1.
15. A variable X follows a Poisson distribution with variance 3. Calculate (i) P(X=2),
(ii) P(X>=4).
16. For a certain normal distribution, the first moment about 10 is 40 and the fourth moment
about 50 is 48. What are the arithmetic mean and standard deviation of the distribution?

17. X is a normal variate with a mean of 30 and a standard deviation of 5. Find the
probabilities that (i) 26<=X<=40, (ii) X<=45.
18. A trucking company wishes to test the average life of four brands of tyres. The
company uses all the brands on randomly selected trucks. The records showing the lives
(in thousands of miles) of the tyres are given in the table.
Test the hypothesis that the average life for each brand of tyres is the same. Assume
standard deviation = 0.01. Apply a suitable test.
Brand 1   Brand 2   Brand 3   Brand 4
20        19        21        15
23        15        19        17
18        17        20        16
17        20        17        18
19.
Q20. Elucidate in what way measures of central tendency, variation, skewness and
kurtosis are complementary to one another in understanding a frequency
distribution table.
Ans. Measures of central tendency, variation, skewness, and kurtosis are complementary
to each other in understanding a frequency distribution table because they provide
different perspectives on the shape and characteristics of the distribution.

Central tendency measures, such as the mean, median, and mode, provide information
about the typical or average value of the data. They give a sense of where the majority of
the data lies and can be used to compare different datasets.

Variation measures, such as the range, variance, and standard deviation, provide
information about the spread or dispersion of the data. They describe how the data is
distributed around the central tendency and can be used to compare the variability of
different datasets.

Skewness measures, such as the skewness coefficient, provide information about the
symmetry of the distribution. A positively skewed distribution has a longer tail to the right,
while a negatively skewed distribution has a longer tail to the left. Skewness can affect the
interpretation of the mean as a measure of central tendency, and it can be important in
certain applications such as finance and insurance.

Kurtosis measures, such as the kurtosis coefficient, provide information about the
peakedness and, more precisely, the tail weight of the distribution. A distribution with
high kurtosis has heavy tails and typically a sharp peak, while a distribution with low
kurtosis has light tails and a flatter peak. Kurtosis can affect the interpretation of the
variance and standard deviation as measures of variation, and it can be important in
certain applications such as risk management.

Taken together, these measures provide a comprehensive picture of the distribution of
data and can be used to identify patterns, outliers, and other characteristics that may be
relevant to understanding the data. For example, a standard deviation that is large
relative to the mean indicates that the data contain values far above and far below the
average, while a negative skewness coefficient indicates a longer left tail, with most of
the data concentrated in the upper range.

By using a combination of these measures, analysts can gain a deeper understanding of
the underlying patterns and characteristics of the data and make more informed decisions
based on that understanding.
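
As a rough illustration of how the four kinds of measure sit side by side, the sketch below computes the mean, standard deviation, moment coefficient of skewness and moment coefficient of kurtosis for the blood-pressure readings of problem 10. The function name `describe` and the use of population (divide-by-n) moments are assumptions made for this sketch, not something the problem set prescribes.

```python
import math

def describe(data):
    """Summarise a sample with moment-based measures (population moments, divisor n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,  # 0 for a symmetric sample
        "kurtosis": m4 / m2 ** 2,    # 3 for a normal distribution
    }

# Systolic B.P. readings from problem 10
bp = [147, 125, 160, 118, 149, 128, 150, 145, 115, 140, 152, 155]
summary = describe(bp)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reading the four numbers together is the point: the mean locates the data, the standard deviation says how far readings stray from it, and the last two describe the shape around it.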

Common questions


Bowley skewness is calculated using the formula (Q3 + Q1 - 2Q2)/(Q3 - Q1), where Q1, Q2, and Q3 are the first quartile, median, and third quartile, respectively. For a discrete frequency table, each quartile is the first value whose cumulative frequency reaches N/4, N/2, or 3N/4. The coefficient quantifies asymmetry in the data: positive values indicate right skew, negative values indicate left skew, and zero suggests symmetry.
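
The quartile lookup can be sketched for the pets table of problem 13. This is a minimal sketch: the helper name `quantile_from_cumulative` is made up for illustration, and the open-ended class "6 or more" is coded as 6, which is an assumption (it does not affect the quartiles here).

```python
def quantile_from_cumulative(values, cum_freq, fraction):
    """Return the first value whose cumulative frequency reaches fraction * N."""
    target = fraction * cum_freq[-1]
    for value, cf in zip(values, cum_freq):
        if cf >= target:
            return value
    raise ValueError("fraction must lie between 0 and 1")

pets = [0, 1, 2, 3, 4, 5, 6]                    # "6 or more" coded as 6 (assumption)
cum_freq = [60, 120, 170, 190, 215, 225, 230]   # cumulative frequencies, N = 230

q1 = quantile_from_cumulative(pets, cum_freq, 0.25)
q2 = quantile_from_cumulative(pets, cum_freq, 0.50)
q3 = quantile_from_cumulative(pets, cum_freq, 0.75)

bowley = (q3 + q1 - 2 * q2) / (q3 - q1)
print(q1, q2, q3, round(bowley, 3))
```

With these data the quartiles come out as 0, 1 and 3, so the coefficient is 1/3, a moderate right skew.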

The principal's claim is tested using a one-tailed z-test for a sample of 50 students with a mean IQ of 111 against a population mean of 100 and a standard deviation of 15. The z-statistic is (111 - 100) / (15/√50) ≈ 5.19. Since 5.19 > 1.645 (the z-critical value at the 5% significance level), the claim is supported, suggesting the sampled students are more intelligent than the average.
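
The arithmetic behind this z-statistic can be checked with a few lines. The function name `one_sample_z` is a hypothetical helper for this sketch; the inputs are the figures quoted above.

```python
import math

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z statistic for a sample mean when the population standard deviation is known."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

z = one_sample_z(sample_mean=111, pop_mean=100, pop_sd=15, n=50)
Z_CRITICAL = 1.645  # one-tailed 5% value given in the problem

print(f"z = {z:.2f}, reject H0: {z > Z_CRITICAL}")
```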

The effectiveness of the advertisement campaign is evaluated using a one-sample, one-tailed t-test of the post-campaign mean against the earlier mean. The sample mean after the campaign is 147 dozens with a standard deviation of 16, compared to a previous mean of 140 dozens. With 25 degrees of freedom, the t-statistic is calculated as (147 - 140) / (16/√26) ≈ 2.23. Since 2.23 > 1.708 (the one-tailed critical value at the 5% significance level), the results suggest the advertisement was effective in increasing sales.
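
A short sketch of the one-sample t computation follows. It assumes the standard error is s/√n with s taken as given; some textbooks use s/√(n - 1) when s is computed with divisor n, which gives a slightly smaller value. The helper name `one_sample_t` is illustrative only.

```python
import math

def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """t statistic for a single sample mean, using s / sqrt(n) as the standard error."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error

# Advertisement campaign (problem 3): 26 shops, mean 147, s = 16, earlier mean 140
t_sales = one_sample_t(147, 140, 16, 26)

# Washer thickness (problem 4): 10 washers, mean 0.024, s = 0.002, design value 0.025
t_washers = one_sample_t(0.024, 0.025, 0.002, 10)

print(f"sales: t = {t_sales:.2f} on 25 d.f.")
print(f"washers: t = {t_washers:.2f} on 9 d.f.")
```

Each statistic is then compared with the tabulated value for its degrees of freedom, one-tailed for the sales question and two-tailed for the washers.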

The correlation between age and systolic blood pressure is assessed using Pearson's correlation coefficient, which measures the strength and direction of the linear relationship. A coefficient close to +1 or -1 indicates a strong relationship; a t-test based on the sample size and the coefficient then checks whether the correlation differs significantly from zero.
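
To make this concrete, the sketch below computes Pearson's r for the age and blood-pressure data of problem 10 and, as a second check, for the data of problem 9. The function name `pearson_r` is an assumption of this sketch; the formula is the usual ratio of the sum of cross-deviations to the square root of the product of the sums of squared deviations.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Problem 10: ages and systolic B.P. of 12 women
ages = [56, 42, 72, 36, 63, 47, 55, 49, 38, 42, 68, 60]
bp = [147, 125, 160, 118, 149, 128, 150, 145, 115, 140, 152, 155]
r_bp = pearson_r(ages, bp)
t_bp = r_bp * math.sqrt(len(ages) - 2) / math.sqrt(1 - r_bp ** 2)  # 10 d.f.

# Problem 9
x9 = [9, 8, 7, 6, 5, 4, 3, 2, 1]
y9 = [15, 16, 14, 13, 11, 12, 10, 8, 9]
r9 = pearson_r(x9, y9)

print(f"problem 10: r = {r_bp:.3f}, t = {t_bp:.2f}")
print(f"problem 9:  r = {r9:.2f}")
```

Both coefficients are strongly positive (about 0.90 and 0.95), so in each data set the two variables rise together.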

In testing whether variables are uncorrelated, we use the correlation coefficient from the sample. With a sample correlation coefficient of 0.42 from 27 pairs, a t-test for the significance of the correlation is applied; the critical t-value at the 5% significance level is 2.06 for 25 degrees of freedom. Calculating the test statistic t = r√(n-2)/√(1-r²) gives t = 0.42 × 5/√0.8236 ≈ 2.31. Since t > 2.06, we reject the null hypothesis and conclude it is unlikely the variables are uncorrelated.
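
The same statistic also answers problem 5: setting t equal to the critical value and solving for r gives the smallest significant correlation, r = t/√(t² + n - 2). The sketch below evaluates both; the function names are hypothetical.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: population correlation is zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def least_significant_r(t_critical, n):
    """Smallest |r| whose t statistic reaches t_critical (the formula above solved for r)."""
    return t_critical / math.sqrt(t_critical ** 2 + n - 2)

t_6 = correlation_t(0.42, 27)              # problem 6
r_min_5 = least_significant_r(2.12, 18)    # problem 5

print(f"problem 6: t = {t_6:.3f}")
print(f"problem 5: least significant |r| = {r_min_5:.4f}")
```

Plugging the second result back into `correlation_t` with n = 18 returns the critical value 2.12, which is a quick consistency check on the algebra.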

The chi-square goodness-of-fit test compares the observed bean counts (882, 313, 287, and 118, which total the stated 1600) with the counts expected under the 9:3:3:1 ratio for 1600 beans: 900, 300, 300, and 100. The chi-square statistic is computed as Σ((O-E)^2/E), where O is the observed and E is the expected frequency, giving 324/900 + 169/300 + 169/300 + 324/100 ≈ 4.73. Since 4.73 < 7.81 (the critical value for 3 degrees of freedom at 5% significance), we fail to reject the null hypothesis, indicating the experimental results are consistent with the predicted 9:3:3:1 distribution.
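
A minimal sketch of the goodness-of-fit calculation follows, applied to the beans and to the die of problem 2. One assumption is flagged loudly: the second bean count is taken as 313 so that the four counts add up to the stated 1600. The helper names are made up for this sketch.

```python
def expected_from_ratio(ratio, total):
    """Split `total` into expected frequencies proportional to `ratio`."""
    return [total * part / sum(ratio) for part in ratio]

def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Beans: second count assumed to be 313 so the counts total 1600
beans_observed = [882, 313, 287, 118]
beans_expected = expected_from_ratio([9, 3, 3, 1], 1600)   # 900, 300, 300, 100
chi2_beans = chi_square_statistic(beans_observed, beans_expected)

# Die: a fair die gives equal expected frequencies over 100 rolls
die_observed = [17, 14, 20, 17, 17, 15]
die_expected = expected_from_ratio([1] * 6, 100)
chi2_die = chi_square_statistic(die_observed, die_expected)

print(f"beans: chi-square = {chi2_beans:.2f} (table value 7.81, 3 d.f.)")
print(f"die:   chi-square = {chi2_die:.2f} (table value 15.086, 5 d.f.)")
```

Both statistics fall below their table values, so neither the 9:3:3:1 theory nor the fairness of the die is rejected.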

The inference regarding the earnings of daily workers from States A and B requires a comparison of means using a two-sample t-test. The means are 44 and 30 with variances 900 and 400, respectively, each from a sample of 20. Assuming equal population variances and taking the stated variances as unbiased estimates, the pooled variance is (900 + 400)/2 = 650 and t = (44 - 30)/√(650 × (1/20 + 1/20)) = 14/√65 ≈ 1.74. Comparing this value with the one-tailed critical value for 38 degrees of freedom determines whether State A's workers earn significantly more than those in State B.
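
The pooled two-sample statistic can be sketched as below. The treatment of the stated variances as unbiased (divide-by-(n - 1)) estimates is an assumption; if they were computed with divisor n, the pooled variance and hence t would change slightly. The function name is illustrative.

```python
import math

def pooled_two_sample_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Equal-variance two-sample t statistic and its degrees of freedom.

    var_a and var_b are assumed to be unbiased sample variances.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

t_stat, df = pooled_two_sample_t(44, 900, 20, 30, 400, 20)
print(f"t = {t_stat:.3f} on {df} d.f.")
```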

To determine if the machine produces washers significantly different from the designed thickness of 0.025 cm, a two-tailed t-test is used. The sample mean is 0.024 cm with a standard deviation of 0.002 cm for a sample size of 10. The t-statistic is calculated as (0.024 - 0.025) / (0.002/√10) ≈ -1.58. Since |-1.58| < 2.262 (critical value for 9 degrees of freedom), we fail to reject the null hypothesis, indicating no significant deviation from the designed thickness.

Central tendency measures indicate the average or typical value of data, revealing where most data lies, and help compare datasets. Variation measures describe data spread around the central value, aiding in assessing data consistency. Skewness reveals distribution symmetry; positive skewness suggests a rightward tail, affecting mean interpretation. Kurtosis assesses distribution peaks; high kurtosis suggests a sharp peak, impacting variation perception. Together, these measures detail data distribution, helping identify patterns and make informed decisions regarding the dataset.

The rank correlation coefficient, specifically Spearman's rank correlation, is calculated from the ranks of two datasets. Compute the difference in ranks (d) for each pair, then sum the squares of these differences. The coefficient is calculated as 1 - 6Σd²/(n(n²-1)), where n is the number of observations; tied values each receive the average of the ranks they span. This coefficient assesses the monotonic relationship between the datasets, indicating whether high ranks in one correspond to high ranks in the other.
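
Because the data in problems 11 and 12 contain ties, the sketch below uses average ranks and the common textbook tie adjustment, which adds (m³ - m)/12 to Σd² for every group of m tied values. That adjustment is brought in here as an assumption; it is not stated in the text above, and the function names are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m equal values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values())

def spearman_rho(xs, ys):
    """Spearman's rank correlation with the tie-adjusted sum of squared differences."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Problem 11
x = [10, 12, 18, 18, 15, 40]
y = [12, 18, 25, 25, 50, 25]
rho = spearman_rho(x, y)
print(f"rho = {rho:.4f}")
```

For problem 11 the squared rank differences sum to 13.5, the tie terms add 0.5 and 2, and the coefficient is 1 - 96/210 ≈ 0.54. When there are no ties, the adjustment is zero and the function reduces to the plain formula.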
