Chi-Square and ANOVA Statistical Tests

The document discusses statistical tests including the Chi-Square Test and ANOVA, explaining their purposes and providing examples of problems to solve using these methods. It also covers non-parametric tests such as the Sign Test and Mann-Whitney U test, detailing their applications and significance. Overall, the document serves as a comprehensive guide to various statistical methods for analyzing categorical and continuous data.

Uploaded by

thanujmurthy47

Chi-Square Test

The chi-square test is a statistical test used to determine whether there is a significant
association between categorical variables. It's commonly used to analyze contingency
tables, which display the frequency distribution of two or more categorical variables.

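χ² = Σ (O − E)² / E
where
O = Observed value
E = Expected value = (row total × column total) / grand total
Degrees of freedom = (c − 1)(r − 1), where c is the number of columns and r is the number of rows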
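
The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name chi_square is an assumption, the counts are the typhoid figures from Problem 3 below, and 3.841 is the standard tabulated critical value of χ² for 1 degree of freedom at the 5% level.

```python
def chi_square(observed):
    """Return (chi2, df, expected) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-square statistic = sum over all cells of (O - E)^2 / E.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (c - 1)(r - 1).
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df, expected


# Rows: drug, no drug. Columns: infection, no infection (Problem 3).
observed = [[144, 312], [192, 72]]
chi2, df, expected = chi_square(observed)

CRITICAL_5_PERCENT_DF1 = 3.841  # tabulated value for df = 1, alpha = 0.05
print(f"chi2 = {chi2:.3f}, df = {df}")
print("Reject H0" if chi2 > CRITICAL_5_PERCENT_DF1 else "Do not reject H0")
```

A computed value larger than the critical value leads to rejecting the null hypothesis that the two classifications are independent.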
Problems
1. Out of a sample of 120 persons in a village, 76 were administered a new drug for
preventing influenza and out of them 24 persons were attacked by influenza. Out of those
who were not administered the new drug, 12 persons were not affected by influenza.
a. Prepare 2 × 2 tables showing the actual and expected frequencies.
b. Use the chi-square test to find out whether the new drug is effective or not.
2. In a sample of 2,000 families, 1,400 families are consumers of tea. Out of 1,800 Hindu
families, 1,236 families consume tea. Use the χ² test and state whether there is any significant
difference between the consumption of tea among Hindu and non-Hindu families.
3. A certain drug was administered to 456 males, out of a total of 720, in a certain locality to
test its efficacy against typhoid. The incidence of typhoid is shown below. Find out the
effectiveness of the drug against the disease.
                                 Infection   No infection   Total
Administering the drug              144          312         456
Without administering the drug      192           72         264
Total                               336          384         720

4. Out of 8,000 graduates in a town, 800 are female; out of 1,600 graduate employees, 120
are female. Use the χ² test to determine whether any distinction is made in appointment on
the basis of sex.
5. A certain drug was administered to 500 people out of a total of 800 included in the sample
to test its efficacy against typhoid. The results are given below:
           Typhoid   No typhoid   Total
Drug         200        300        500
No drug      280         20        300
Total        480        320        800
On the basis of these data, can it be concluded that the drug is effective in preventing
typhoid?
ANOVA - Analysis of Variance
Analysis of variance (ANOVA) is a statistical technique used for analyzing the
differences between the means of more than two samples. It is a parametric test of
hypothesis: it splits the total variation in the data into the variation between
groups and the variation within groups, and uses that split to test the equality of
two or more population means.
ANOVA was developed by the statistician and eugenicist Ronald Fisher. Although many
statisticians, including Fisher, worked on the development of the ANOVA model, it
became widely known after being included in Fisher's 1925 book “Statistical
Methods for Research Workers”. The ANOVA is based on the law of total variance,
where the observed variance in a particular variable is partitioned into components
attributable to different sources of variation. ANOVA provides an analytical study
for testing the differences among group means and thus generalizes the t-test
beyond two means. ANOVA uses F-tests to statistically test the equality of means.
For example, suppose five fertilizers are each applied to four plots of wheat, and the
yield of wheat on each plot is recorded. We may be interested in finding out whether
the effects of these fertilizers on the yields are significantly different or, in other
words, whether the samples have come from the same normal population. The answer to
this problem is provided by the technique of analysis of variance. Thus the basic
purpose of the analysis of variance is to test the homogeneity of several means.
One-way ANOVA
A one-way ANOVA only involves one factor or independent variable.

Sources of variation   Degrees of freedom   Sum of squares   Mean square              F ratio
Between samples        c − 1                SSC              MSC = SSC / (c − 1)      F = MSC / MSE
Within samples         c(r − 1)             SSE              MSE = SSE / (c(r − 1))
Total                  cr − 1               SST
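
As a rough sketch of how the table is filled in, the Python below computes each entry for c samples. The function name one_way_anova is an assumption, the data are the price figures from Problem 1 below (c = 3 cities, r = 4 shops each), and 4.26 is the standard tabulated value of F for (2, 9) degrees of freedom at the 5% level.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA table entries for a list of samples."""
    c = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]

    # SSC: variation between samples; SSE: variation within samples.
    ssc = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

    df_between = c - 1
    df_within = n_total - c  # equals c(r - 1) when every sample has r values
    msc = ssc / df_between
    mse = sse / df_within
    return {
        "SSC": ssc, "SSE": sse, "SST": ssc + sse,
        "df_between": df_between, "df_within": df_within,
        "MSC": msc, "MSE": mse, "F": msc / mse,
    }


# Prices in rupees for four shops in each of three cities (Problem 1).
prices = [
    [15, 7, 11, 13],   # Kanpur
    [14, 10, 10, 6],   # Lucknow
    [4, 10, 8, 8],     # Delhi
]
result = one_way_anova(prices)

CRITICAL_F_2_9 = 4.26  # tabulated F for (2, 9) df at alpha = 0.05
print(f"F = {result['F']:.4f}")
print("Reject H0" if result["F"] > CRITICAL_F_2_9 else "Do not reject H0")
```

Using n_total − c for the within-sample degrees of freedom keeps the sketch usable when the samples have unequal sizes.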
Problems:
1. To test the significance of variation in the retail prices of a commodity in three
principal cities, Kanpur, Lucknow, and Delhi, four shops were chosen at random
in each city, and the prices observed in rupees were as follows:
Kanpur 15 7 11 13
Lucknow 14 10 10 6
Delhi 4 10 8 8
Do the data indicate that the prices in the three cities are significantly different?

2. A dietitian wants to see if there is any difference in the effectiveness of three diets. He
selected a homogeneous group of 24 people and placed them into three subgroups, with
each subgroup trying a different diet plan. The weight losses, in kg, recorded for the
members of each group were as follows:
Keto Diet Intermittent Fasting Gluten Free Diet
4.0 3.6 6.5
3.8 5.2 7.2
3.7 2.8 5.9
6.2 3.0 5.5
5.6 3.8 6.8
4.2 5.0 7.7
3.9 8.0
5.5 8.2
4.2 7.0
i. State the null and alternative hypotheses
ii. Set up the ANOVA table for this problem
iii. What would you advise the dietitian about the effectiveness of the three diets?
Use α = 0.05

3. The lifetimes of three electric bulbs from each of four brands, measured in hundreds of
hours, are presented below. Test the hypothesis that the mean lifetimes of the four brands
of bulbs are the same using one-way analysis of variance.
BRANDS
A B C D
20 25 24 23
19 23 20 20
21 21 22 20
Two-way ANOVA
A two-way ANOVA is used to estimate how the mean of a quantitative variable changes
according to the levels of two categorical variables. Use a two-way ANOVA when you want
to know how two independent variables, in combination, affect a dependent variable.
Source of variation   Sum of squares   Degrees of freedom   Mean square                    F ratio
Between columns       SSC              c − 1                MSC = SSC / (c − 1)            F_C = MSC / MSE
Between rows          SSR              r − 1                MSR = SSR / (r − 1)            F_R = MSR / MSE
Residual error        SSE              (c − 1)(r − 1)       MSE = SSE / ((c − 1)(r − 1))
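
The entries of this table can be computed with a short Python sketch for the case of one observation per cell. The function name two_way_anova is an assumption, the data are the refrigerator sales from Problem 1 below (rows are months, columns are salesmen), and 4.76 and 5.14 are the standard tabulated values of F at the 5% level for (3, 6) and (2, 6) degrees of freedom.

```python
def two_way_anova(table):
    """Two-way ANOVA with one observation per cell (rows x columns)."""
    r = len(table)
    c = len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (r * c)  # correction factor T^2 / N

    sst = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    sse = sst - ssr - ssc  # residual error

    msc = ssc / (c - 1)
    msr = ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))
    return {
        "SSC": ssc, "SSR": ssr, "SSE": sse, "SST": sst,
        "MSC": msc, "MSR": msr, "MSE": mse,
        "F_C": msc / mse, "F_R": msr / mse,
    }


# Rows: the three months; columns: salesmen A, B, C, D (Problem 1).
sales = [
    [50, 40, 48, 39],
    [46, 48, 50, 45],
    [39, 44, 40, 39],
]
result = two_way_anova(sales)

CRITICAL_F_3_6 = 4.76  # columns (salesmen): (3, 6) df, alpha = 0.05
CRITICAL_F_2_6 = 5.14  # rows (months): (2, 6) df, alpha = 0.05
print(f"F_C = {result['F_C']:.4f}, significant: {result['F_C'] > CRITICAL_F_3_6}")
print(f"F_R = {result['F_R']:.4f}, significant: {result['F_R'] > CRITICAL_F_2_6}")
```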

Problems:
1. The following table gives the number of refrigerators sold by four salesmen in
three months: March, April, and May. Is there a significant difference in the
sales made by the four salesmen? Is there a significant difference in the sales
made during different months?
Month    Salesman
         A    B    C    D
March    50   40   48   39
April    46   48   50   45
May      39   44   40   39
2. The following table gives the number of units of production per day turned out by four
employees on four different types of machines:

Employee   Types of Machines
           M1   M2   M3   M4
E1         40   36   45   30
E2         38   42   50   41
E3         36   30   48   35
E4         46   47   52   44
Using analysis of variance, (a) test the hypothesis that the mean production is the same
for the four machines, and (b) test the hypothesis that the employees do not differ with
respect to mean productivity.
3. The following are the defective pieces produced by four operators working
in turn, on four different machines:
           Operator
Machine    I    II   III   IV
A          3    2    3     2
B          3    2    3     4
C          2    3    4     3
D          3    4    3     2
Perform analysis of variance at 5% level of significance to ascertain whether
variability in production is due to variability in operator’s performance or variability
in machine’s performance.

Non-Parametric Tests


1. Sign Test
The sign test in statistics is a non-parametric method used to determine if the
median of a dataset differs significantly from a specific value. It's particularly
useful when the data doesn't meet the assumptions of parametric tests like the
t-test, such as when the data is not normally distributed or when the sample
size is small.
The sign test is widely used in various fields, especially when dealing with
small sample sizes, ordinal data, or non-normally distributed data. It's a simple
and robust method for comparing medians or testing hypotheses without
making assumptions about the underlying distribution of the data.
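
A minimal sketch of the sign test in Python follows. The function name sign_test and the sample values are hypothetical, chosen only for illustration. Under the null hypothesis each non-tied observation is equally likely to fall above or below the hypothesized median, so the count of plus signs follows a binomial distribution with p = 0.5.

```python
from math import comb


def sign_test(values, hypothesized_median):
    """Two-sided exact sign test. Returns (plus, minus, p_value)."""
    plus = sum(1 for v in values if v > hypothesized_median)
    minus = sum(1 for v in values if v < hypothesized_median)
    n = plus + minus  # observations equal to the median are discarded
    k = min(plus, minus)

    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return plus, minus, p_value


# Hypothetical sample; H0: the population median is 10.
sample = [12, 15, 9, 14, 18, 11, 16, 17, 13, 19]
plus, minus, p_value = sign_test(sample, 10)
print(f"plus = {plus}, minus = {minus}, p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```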
2. Mann-Whitney U Test
The Mann-Whitney U test, also known as the Mann-Whitney-Wilcoxon test,
is a non-parametric statistical test used to compare two independent groups
when the dependent variable is ordinal or continuous but not normally
distributed. It's an alternative to the independent samples t-test, which
assumes normality and homogeneity of variance.
The Mann-Whitney U test is widely used in various fields, especially when
dealing with small sample sizes, ordinal data, or non-normally distributed
data. It's robust against violations of normality assumptions and provides a
way to compare groups without requiring assumptions about the underlying
distribution of the data.
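
The U statistic can be sketched in Python as below. The function names and the two small samples are hypothetical. The sketch pools the two groups, ranks the pooled values (tied values share the average of their ranks), and derives U from the rank sum of the first group; the smaller of U1 and U2 is then compared with a tabulated critical value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U1, U2, U) where U = min(U1, U2)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    r1 = sum(ranks[:n1])            # rank sum of the first group
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2, min(u1, u2)


# Hypothetical samples from two independent groups.
group_a = [3, 5, 8]
group_b = [1, 2, 4, 6]
u1, u2, u = mann_whitney_u(group_a, group_b)
print(f"U1 = {u1}, U2 = {u2}, U = {u}")
```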
3. Median Run Test
The median run test is a non-parametric statistical test used to determine
whether a data sequence appears to be random or not. It's particularly useful
for detecting systematic deviations from randomness, such as clustering or
periodicity, in a sequence of observations.
The median run test is relatively straightforward to apply and interpret. It's
commonly used in quality control, time series analysis, and various fields
where detecting patterns or anomalies in data sequences is important.
However, it's worth noting that the test has some limitations and may not
detect certain types of departures from randomness.
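
One way to sketch the median run test in Python is shown below. The function name and the sequence are hypothetical, and the sketch uses the usual large-sample normal approximation for the number of runs, which is only a rough guide for short sequences. It assumes the sequence contains values on both sides of the median.

```python
from math import sqrt
from statistics import median


def median_runs_test(sequence):
    """Return (runs, expected_runs, z) for a runs test about the median."""
    med = median(sequence)
    # True = above the median, False = below; values equal to it are dropped.
    signs = [x > med for x in sequence if x != med]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2

    # A new run starts wherever two neighbouring signs differ.
    runs = 1 + sum(1 for p, q in zip(signs, signs[1:]) if p != q)

    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    z = (runs - expected) / sqrt(variance)
    return runs, expected, z


# Hypothetical sequence that alternates around its median.
sequence = [5, 9, 3, 8, 2, 7, 1, 6, 4, 10]
runs, expected, z = median_runs_test(sequence)
print(f"runs = {runs}, expected = {expected}, z = {z:.3f}")
print("Not random at 5%" if abs(z) > 1.96 else "No evidence against randomness")
```

Too few runs suggest clustering, while too many runs suggest systematic alternation.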
4. Kolmogorov-Smirnov One-Sample Test
The Kolmogorov-Smirnov (KS) one-sample test is a non-parametric
statistical test used to determine whether a sample comes from a specific
distribution. It's particularly useful when you want to compare a sample
distribution to a theoretical distribution or when you want to test the
goodness-of-fit of a sample to a known distribution.
The KS one-sample test is commonly used in various fields, including quality
control, finance, and environmental science, to assess whether a sample
distribution conforms to a theoretical distribution. It's particularly valuable
when you have limited knowledge about the underlying distribution of your
data or when the distribution is suspected to deviate from a standard
parametric distribution.
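
The KS statistic D is the largest vertical distance between the sample's empirical distribution function and the theoretical cumulative distribution function. The Python below is a sketch under stated assumptions: the function names and the sample are hypothetical, and the theoretical distribution is taken to be uniform on [0, 1] purely for illustration.

```python
def ks_statistic(sample, cdf):
    """Return D, the maximum gap between the empirical CDF and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x,
        # so both sides of the jump are checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


# Hypothetical sample tested against the uniform distribution.
sample = [0.1, 0.4, 0.7]
d = ks_statistic(sample, uniform_cdf)
print(f"D = {d:.4f}")
```

The computed D is then compared with the tabulated critical value for the sample size and chosen significance level; a larger D indicates a poorer fit.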
