Understanding Chi Square Statistics

The document discusses the chi square statistic, which is used to investigate whether distributions of categorical variables differ from one another. It provides examples of how to calculate chi square statistics for 2x2 contingency tables and goodness of fit tests, and how to use chi square distribution tables to determine if differences are statistically significant. The chi square statistic compares observed counts to expected counts to test hypotheses about independence and goodness of fit.

Uploaded by

Noviana Dian Utami


11/4/2014

Chi Square Statistics

The Chi Square Statistic

Types of Data:
There are basically two types of random variables, and they yield two types of data: numerical and
categorical. A chi square (χ²) statistic is used to investigate whether distributions of categorical
variables differ from one another. Categorical variables yield data in categories, and
numerical variables yield data in numerical form. Responses to such questions as "What is your
major?" or "Do you own a car?" are categorical because they yield data such as "biology" or "no." In
contrast, responses to such questions as "How tall are you?" or "What is your G.P.A.?" are numerical.
Numerical data can be either discrete or continuous. The table below may help you see the differences
between these two types of variables.
Data Type                 Question Type                  Possible Responses
Categorical               What is your sex?              male or female
Numerical (discrete)      How many cars do you own?      two or three
Numerical (continuous)    How tall are you?              72 inches

Notice that discrete data arise from a counting process, while continuous data arise from a measuring
process.
The Chi Square statistic compares the tallies or counts of categorical responses between two (or more)
independent groups. (Note: Chi square tests can only be used on actual counts and not on
percentages, proportions, means, etc.)
2 x 2 Contingency Table
There are several types of chi square tests depending on the way the data was collected and the
hypothesis being tested. We'll begin with the simplest case: a 2 x 2 contingency table. If we set the 2 x
2 table to the general notation shown below in Table 1, using the letters a, b, c, and d to denote the
contents of the cells, then we would have the following table:
Table 1. General notation for a 2 x 2 contingency table.

                  Variable 1
Variable 2        Data type 1    Data type 2    Totals
Category 1        a              b              a + b
Category 2        c              d              c + d
Total             a + c          b + d          a + b + c + d = N

For a 2 x 2 contingency table the Chi Square statistic is calculated by the formula:

$$\chi^2 = \frac{N\,(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$


Note: notice that the four components of the denominator are the four totals from the table columns
and rows.
Suppose you conducted a drug trial on a group of animals and you hypothesized that the animals
receiving the drug would show increased heart rates compared to those that did not receive the drug.
You conduct the study and collect the following data:
Ho: The proportion of animals whose heart rate increased is independent of drug treatment.
Ha: The proportion of animals whose heart rate increased is associated with drug treatment.

Table 2. Hypothetical drug trial results.

               Heart Rate Increased    No Heart Rate Increase    Total
Treated        36                      14                        50
Not treated    30                      25                        55
Total          66                      39                        105

Applying the formula above we get:

$$\chi^2 = \frac{105\,[(36)(25) - (14)(30)]^2}{(50)(55)(39)(66)} = 3.418$$
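The arithmetic above can be checked with a few lines of Python. This is a minimal sketch of the 2 x 2 shortcut formula, assuming the cell layout given in the docstring; the function name `chi_square_2x2` is an illustrative choice, not something defined on the original page.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a 2 x 2 table of counts laid out as

           a  b
           c  d

    using the shortcut N(ad - bc)^2 divided by the product of the
    two row totals and the two column totals.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Drug trial counts: treated (36 increased, 14 not),
# not treated (30 increased, 25 not).
chi2 = chi_square_2x2(36, 14, 30, 25)
print(round(chi2, 3))  # 3.418
```

Because the formula is symmetric in rows and columns, swapping which variable is written across the top does not change the result.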
Before we can proceed we need to know how many degrees of freedom we have. When a comparison
is made between one sample and another, a simple rule is that the degrees of freedom equal (number
of columns minus one) x (number of rows minus one), not counting the totals for rows or columns. For
our data this gives (2-1) x (2-1) = 1.
We now have our chi square statistic (χ² = 3.418), our predetermined alpha level of significance
(0.05), and our degrees of freedom (df = 1). Entering the Chi square distribution table with 1 degree of
freedom and reading along the row we find our value of χ² (3.418) lies between 2.706 and 3.841. The
corresponding probability is between the 0.10 and 0.05 probability levels. That means that the p-value
is above 0.05 (it is actually 0.065). Since a p-value of 0.065 is greater than the conventionally accepted
significance level of 0.05 (i.e. p > 0.05) we fail to reject the null hypothesis. In other words, there is
no statistically significant difference between the treated and untreated groups in the proportion of
animals whose heart rate increased.
What would happen if the number of control animals whose heart rate increased dropped to 29 instead
of 30 and, consequently, the number of controls whose heart rate did not increase changed from 25 to
26? Try it. Notice that the new χ² value is 4.125 and this value exceeds the table value of 3.841 (at 1
degree of freedom and an alpha level of 0.05). This means that p < 0.05 (it is now 0.04) and we reject
the null hypothesis in favor of the alternative hypothesis: the proportion of animals whose heart rate
increased differs between the treatment groups. When p < 0.05 we generally refer to this as a
significant difference.
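The distribution table only brackets the p-value, but the figures quoted above (0.065 and 0.04) can be reproduced with the Python standard library. The sketch below relies on the fact that a chi square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). The helper name `p_value_df1` is hypothetical, and the shortcut is valid for df = 1 only.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability P(X >= chi2) for a chi square variable
    with exactly 1 degree of freedom (not valid for other df)."""
    return math.erfc(math.sqrt(chi2 / 2.0))


p_original = p_value_df1(3.418)  # 30 of the 55 controls increased
p_modified = p_value_df1(4.125)  # 29 of the 55 controls increased

# The first value falls between 0.05 and 0.10, the second below 0.05.
print(p_original, p_original < 0.05)
print(p_modified, p_modified < 0.05)
```

Moving a single animal from one cell to another is enough to carry the result across the 0.05 threshold, which is a useful reminder of how sharp that conventional cut-off is.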
Table 3. Chi Square distribution table.

             probability level (alpha)
Df    0.5      0.10     0.05      0.02      0.01      0.001
1     0.455    2.706    3.841     5.412     6.635     10.827
2     1.386    4.605    5.991     7.824     9.210     13.815
3     2.366    6.251    7.815     9.837     11.345    16.268
4     3.357    7.779    9.488     11.668    13.277    18.465
5     4.351    9.236    11.070    13.388    15.086    20.517
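Reading the table by hand amounts to finding the two neighbouring critical values that enclose the computed statistic. A small sketch of that lookup follows, with the Table 3 values typed in as data; the names `ALPHAS`, `CRITICAL`, and `bracket_p` are assumptions made for illustration.

```python
from bisect import bisect_right

# Column headings (alpha) and rows (critical values by df) of Table 3.
ALPHAS = [0.5, 0.10, 0.05, 0.02, 0.01, 0.001]
CRITICAL = {
    1: [0.455, 2.706, 3.841, 5.412, 6.635, 10.827],
    2: [1.386, 4.605, 5.991, 7.824, 9.210, 13.815],
    3: [2.366, 6.251, 7.815, 9.837, 11.345, 16.268],
    4: [3.357, 7.779, 9.488, 11.668, 13.277, 18.465],
    5: [4.351, 9.236, 11.070, 13.388, 15.086, 20.517],
}


def bracket_p(chi2, df):
    """Return (lower, upper) bounds on the p-value as read off Table 3."""
    row = CRITICAL[df]
    i = bisect_right(row, chi2)  # how many critical values are <= chi2
    upper = ALPHAS[i - 1] if i > 0 else 1.0
    lower = ALPHAS[i] if i < len(row) else 0.0
    return lower, upper


print(bracket_p(3.418, 1))  # (0.05, 0.1)
```

A statistic larger than every entry in its row returns a lower bound of 0.0, which simply means the p-value is below the smallest tabulated level (0.001).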

To make the chi square calculations a bit easier, plug your observed and expected values into the
following applet. Click on the cell and then enter the value. Click the compute button on the lower
right corner to see the chi square value printed in the lower left hand corner.

Chi Square Goodness of Fit (One Sample Test)

This test allows us to compare a collection of categorical data with some theoretical expected
distribution. This test is often used in genetics to compare the results of a cross with the theoretical
distribution based on genetic theory. Suppose you performed a simple monohybrid cross between two
individuals that were heterozygous for the trait of interest.
Aa x Aa
The results of your cross are shown in Table 4.

Table 4. Results of a monohybrid cross between two heterozygotes for the 'a' gene.

          A     a     Totals
A         10    42    52
a         33    15    48
Totals    43    57    100

The phenotypic ratio is 85 of the A-type and 15 of the a-type (homozygous recessive). In a monohybrid
cross between two heterozygotes, however, we would have predicted a 3:1 ratio of phenotypes. In
other words, we would have expected to get 75 A-type and 25 a-type. Are our results different?


Calculate the chi square statistic χ² by completing the following steps:

1. For each observed number in the table subtract the corresponding expected number (O - E).
2. Square the difference [ (O - E)² ].
3. Divide the squares obtained for each cell in the table by the expected number for that cell [ (O - E)² / E ].
4. Sum all the values for (O - E)² / E. This is the chi square statistic.
For our example, the calculation would be:

           Observed    Expected    (O - E)    (O - E)²    (O - E)² / E
A-type     85          75          10         100         1.33
a-type     15          25          -10        100         4.0
Total      100         100                                5.33

χ² = 5.33
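The four steps translate directly into code. The sketch below applies them to the observed 85:15 split and the 75:25 split expected under a 3:1 ratio; the function name `chi_square_gof` is an illustrative assumption.

```python
def chi_square_gof(observed, expected):
    """Goodness-of-fit statistic: the sum of (O - E)^2 / E over categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [85, 15]                        # A-type, a-type offspring
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]  # 3:1 ratio gives 75 and 25

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1                     # categories minus one

print(round(chi2, 2), df)  # 5.33 1
```

The expected counts are derived from the observed total rather than typed in, so they always sum to the same total as the observed counts, which the test requires.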

We now have our chi square statistic (χ² = 5.33), our predetermined alpha level of significance
(0.05), and our degrees of freedom (df = 1). Entering the Chi square distribution table with 1 degree of
freedom and reading along the row we find our value of χ² (5.33) lies between 3.841 and 5.412. The
corresponding probability is 0.02 < P < 0.05. This is smaller than the conventionally accepted
significance level of 0.05 or 5%, so the null hypothesis that the two distributions are the same is
rejected. In other words, when the computed χ² statistic exceeds the critical value in the table for a
0.05 probability level, then we can reject the null hypothesis of equal distributions. Since our χ²
statistic (5.33) exceeded the critical value for the 0.05 probability level (3.841) we can reject the null
hypothesis that the observed values of our cross are the same as the theoretical distribution of a 3:1
ratio.
Table 3. Chi Square distribution table.

             probability level (alpha)
Df    0.5      0.10     0.05      0.02      0.01      0.001
1     0.455    2.706    3.841     5.412     6.635     10.827
2     1.386    4.605    5.991     7.824     9.210     13.815
3     2.366    6.251    7.815     9.837     11.345    16.268
4     3.357    7.779    9.488     11.668    13.277    18.465
5     4.351    9.236    11.070    13.388    15.086    20.517

To put this into context, it means that we do not have a 3:1 ratio of A_ to aa offspring.
To make the chi square calculations a bit easier, plug your observed and expected values into the
following java applet.
Click on the cell and then enter the value. Click the compute button on the lower right corner to see
the chi square value printed in the lower left hand corner.

Chi Square Test of Independence

For a contingency table that has r rows and c columns, the chi square test can be thought of as a test of
independence. In a test of independence the null and alternative hypotheses are:
Ho: The two categorical variables are independent.
Ha: The two categorical variables are related.
We can use the equation
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
Here $f_o$ denotes the frequency of the observed data and $f_e$ is the frequency of the expected values. The
general table would look something like the one below:
                 Category I    Category II    Category III    Row Totals
Sample A         a             b              c               a+b+c
Sample B         d             e              f               d+e+f
Sample C         g             h              i               g+h+i
Column Totals    a+d+g         b+e+h          c+f+i           a+b+c+d+e+f+g+h+i=N

Now we need to calculate the expected value for each cell in the table. The expected value is
the row total times the column total divided by the grand total (N). For example, for cell a the
expected value would be (a+b+c)(a+d+g)/N.
Once the expected values have been calculated for each cell, we can use the same procedure as
before for a simple 2 x 2 table.
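As a sketch of the expected-value rule, the code below computes row total x column total / N for every cell of a table of counts and then sums (O - E)² / E. The function names are illustrative assumptions. The drug trial counts (36, 14, 30, 25) are reused so the result can be compared with the 3.418 obtained from the 2 x 2 shortcut formula.

```python
def expected_counts(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (O - E)^2 / E over every cell of an r x c table of counts."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


drug_trial = [[36, 14],
              [30, 25]]

print(round(chi_square(drug_trial), 3))  # 3.418
```

The general procedure and the shortcut agree on a 2 x 2 table, and the same two functions work unchanged for any number of rows and columns.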
Observed    Expected    |O - E|    (O - E)²    (O - E)² / E

Suppose you have the following categorical data set.

Table . Incidence of three types of malaria in three tropical regions.

             Asia    Africa    South America    Totals
Malaria A    31      14        45               90
Malaria B    2       5         53               60
Malaria C    53      45        2                100
Totals       86      64        100              250

We could now set up the following table:

Observed    Expected    |O - E|    (O - E)²    (O - E)² / E
31          30.96       0.04       0.0016      0.0000516
14          23.04       9.04       81.72       3.546
45          36.00       9.00       81.00       2.25
2           20.64       18.64      347.45      16.83
5           15.36       10.36      107.33      6.99
53          24.00       29.00      841.00      35.04
53          34.40       18.60      345.96      10.06
45          25.60       19.40      376.36      14.70
2           40.00       38.00      1444.00     36.10

Chi Square = 125.516

Degrees of Freedom = (c - 1)(r - 1) = 2(2) = 4
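The whole calculation can be reproduced in a few lines. This sketch assumes the nine observed counts listed above form a 3 x 3 table read three at a time; the statistic is the same whichever variable is placed on the rows, so that assumption does not affect the result. The variable names are illustrative.

```python
# Observed counts, grouped three at a time in the order listed above.
observed = [
    [31, 14, 45],
    [2, 5, 53],
    [53, 45, 2],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_05 = 9.488  # chi square table entry for df = 4, alpha = 0.05

print(round(chi2, 2), df, chi2 > critical_05)  # 125.52 4 True
```

Carrying full precision gives about 125.52 rather than 125.516; the small difference comes from summing terms that were rounded in the hand calculation, and it does not change the conclusion.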
Table 3. Chi Square distribution table.

             probability level (alpha)
Df    0.5      0.10     0.05      0.02      0.01      0.001
1     0.455    2.706    3.841     5.412     6.635     10.827
2     1.386    4.605    5.991     7.824     9.210     13.815
3     2.366    6.251    7.815     9.837     11.345    16.268
4     3.357    7.779    9.488     11.668    13.277    18.465
5     4.351    9.236    11.070    13.388    15.086    20.517

Reject Ho because 125.516 is greater than 9.488 (for alpha = 0.05)

Thus, we would reject the null hypothesis that there is no relationship between location and type of
malaria. Our data tell us there is a relationship between type of malaria and location, but that's all it
says.
Follow the link below to access a java-based program for calculating Chi Square statistics for
contingency tables of up to 9 rows by 9 columns. Enter the number of rows and columns in the spaces
provided on the page and click the submit button. A new form will appear asking you to enter your
actual data into the cells of the contingency table. When finished entering your data, click the
"calculate now" button to see the results of your Chi Square analysis. You may wish to print this last
page to keep as a record.
Chi Square,
This page was created as part of the Mathbeans Project. The java applets were created by David Eck
and modified by Jim Ryan. The Mathbeans Project is funded by a grant from the National Science
Foundation DUE-9950473.
