Introduction to Statistical Methods in Education (EdPM – 1022) Unit Four Note
UNIT FOUR
4. MEASURES OF CENTRAL TENDENCY
4.1. Introduction
In Unit Three, the discussion focused on ways of organising and presenting data in the form
of frequency distributions, tables, graphs and charts. However, organising and presenting
data does not by itself give a clear picture of the data. The data must also be
described and analysed to support interpretation. In this unit, we introduce measures
that describe certain features of data. One such group of measures is known as measures
of central tendency, or averages.
A measure of central tendency is a single value (score) that represents or summarises
a large quantity of numerical data. It is a value located at the centre of a distribution (set of
data), around which the majority of values concentrate. For this reason, such measures are also called
measures of location. There are three common averages: mean, median and mode.
Importance of Averages
1) They provide a summary or condensed value of a large quantity of numerical
data. For example, it is difficult to remember the individual height of all college
students but quite simple to remember their average height.
2) They afford a basis for comparison of similar groups of data. For example, it is
very difficult to compare the height of each Ethiopian with each Chinese, but it is
quite simple to compare the mean height of the citizens of these two countries.
4.2. Mean
Definition: The arithmetic mean, or simply the mean, of a set of values (scores) is the ratio of the
sum of all scores to the number of scores in the set.
$\bar{X} = \dfrac{\text{Sum of all scores}}{\text{Number of scores}}$
4.2.1. Population and Sample Mean for Ungrouped Data
The population mean is denoted by the symbol $\mu$ (the Greek letter "mu") and is given by:
$\mu = \dfrac{\sum_{i=1}^{N} X_i}{N}$
The sample mean is denoted by the symbol $\bar{X}$ (read as "x bar") and is given by:
$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$
Note that a value such as $\bar{X}$, computed from a sample, is a statistic, whereas $\mu$, computed from the whole population, is a parameter.
4.2.2. Computing the Mean for Ungrouped Data
Example - 1: Compute the mean of sample values 5, 4, 9, 12, 10.
1 Department of Psychology, Dilla University March, 2021
Solution: $\bar{X} = \dfrac{5 + 4 + 9 + 12 + 10}{5} = \dfrac{40}{5} = 8$
Example – 2: Find the mean of scores: 5, 5, 5, 4, 4, 3, 3, 3, 3
Solution: $\bar{X} = \dfrac{5 + 5 + 5 + 4 + 4 + 3 + 3 + 3 + 3}{9} = \dfrac{(5 \times 3) + (4 \times 2) + (3 \times 4)}{9} = \dfrac{35}{9} \approx 3.9$
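The two calculations above can be checked with a short Python sketch. Python is not part of the original notes, and the function names `mean` and `mean_from_counts` are illustrative choices, not a prescribed method.

```python
from collections import Counter


def mean(scores):
    """Arithmetic mean: sum of all scores divided by the number of scores."""
    return sum(scores) / len(scores)


def mean_from_counts(scores):
    """Same mean, computed by weighting each distinct score by its frequency."""
    counts = Counter(scores)
    weighted_sum = sum(value * freq for value, freq in counts.items())
    return weighted_sum / sum(counts.values())


example_1 = [5, 4, 9, 12, 10]
example_2 = [5, 5, 5, 4, 4, 3, 3, 3, 3]

print(mean(example_1))                        # 8.0
print(round(mean(example_2), 1))              # 3.9
print(round(mean_from_counts(example_2), 1))  # 3.9
```

The second function mirrors the shortcut used in Example 2, where repeated scores are multiplied by the number of times they occur.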
4.2.3. Computing the Mean for Grouped Data
In a continuous frequency distribution, we use the midpoints as representatives of the class
intervals and compute the mean using the following formula:
$\bar{X} = \dfrac{\sum fm}{\sum f}$
Where, f = frequency of a class
m = midpoint of a class
Example -3: The following frequency table gives the height (in inches) of one hundred students in
a college.
Classes      Frequency (f)
60 - < 62    5
62 - < 64    18
64 - < 66    42
66 - < 68    20
68 - < 70    8
70 - < 72    7
Total        100
Calculate the arithmetic mean.
Solution:
Classes      Frequency (f)   Midpoint (m)   fm
60 - < 62    5               61             305
62 - < 64    18              63             1134
64 - < 66    42              65             2730
66 - < 68    20              67             1340
68 - < 70    8               69             552
70 - < 72    7               71             497
Total        100                            6558
Thus, $\bar{X} = \dfrac{\sum fm}{\sum f} = \dfrac{6558}{100} = 65.58$
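As a check on the table, the grouped mean can be computed with a small Python sketch. The function name `grouped_mean` and the (lower limit, upper limit, frequency) representation of each class are assumptions made for this illustration.

```python
def grouped_mean(classes):
    """Mean of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    Each class is represented by its midpoint m = (lower + upper) / 2.
    """
    total_fm = 0.0
    total_f = 0
    for lower, upper, f in classes:
        m = (lower + upper) / 2  # midpoint of the class
        total_fm += f * m
        total_f += f
    return total_fm / total_f


# Height data from Example 3 (inches)
heights = [
    (60, 62, 5),
    (62, 64, 18),
    (64, 66, 42),
    (66, 68, 20),
    (68, 70, 8),
    (70, 72, 7),
]

print(grouped_mean(heights))  # 65.58
```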
4.2.4. Properties of Arithmetic Mean
Property -1: The sum of the deviations about the mean is zero. Symbolically, $\sum (X - \bar{X}) = 0$.
A deviation about the mean is the difference between a raw score and the mean, $X - \bar{X}$.
Example -4: Consider the values 1, 2, 3, 4, 5. Obviously, the mean of these numbers is 3.
X      Deviation ($X - \bar{X}$)
1      1 – 3 = -2
2      2 – 3 = -1
3      3 – 3 = 0
4      4 – 3 = 1
5      5 – 3 = 2
Sum    $\sum (X - \bar{X}) = 0$
Property -2:
a) The mean of the new distribution created by adding a constant to every value in
the original distribution is equal to the mean of the original distribution plus the
constant. Symbolically, $\bar{X}_{X + C} = \bar{X} + C$.
b) The mean of the new distribution created by multiplying every value in a
distribution by a constant is equal to the mean of the original distribution
multiplied by the constant. Symbolically, $\bar{X}_{X \cdot C} = \bar{X} \cdot C$.
Example -5: If you add the constant 3 to every value in example -4, the new distribution is 4,
5, 6, 7, 8. You can check that the mean of this new distribution is 6 which is the mean of the
original distribution plus three.
Example -6: If you multiply every value in example -4 by a constant 4, the new distribution is
4, 8, 12, 16, 20. You can check that the mean of this new distribution is 12 which is the mean
of the original distribution times four.
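Properties 1 and 2 can be verified numerically with the values of Example 4. The following is a minimal Python sketch; the helper name `mean` is an illustrative assumption.

```python
def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


scores = [1, 2, 3, 4, 5]
x_bar = mean(scores)  # 3.0

# Property 1: deviations about the mean sum to zero
deviations = [x - x_bar for x in scores]
print(sum(deviations))  # 0.0

# Property 2a: adding a constant C = 3 shifts the mean by 3
shifted = [x + 3 for x in scores]
print(mean(shifted))  # 6.0

# Property 2b: multiplying by a constant C = 4 multiplies the mean by 4
scaled = [x * 4 for x in scores]
print(mean(scaled))  # 12.0
```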
Property – 3: The sum of the squared deviations about the mean is a minimum. This means
the sum of the squared deviations about the mean is smaller than the sum of the squared
deviations about any number other than the mean. Symbolically, $\sum (X - \bar{X})^2$ is the smallest.
Example -7: Consider example -4 above and find the sum of the squared deviations about the
mean.
X               $X - \bar{X}$    $(X - \bar{X})^2$
1               1 – 3 = -2       4
2               2 – 3 = -1       1
3               3 – 3 = 0        0
4               4 – 3 = 1        1
5               5 – 3 = 2        4
$\sum X = 15$                    $\sum (X - \bar{X})^2 = 10$
From the above table, you can see that the sum of the squared deviations about the mean is
10. According to this property, if we take any number other than the mean and repeat the
above process, we obtain a sum of squared deviations greater than 10. The value 10 is therefore the
smallest possible sum.
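Property 3 can be illustrated by computing the sum of squared deviations about several candidate centres. In this Python sketch, the function name `sum_squared_deviations` and the trial centres 2, 2.5, 3.5 and 4 are arbitrary choices for illustration.

```python
def sum_squared_deviations(scores, centre):
    """Sum of squared deviations of the scores about a chosen centre."""
    return sum((x - centre) ** 2 for x in scores)


scores = [1, 2, 3, 4, 5]
x_bar = sum(scores) / len(scores)  # 3.0

for centre in (2, 2.5, x_bar, 3.5, 4):
    print(centre, sum_squared_deviations(scores, centre))
# 2 15
# 2.5 11.25
# 3.0 10.0
# 3.5 11.25
# 4 15
```

Every centre other than the mean gives a sum larger than 10, in line with the property. A handful of trial values is an illustration, not a proof.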
The fourth property of the mean is presented in the next section, which closes this
subtopic on the mean.
4.2.5. The Combined (Weighted) Mean
Property - 4: The mean of a collection of scores (values) formed by combining K subgroups
of scores is obtained by multiplying each subgroup mean by the number of scores in
that subgroup, summing these products, and dividing the sum by the total number of scores in all
subgroups.
Combined Mean = $\dfrac{\sum n_j \bar{X}_j}{\sum n_j}$
Where, K = number of subgroups
$n_j$ = number of scores in subgroup j
$\bar{X}_j$ = mean of subgroup j
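If $n_1$ scores have mean $\bar{X}_1$, $n_2$ scores have mean $\bar{X}_2$, and so on up to $n_K$ scores with mean $\bar{X}_K$, then the
combined mean of all scores is given by:
Combined mean = $\dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3 + \dots + n_K\bar{X}_K}{n_1 + n_2 + n_3 + \dots + n_K} = \dfrac{\sum n_j \bar{X}_j}{\sum n_j}$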
Example– 8: The mean of one class of 50 students is 30 and the mean of another class of
100 students is 40. What is the mean of all 150 students?
Solution: Given: $n_1 = 50$, $\bar{X}_1 = 30$, $n_2 = 100$, $\bar{X}_2 = 40$
Combined mean $= \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2} = \dfrac{(50 \times 30) + (100 \times 40)}{50 + 100} = \dfrac{5500}{150} \approx 36.7$
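The combined mean of Example 8 can be reproduced with the following Python sketch. The function name `combined_mean` and the (size, mean) pair representation of each subgroup are assumptions made for this illustration.

```python
def combined_mean(subgroups):
    """Combined (weighted) mean of K subgroups.

    subgroups: list of (n_j, mean_j) pairs, where n_j is the number of
    scores in subgroup j and mean_j is that subgroup's mean.
    """
    total_n = sum(n for n, _ in subgroups)
    weighted_sum = sum(n * m for n, m in subgroups)
    return weighted_sum / total_n


# Example 8: 50 students with mean 30, 100 students with mean 40
print(round(combined_mean([(50, 30), (100, 40)]), 1))  # 36.7
```

Note that the simple average of the two subgroup means, (30 + 40) / 2 = 35, would be wrong here because the classes differ in size.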
4.3. The Median
Definition:
The median is the middle score (value) in a ranked data set (distribution).
Remark:
a) To compute the median, the scores must be arranged first in some logical order
(ascending or descending order).
b) The median is the score which divides a given distribution into two equal halves (50%
of scores above it and 50% below it).
4.3.1. Computing the Median (Md) for Ungrouped Data
a) If the number of scores n is odd, the median is the $\left(\dfrac{n + 1}{2}\right)$th term; there is only one
middle score, and it is the median.
b) If the number of scores n is even, there are two middle terms, the $\left(\dfrac{n}{2}\right)$th term
and the $\left(\dfrac{n}{2} + 1\right)$th term, and the median is the mean of these two terms.
Example - 9: Find the median of the observations (scores): 1, 2, 5, 12, 9, 10, 3.
Solution: Ascending order: 1, 2, 3, 5, 9, 10, 12.
Since the number of scores (n = 7) is odd, the median is the (7 + 1)/2 = 4th score, which is 5.
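Example -10: Find the median of the values: 2, 14, 5, 12, 9, 10, 3, 19
Solution: Ascending order: 2, 3, 5, 9, 10, 12, 14, 19.
Since the number of scores (n = 8) is even, the median is the mean of the 4th and 5th scores:
Md $= \dfrac{9 + 10}{2} = 9.5$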
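The odd and even rules for ungrouped data can be sketched in Python as follows. The function name `median` is an illustrative choice, and the even-sized data set [4, 8, 1, 6] is hypothetical (it does not come from these notes).

```python
def median(scores):
    """Median of ungrouped data: sort, then take the middle term(s)."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)th term, index `middle` when counting from 0
        return ordered[middle]
    # even n: mean of the (n / 2)th and (n / 2 + 1)th terms
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([1, 2, 5, 12, 9, 10, 3]))  # 5   (Example 9, n = 7)
print(median([4, 8, 1, 6]))             # 5.0 (hypothetical data, n = 4)
```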
4.3.2. Computing the Median for Grouped Data
For grouped continuous data, median (Md) is computed by the following formula:
$Md = l_1 + \dfrac{\frac{1}{2}n - c}{f} \times h$
Where, $l_1$ = the lower class boundary of the median class
f = the frequency of the median class
h = the class width (size)
n = $\sum f$, the total frequency
c = the cumulative frequency of the class preceding the median class.
Steps to compute the median for continuous frequency distribution:
1) Prepare the less-than cumulative frequency column.
2) Find ½n.
3) Locate the first cumulative frequency that is greater than ½n.
4) Identify the median class, which is the class with that cumulative frequency.
5) Compute the median using the formula.
Example -11: Find the median of the following data.
Class interval   f
30 - < 40        2
40 - < 50        18
50 - < 60        24
60 - < 70        20
70 - < 80        8
80 - < 90        3
Solution:
Class interval   f     Less-than cumulative frequency
30 - < 40        2     2
40 - < 50        18    20
50 - < 60        24    44
60 - < 70        20    64
70 - < 80        8     72
80 - < 90        3     75
Total            75
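n = $\sum f$ = 75, therefore ½n = ½ × 75 = 37.5.
The first cumulative frequency greater than 37.5 is 44, so the median class is 50 - < 60.
Because the classes are continuous (50 - < 60), the lower class boundary is $l_1 = 50$; also h = 10, f = 24 and c = 20.
Thus, $Md = l_1 + \dfrac{\frac{1}{2}n - c}{f} \times h = 50 + \dfrac{37.5 - 20}{24} \times 10 = 50 + \dfrac{175}{24} \approx 50 + 7.3 = 57.3$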
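The five steps for the grouped median can be sketched in Python using the data of Example 11. Two assumptions are made here and should be read as such: the lower boundary of a class written 50 - < 60 is taken to be 50, and ½n is kept unrounded at 37.5. The function name `grouped_median` is illustrative.

```python
def grouped_median(classes):
    """Median of a continuous frequency distribution.

    classes: list of (lower_boundary, upper_boundary, frequency) tuples
    in ascending order. Implements Md = l1 + ((n/2 - c) / f) * h.
    """
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half_n = n / 2
    cumulative = 0  # c: cumulative frequency before the current class
    for lower, upper, f in classes:
        if cumulative + f >= half_n:
            # this is the median class
            h = upper - lower
            return lower + (half_n - cumulative) / f * h
        cumulative += f


# Example 11 data, with boundaries taken directly from "30 - < 40" etc.
data = [
    (30, 40, 2),
    (40, 50, 18),
    (50, 60, 24),
    (60, 70, 20),
    (70, 80, 8),
    (80, 90, 3),
]

print(round(grouped_median(data), 2))  # 57.29
```

Under these assumptions the median is about 57.3. Using a boundary of 49.5 or rounding ½n up to 38 would shift the result slightly.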
4.4. The Mode (Mo)
Definition: The mode is the most frequent value in a given data set (distribution). It is the
value with the highest frequency.
4.4.1. Finding the Mode for Ungrouped Data
Example -12: Find the mode of the following scores: 3, 5, 7, 7, 7, 7, 8, 9, 9, 10, 10, 10.
Solution: The score 7 occurs the maximum number of times. Therefore, the mode is 7.
Remark:
1) A distribution may have no mode, one mode, two modes or more than two modes.
2) A data set with each score occurring only once has no mode. Example {26, 150,
60,13}.
3) Unimodal: A data set with only one mode. Example {2, 3, 4, 2, 5, 2}, whose mode is 2.
4) Bimodal: A data set with exactly two modes. Example {2, 3, 4, 2, 3, 4, 5, 2, 3}, whose modes are 2 and 3.
5) Multimodal: A data set with more than two modes.
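The cases in the remark above can be checked with a short Python sketch. The function name `modes` is an illustrative assumption; following remark 2, it reports no mode when every score occurs only once.

```python
from collections import Counter


def modes(scores):
    """Return the list of modes (possibly empty) of ungrouped data."""
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        return []  # every score occurs once: no mode
    return sorted(value for value, freq in counts.items() if freq == highest)


print(modes([3, 5, 7, 7, 7, 7, 8, 9, 9, 10, 10, 10]))  # [7]       Example 12
print(modes([26, 150, 60, 13]))                         # []        no mode
print(modes([2, 3, 4, 2, 5, 2]))                        # [2]       unimodal
print(modes([2, 3, 4, 2, 3, 4, 5, 2, 3]))               # [2, 3]    bimodal
```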
4.4.2. Computing the Mode for Grouped Data
For grouped data, the mode is computed by the following formula:
$Mo = B_l + \dfrac{d_1}{d_1 + d_2} \times h$
Where, $B_l$ = lower class boundary of the modal class
h = class width (size)
$d_1$ = the difference between the modal class frequency and the frequency of the pre-modal class
$d_2$ = the difference between the modal class frequency and the frequency of the post-modal class
Example – 13: Compute the mode for the following data:
Class intervals   f
91 – 100          10
101 – 110         37
111 – 120         65
121 – 130         80
131 – 140         51
141 – 150         35
151 – 160         18
161 – 170         4
Solution: The modal class is 121 – 130
$d_1$ = 80 – 65 = 15
$d_2$ = 80 – 51 = 29
$B_l$ = 120.5
h = 10
$Mo = B_l + \dfrac{d_1}{d_1 + d_2} \times h = 120.5 + \dfrac{15}{15 + 29} \times 10$
$= 120.5 + \dfrac{15}{44} \times 10$
$\approx 120.5 + 3.4$
$= 123.9$
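The grouped mode of Example 13 can be reproduced with the Python sketch below. The function name `grouped_mode` is illustrative. Class boundaries are entered explicitly (for example 120.5 to 130.5 for the class 121 – 130), and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch, not something stated in the notes.

```python
def grouped_mode(classes):
    """Mode of grouped data: Mo = Bl + d1 / (d1 + d2) * h.

    classes: list of (lower_boundary, upper_boundary, frequency) tuples
    in ascending order. The first class with the highest frequency is
    taken as the modal class.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))
    lower, upper, f_modal = classes[i]
    # Assumption: a missing pre- or post-modal class counts as frequency 0
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    h = upper - lower
    return lower + d1 / (d1 + d2) * h


# Example 13, written with class boundaries instead of class limits
data = [
    (90.5, 100.5, 10),
    (100.5, 110.5, 37),
    (110.5, 120.5, 65),
    (120.5, 130.5, 80),
    (130.5, 140.5, 51),
    (140.5, 150.5, 35),
    (150.5, 160.5, 18),
    (160.5, 170.5, 4),
]

print(round(grouped_mode(data), 1))  # 123.9
```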
4.5. Relationship between the Mean, Median and Mode
We cannot conclude that one of the three measures of central tendency is the best overall.
Each of them may be better in different situations.
The mean has the advantage that its calculation includes all values of a distribution.
The median is preferable when a data set includes extreme values.
The mode is simple to locate but is not much used in practical situations.
The relationship among the mean, median and mode, described below, can
give us some idea about the shape of a frequency distribution.
1) In a unimodal symmetric distribution (for example, the normal distribution), the mean, median and mode are equal:
Mean = Median = Mode
2) A) In a positively skewed distribution (skewed to the right), the mean is typically larger than the median,
which is larger than the mode:
Mode < Median < Mean
B) In a negatively skewed distribution (skewed to the left), the mean is typically smaller than the
median, which is smaller than the mode:
Mean < Median < Mode
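The ordering of the three measures in skewed data can be illustrated with Python's standard `statistics` module. Both data sets below are hypothetical and were made up for this sketch; the helper name `summarise` is also an assumption.

```python
import statistics


def summarise(scores):
    """Return (mean, median, mode) using the standard library."""
    return (
        statistics.mean(scores),
        statistics.median(scores),
        statistics.mode(scores),
    )


# Hypothetical right-skewed scores: a few large values stretch the right tail
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]
# Mirror image of the same scores gives a left-skewed set
left_skewed = [17 - x for x in right_skewed]

print(summarise(right_skewed))  # (5.4, 4.0, 3)    Mode < Median < Mean
print(summarise(left_skewed))   # (11.6, 13.0, 14) Mean < Median < Mode
```

In both cases the mean is pulled furthest towards the long tail, which is why the median is often preferred for skewed data.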
4.6. Factors in Selecting a Measure of Central Tendency
What factors should we consider when selecting the best measure of central tendency to
report in a particular situation? The two important factors are the following:
1) The scale of measurement of a variable and
2) The shape of the distribution of a variable.
1) Scale of Measurement
a) If the scale is nominal, the mode is the only measure that can be reported.
b) For an ordinal scale, either the median or the mode is appropriate.
c) For variables measured on an interval or ratio scale (continuous scale), all three
measures, that is, the mean, median and mode, are appropriate.
2) Shape of Distribution
a) In a unimodal symmetric distribution (for example, the normal distribution), we can report the mean, median or
mode.
b) In a skewed distribution, the median is usually the best measure to report.
Summary of Measures of Central Tendency
Types of Measures of Central Tendency
             Mean                       Median                     Mode
Definition   The ratio of the sum of    The middle score in a      The score occurring with
             all scores to the total    ranked data set            the highest frequency
             number of scores           (distribution)
Used with    Interval or ratio data     Ordinal, interval or       Nominal, ordinal,
                                        ratio data                 interval or ratio data
Caution      Not for use with           Best to use with           Not a reliable measure of
             distributions with a few   distributions with a few   central tendency
             extreme scores             extreme scores