
Understanding Central Tendency Measures

Chapter Three discusses measures of central tendency, focusing on their objectives, desirable properties, and types including mean, median, and mode. It explains the arithmetic mean, weighted mean, geometric mean, and harmonic mean, along with examples and properties of these measures. The chapter also highlights the merits and drawbacks of the arithmetic mean and provides formulas for calculating various means based on different data distributions.

Uploaded by

awel
Copyright
© All Rights Reserved

Chapter Three

MEASURES OF CENTRAL TENDENCY

3.1 Objectives of Measuring Central Tendency

A single value that describes the characteristics of the entire mass of data is called a measure of
central tendency, or an average.
The objectives of measuring central tendency are:

- To get a single value that represents (describes) the characteristics of the entire data set
- To summarize (reduce the volume of) the data
- To facilitate comparison within one group or between groups of data
- To enable further statistical analysis

Desirable properties of a measure of central tendency

We say a measure of central tendency is good if it possesses most of the following properties. It should:
- be simple to understand and easy to calculate and interpret,
- exist and be unique,
- be rigidly defined by a mathematical formula,
- be based on all observations,
- not be seriously affected by extreme observations,
- be capable of further statistical analysis and/or algebraic manipulation.
3.2 The Summation Notation (∑)
Let a data set consist of a number of observations, represented by $x_1, x_2, \ldots, x_n$, where $n$ (the last
subscript) denotes the number of observations in the data and $x_i$ is the $i$th observation. Then the
sum is written as

\[ x_1 + x_2 + \cdots + x_n = \sum_{i=1}^{n} x_i \]

For instance, a data set consisting of the six measurements 21, 13, 54, 46, 32 and 37 is represented by
$x_1, x_2, x_3, x_4, x_5$ and $x_6$, where $x_1 = 21$, $x_2 = 13$, $x_3 = 54$, $x_4 = 46$, $x_5 = 32$ and $x_6 = 37$.
Their sum becomes

\[ \sum_{i=1}^{6} x_i = 21 + 13 + 54 + 46 + 32 + 37 = 203. \]

Similarly,

\[ x_1^2 + x_2^2 + \cdots + x_n^2 = \sum_{i=1}^{n} x_i^2 \]

Some Properties of the Summation Notation


1. $\sum_{i=1}^{n} c = n \cdot c$, where $c$ is a constant number.
2. $\sum_{i=1}^{n} b\,x_i = b \sum_{i=1}^{n} x_i$, where $b$ is a constant number.
3. $\sum_{i=1}^{n} (a + b x_i) = n \cdot a + b \sum_{i=1}^{n} x_i$, where $a$ and $b$ are constant numbers.
4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$
5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$
Example: Let

\[ \sum_{i=1}^{12} x_i = 26, \quad \sum_{i=1}^{12} y_i = 17, \quad \sum_{i=1}^{12} x_i^2 = 484, \quad \sum_{i=1}^{12} y_i^2 = 362. \]

Find

I) $\sum_{i=1}^{12} (4x_i + 3y_i)$,    II) $\sum_{i=1}^{12} 2x_i(x_i - 7)$

Solution:

I) \[ \sum_{i=1}^{12} (4x_i + 3y_i) = 4\sum_{i=1}^{12} x_i + 3\sum_{i=1}^{12} y_i = 4(26) + 3(17) = 155 \]

II) \[ \sum_{i=1}^{12} 2x_i(x_i - 7) = 2\sum_{i=1}^{12} x_i^2 - 14\sum_{i=1}^{12} x_i = 2(484) - 14(26) = 604 \]
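The summation properties can be checked numerically. The following is a minimal Python sketch, not part of the original notes: `x` holds the six measurements used earlier, while `y` and the constants `a`, `b`, `c` are made-up values chosen only for illustration. The last lines redo the worked example using the given totals.

```python
# x: the six measurements from the text; y: a made-up companion series (assumption).
x = [21, 13, 54, 46, 32, 37]
y = [2, 5, 1, 4, 3, 6]
n = len(x)
a, b, c = 3, 4, 7  # arbitrary constants, chosen for illustration

# Properties 1 to 4 hold as equalities; property 5 is an inequality for this data.
prop1 = sum(c for _ in range(n)) == n * c
prop2 = sum(b * xi for xi in x) == b * sum(x)
prop3 = sum(a + b * xi for xi in x) == n * a + b * sum(x)
prop4 = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)
prop5 = sum(xi * yi for xi, yi in zip(x, y)) != sum(x) * sum(y)
print(prop1, prop2, prop3, prop4, prop5)

# Worked example, using only the totals given in the text.
sum_x, sum_y, sum_x2 = 26, 17, 484
part_i = 4 * sum_x + 3 * sum_y       # sum of (4x + 3y)
part_ii = 2 * sum_x2 - 14 * sum_x    # sum of 2x(x - 7) = 2*sum(x^2) - 14*sum(x)
print(part_i, part_ii)  # 155 604
```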

3.3 Types of Measures of Central Tendency

Several types of averages or measures of central tendency can be defined; the most common are

- the mean
- the mode
- the median
3.3.1. The Mean

There are four types of means: Arithmetic mean, weighted arithmetic mean, Harmonic mean and
Geometric mean.

Arithmetic mean is defined as the sum of the measurements of the items divided by the total
number of items.
Arithmetic Mean for Ungrouped Frequency Distribution
When the data are arranged or given in the form of an ungrouped frequency distribution, the
formula for the mean is

\[ \bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} \]

Note that $\sum_{i=1}^{k} f_i = n$, the total number of observations.

Example 1: You measure the body lengths (in inches) of 10 full-term infants at birth and record
the following:

17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75

Compute the mean length of the infants for these data.

Solution:

\[ \bar{x} = \frac{17.5 + 19.5 + 17.5 + 19 + 20 + 21 + 18 + 19.5 + 18 + 10.75}{10} = \frac{180.75}{10} = 18.075 \]

Example 2: Monthly incomes of fourth year regular students are given in the following
frequency distribution.

Monthly income (birr) 54.5 64.5 74.5 84.5 94.5 104.5 114.5

Number of students 6 9 15 25 13 7 5

Compute the mean for these data.

Solution:

\[ \bar{x} = \frac{6(54.5) + 9(64.5) + 15(74.5) + 25(84.5) + 13(94.5) + 7(104.5) + 5(114.5)}{6 + 9 + 15 + 25 + 13 + 7 + 5} = \frac{6670}{80} = 83.375 \]
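The frequency-weighted formula translates directly into code. A minimal Python sketch follows; the function name `frequency_mean` is a hypothetical helper introduced here, not something from the notes.

```python
def frequency_mean(values, freqs):
    """Mean of an ungrouped frequency distribution: sum(f*x) / sum(f)."""
    total = sum(f * x for x, f in zip(values, freqs))
    n = sum(freqs)  # total number of observations
    return total / n

# Monthly income data from Example 2.
incomes = [54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]
students = [6, 9, 15, 25, 13, 7, 5]
mean_income = frequency_mean(incomes, students)
print(mean_income)  # 83.375
```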

Arithmetic Mean for Grouped Frequency Distribution

If the data are given in the form of a continuous frequency distribution, the sample mean can be
computed as

\[ \bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{f_1 + f_2 + \cdots + f_k} \]

where $m_i$ is the class mark of the $i$th class, $i = 1, 2, \ldots, k$; $f_i$ is the frequency of the $i$th class; and $k$ is the number of classes.
Note that $\sum_{i=1}^{k} f_i = n$, the total number of observations.

Example: The following table gives the daily wages of laborers. Calculate the average daily
wages paid to a laborer.

Wages in birr 11-13 13-15 15-17 17-19 19-21 21-23 23-25

Number of laborers 3 4 5 6 6 4 3

Solution:

Wages in birr       11-13  13-15  15-17  17-19  19-21  21-23  23-25  Total
Frequency (f_i)     3      4      5      6      6      4      3      31
Class mark (m_i)    12     14     16     18     20     22     24
f_i m_i             36     56     80     108    120    88     72     560

\[ \bar{x} = \frac{\sum f_i m_i}{\sum f_i} = \frac{560}{31} = 18.06 \]
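The class-mark calculation can be sketched in Python as follows. This is an illustrative sketch only; `grouped_mean` is a hypothetical helper that takes each class as a (lower limit, upper limit) pair and uses the midpoint as the class mark.

```python
def grouped_mean(class_limits, freqs):
    """Mean of a grouped distribution, using class marks (midpoints)."""
    marks = [(lo + hi) / 2 for lo, hi in class_limits]
    return sum(f * m for f, m in zip(freqs, marks)) / sum(freqs)

# Daily wages example from the text.
wage_classes = [(11, 13), (13, 15), (15, 17), (17, 19), (19, 21), (21, 23), (23, 25)]
laborers = [3, 4, 5, 6, 6, 4, 3]
mean_wage = grouped_mean(wage_classes, laborers)
print(round(mean_wage, 2))  # 18.06
```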
Properties of the Arithmetic Mean

- The sum of the deviations of the items from their arithmetic mean is zero. That is, the
algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is zero:

\[ \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \]

- The sum of the squares of the deviations of a set of observations from any number, say $A$, is
minimum when $A = \bar{x}$. That is,

\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \le \sum_{i=1}^{n} (x_i - A)^2 \]

- When a set of observations is divided into $k$ groups and $\bar{x}_1$ is the mean of the $n_1$ observations of
group 1, $\bar{x}_2$ is the mean of the $n_2$ observations of group 2, …, $\bar{x}_k$ is the mean of the $n_k$ observations
of group $k$, then the combined mean of all observations taken together, denoted by $\bar{x}_c$, is
given by

\[ \bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i} \]

- If a wrong figure has been used in calculating the mean, we can correct the mean if we know the
correct figure that should have been used. Let
  - $X_{wr}$ denote the wrong figure used in calculating the mean,
  - $X_c$ be the correct figure that should have been used,
  - $\bar{X}_{wr}$ be the wrong mean calculated using $X_{wr}$.
Then the correct mean, $\bar{X}_{correct}$, is given by

\[ \bar{X}_{correct} = \frac{n \bar{X}_{wr} + X_c - X_{wr}}{n} \]

- If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then
a) the mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$;
b) the mean of $k x_1, k x_2, \ldots, k x_n$ will be $k \bar{x}$.
Example 1: Last year there were three sections taking the Stat 273 course at Alemaya University. At
the end of the semester, the three sections got average marks of 80, 83 and 76. There were 28, 32
and 35 students in the respective sections. Find the mean mark for all the students.

Solution:

\[ \bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \frac{28(80) + 32(83) + 35(76)}{28 + 32 + 35} = \frac{7556}{95} = 79.54 \]

Example 2: The average weight of 10 students was calculated to be 65 kg, but later it was
discovered that one measurement was misread as 40 kg instead of 80 kg. Calculate the corrected
average weight.

Solution:

\[ \bar{X}_{correct} = \frac{n \bar{X}_{wr} + X_c - X_{wr}}{n} = \frac{10(65) + 80 - 40}{10} = 69 \]
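Both the combined-mean and corrected-mean formulas are short enough to express as functions. The sketch below uses hypothetical helper names (`combined_mean`, `corrected_mean`) and reproduces the two examples above.

```python
def combined_mean(sizes, means):
    """Combined mean of k groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Replace one misrecorded value in the total, then re-divide by n."""
    return (n * wrong_mean + correct_value - wrong_value) / n

# Example 1: three sections with 28, 32 and 35 students.
section_mean = combined_mean([28, 32, 35], [80, 83, 76])
# Example 2: 40 kg was recorded where 80 kg should have been.
weight_mean = corrected_mean(10, 65, wrong_value=40, correct_value=80)
print(round(section_mean, 2), weight_mean)  # 79.54 69.0
```

The same `corrected_mean` idea (adjust the total, then divide by the new count) can be adapted to the exercise that follows, where an observation is removed rather than replaced.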

Exercise: The average score on the mid-term examination of 25 students was 75.8 out of 100.
After the mid-term exam, however, a student whose score was 41 out of 100 dropped the course.
What is the average/mean score among the 24 students?

Weighted Arithmetic Mean

In finding the arithmetic mean, all items were assumed to be of equal importance (each value in
the data set has equal weight). When the observations have different weights, we use the weighted
average. Weights are assigned to each item in proportion to its relative importance.

If $x_1, x_2, \ldots, x_k$ represent the values of the items and $w_1, w_2, \ldots, w_k$ are the corresponding weights, then
the weighted mean, $\bar{x}_w$, is given by

\[ \bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_k x_k}{w_1 + w_2 + \cdots + w_k} = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i} \]
Example: A student’s final marks in Mathematics, Physics, Chemistry and Biology are
respectively 82, 80, 90 and 70. If the respective credits received for these courses are 3, 5, 3 and
1, determine the approximate average mark the student has got for one course.

Solution: We use a weighted arithmetic mean, weight associated with each course being taken as
the number of credits received for the corresponding course.

x_i    82   80   90   70
w_i    3    5    3    1

Therefore

\[ \bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(3 \times 82) + (5 \times 80) + (3 \times 90) + (1 \times 70)}{3 + 5 + 3 + 1} = \frac{986}{12} = 82.17 \]

Average mark of the student for one course is approximately 82.
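As a cross-check, the weighted mean can be computed with a few lines of Python. This is a sketch; `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w*x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

marks = [82, 80, 90, 70]   # Mathematics, Physics, Chemistry, Biology
credits = [3, 5, 3, 1]     # credit hours used as weights
avg_mark = weighted_mean(marks, credits)
print(round(avg_mark, 2))  # 82.17
```

With equal weights the function reduces to the ordinary arithmetic mean, which is one way to sanity-check it.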

Exercise: If a student gets A in 4 cr. hrs, B in 3 cr. hrs and D in 2 cr. hrs courses, what is his
GPA in this semester?

Values 4 3 1

Weight 4 3 2

Merits of Arithmetic Mean

- Arithmetic mean has a rigidly defined mathematical formula so that its value is always
definite.
- It is calculated based on all observations.
- Arithmetic mean is simple to calculate and easy to understand.
- It doesn’t need arrangement of data in increasing or decreasing order.
- Arithmetic mean is also capable of further algebraic treatment.
- It affords a good standard of comparison.
Drawbacks of Arithmetic Mean

- It is highly affected by extreme (abnormal) values in the series.
- It can be a number which does not exist in the series.
- It sometimes gives results that appear almost absurd. For example, we
can get an average of ‘3.6 children’ per family.
- It can’t be calculated for open-ended classes.

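Geometric Mean: It is used when the observed values are measured as ratios, percentages,
proportions, indices or growth rates.

\[ GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} \]

If the observed values have frequencies $f_1, f_2, \ldots, f_k$, with $n = \sum_{i=1}^{k} f_i$, then

\[ GM = \sqrt[n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_k^{f_k}} \]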

Example: Compute the geometric mean of the following values: 2, 8, 6, 4, 10, 6, 8, 4

Solution:
Values         2    4    6    8    10   Total
Frequencies    1    2    2    2    1    8

\[ GM = \sqrt[8]{2 \cdot 4^2 \cdot 6^2 \cdot 8^2 \cdot 10} = 5.41 \]
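For larger data sets the product under the root can overflow or lose precision, so a common approach is to average logarithms instead. The sketch below takes that route; it is an implementation choice made here, not the method shown in the notes, and `geometric_mean` is a hypothetical helper. It assumes all values are positive.

```python
import math

def geometric_mean(values, freqs=None):
    """Geometric mean via logs: exp(sum(f * ln x) / n). Values must be positive."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    log_sum = sum(f * math.log(x) for x, f in zip(values, freqs))
    return math.exp(log_sum / n)

# Frequency form of the example: values 2, 4, 6, 8, 10 with frequencies 1, 2, 2, 2, 1.
gm = geometric_mean([2, 4, 6, 8, 10], [1, 2, 2, 2, 1])
print(round(gm, 2))  # 5.41
```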

Harmonic Mean: It is a suitable measure of central tendency when the data pertain to speed, rate and
time.

\[ HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \cdots + \frac{1}{x_n}} \]

If the data are arranged in the form of a frequency distribution,

\[ HM = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{x_i}} = \frac{f_1 + \cdots + f_k}{\frac{f_1}{x_1} + \cdots + \frac{f_k}{x_k}} \]

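Example: A motorist travels 480 km on each of 3 days. She travels for 10 hours at a rate of 48 km/hr on the 1st day,
for 12 hours at a rate of 40 km/hr on the 2nd day and for 15 hours at a rate of 32 km/hr on the 3rd day.
What is her average speed?

Solution: Because the same distance is covered each day, the average speed is the harmonic mean of the three speeds:

\[ HM = \frac{3}{\frac{1}{48} + \frac{1}{40} + \frac{1}{32}} = 38.92 \text{ km/hr} \]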
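The harmonic mean of the three speeds can be cross-checked against total distance divided by total time. Using the hours and speeds given, each day covers 10 × 48 = 12 × 40 = 15 × 32 = 480 km, which is why the harmonic mean applies. A minimal Python sketch (`harmonic_mean` is a hypothetical helper):

```python
def harmonic_mean(values):
    """Harmonic mean: n / sum(1/x). Values must be non-zero."""
    return len(values) / sum(1 / x for x in values)

speeds = [48, 40, 32]   # km/hr on days 1, 2, 3
hours = [10, 12, 15]    # hours driven on each day

hm = harmonic_mean(speeds)

# Cross-check: average speed = total distance / total time.
total_distance = sum(s * h for s, h in zip(speeds, hours))  # 1440 km
direct = total_distance / sum(hours)                        # 1440 / 37
print(round(hm, 2), round(direct, 2))  # 38.92 38.92
```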
Relations among different means

1. $\bar{x} \ge GM \ge HM$

2. For two observations, $\sqrt{\bar{x} \cdot HM} = GM$

3. $\bar{x} = GM = HM$ if all observations have equal magnitude
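The second relation can be verified directly. In the short derivation below, the two positive observations are written as $a$ and $b$ (symbols introduced here only for illustration):

```latex
\bar{x} = \frac{a+b}{2}, \qquad
HM = \frac{2}{\frac{1}{a}+\frac{1}{b}} = \frac{2ab}{a+b}, \qquad
GM = \sqrt{ab}

\bar{x}\cdot HM = \frac{a+b}{2}\cdot\frac{2ab}{a+b} = ab = GM^{2}
\quad\Longrightarrow\quad \sqrt{\bar{x}\cdot HM} = GM
```

For instance, with $a = 4$ and $b = 16$: $\bar{x} = 10$, $GM = 8$ and $HM = 6.4$, so $\bar{x} \cdot HM = 64 = GM^2$ and $10 \ge 8 \ge 6.4$, in line with the first relation.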
3.3.2 The Median

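The median of a set of items (numbers) arranged in order of magnitude (i.e. in an array form) is the
middle value or the arithmetic mean of the two middle values. We shall denote the median of
$x_1, x_2, \ldots, x_n$ by $\tilde{x}$. For ungrouped data arranged in order of magnitude, the median is obtained by

\[ \tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & \text{if the number of items, } n, \text{ is odd} \\[4pt] \dfrac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} & \text{if the number of items, } n, \text{ is even} \end{cases} \]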

For grouped data the median, obtained by the interpolation method, is given by

\[ \tilde{x} = L_{med} + \left( \frac{\frac{n}{2} - F}{f_{med}} \right) W \]

where
$L_{med}$ = lower class boundary of the median class,

$F$ = sum of the frequencies of all classes lower than the median class (in other words, it is the
cumulative frequency immediately preceding the median class),

$f_{med}$ = frequency of the median class, and $W$ = class width.

The median class is the class with the smallest cumulative frequency greater than or equal to $\frac{n}{2}$.
Example 1: The birth weights in pounds of five babies born in a hospital on a certain day are 9.2,
6.4, 10.5, 8.1 and 7.8. Find the median weight of these five babies.

Solution: Arranged in order, the weights are 6.4, 7.8, 8.1, 9.2, 10.5. Since n = 5 is odd, the median is the 3rd item, 8.1.

Example 2: The following table gives the distribution of the weekly wages of employees of a small
firm.

Wages in birr    No. of employees
126 and below    3
127 – 135        5
136 – 144        9
145 – 153        12
154 – 162        5
163 – 171        4
172 and above    2

a) Find the median weekly wage.
b) Why is the median a more suitable measure of central tendency than the mean in
this case?
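One way to approach part a) is to apply the interpolation formula in code. The sketch below is illustrative only: `grouped_median` is a hypothetical helper, and the lower class boundaries (126.5, 135.5, …) and the class width of 9 are assumptions derived from the stated class limits. The open-ended first class has no lower boundary, which is marked with `None`; this is harmless as long as the median does not fall in that class.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median by interpolation: L + ((n/2 - F) / f) * W."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if cum + f >= n / 2:          # first class whose cumulative frequency reaches n/2
            return lower + (n / 2 - cum) / f * width
        cum += f

# Assumed lower class boundaries; None marks the open-ended "126 and below" class.
lowers = [None, 126.5, 135.5, 144.5, 153.5, 162.5, 171.5]
employees = [3, 5, 9, 12, 5, 4, 2]
median_wage = grouped_median(lowers, employees, width=9)
print(median_wage)  # 146.75
```

Note that the calculation never needs the limits of the open-ended classes, which hints at the answer to part b): the mean would require class marks for those classes, while the median does not.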

Merits of median

- It is not influenced by extreme values.
- It is rigidly defined, so its value is always definite.
- It can be calculated even in the case of open-ended intervals.
- It can be computed for ratio, interval, and ordinal levels of data.
Demerits of median
- It is not capable of further algebraic treatment.
- It is not a good representative of the data if the number of items (data) is small.
- Arranging the items in order of magnitude is sometimes a very tedious process if the number
of items is very large.
3.3.3 The Mode

The mode or the modal value is the most frequently occurring score/observation in a series and is
denoted by $\hat{x}$. Note that the mode may not exist in the series or, even if it does exist, it may not be
unique.

For grouped data, the mode is found by the following formula:

\[ \hat{x} = L_{mod} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) W \]

where
$L_{mod}$ = lower class boundary of the modal class,
$\Delta_1$ = the difference between the frequency of the modal class and the frequency of the class
immediately preceding the modal class,
$\Delta_2$ = the difference between the frequency of the modal class and the frequency of the class
immediately following the modal class,
$W$ = the class width.

The modal class is the class with the highest frequency in the distribution.

1. Find the mode of 5, 3, 5, 8, 9.
Mode = 5

Example 1: The marks obtained by ten students in a semester exam in statistics are: 70, 65, 68, 70, 75, 73, 80, 70, 83 and 86. Find the mode of the
students’ marks.

Solution: Mode = 70, since it occurs three times.

Example 2: Find the mode for the frequency distribution of the birth weights (in kilograms) of 30
children given below.

Weight (kg)        1.9-2.3  2.3-2.7  2.7-3.1  3.1-3.5  3.5-3.9  3.9-4.3
No. of children    5        5        9        4        4        3

Solution: 2.7-3.1 is the modal class since it has the highest frequency.

$\Delta_1 = 9 - 5 = 4$, $\Delta_2 = 9 - 4 = 5$, $L_{mod} = 2.7$ and $W = 0.4$

\[ \hat{x} = 2.7 + \left( \frac{4}{4 + 5} \right)(0.4) = 2.878 \]
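The grouped-mode formula can be sketched in Python as follows. `grouped_mode` is a hypothetical helper; treating a missing neighbouring class (when the modal class is first or last) as having frequency 0 is an assumption made here, and if several classes tie for the highest frequency the sketch simply uses the first one.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode by interpolation: L + d1 / (d1 + d2) * W."""
    i = freqs.index(max(freqs))                       # first class with the highest frequency
    prev_f = freqs[i - 1] if i > 0 else 0             # assumption: no neighbour means frequency 0
    next_f = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - prev_f
    d2 = freqs[i] - next_f
    return lower_boundaries[i] + d1 / (d1 + d2) * width

# Birth weight example: lower boundaries of the six classes, width 0.4 kg.
lowers = [1.9, 2.3, 2.7, 3.1, 3.5, 3.9]
children = [5, 5, 9, 4, 4, 3]
mode_weight = grouped_mode(lowers, children, width=0.4)
print(round(mode_weight, 3))  # 2.878
```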

Merits of mode
- Mode is not affected by extreme values.
- Mode can be calculated even in the case of open-ended intervals, and it is not necessary to know all
observations.
- It can be computed for all levels of data, i.e. ratio, interval, ordinal or nominal.
Demerits of mode
- Mode may not exist in the series, and if it exists it may not be a unique value.
- It does not fulfill most of the requirements of a good measure of central tendency.
3.3.4 Quantiles
Quantiles are values which divide a data set arranged in order of magnitude into a certain number of equal
parts. They are averages of position (non-central tendency). Some of these are quartiles, deciles and
percentiles.
I. Quartiles: are values which divide the data set into four equal parts, denoted by $Q_1$, $Q_2$ and $Q_3$. The
first quartile is also called the lower quartile and the third quartile is the upper quartile. The second
quartile is the median.
- For ungrouped data:
Let $Q_j$ be the $j$th quartile value for $j = 1, 2, 3$. Then

\[ Q_j = \left( \frac{j}{4}(n+1) \right)^{th} \text{ item}; \quad j = 1, 2, 3. \]

Example: Find the quartiles $Q_1$, $Q_2$ and $Q_3$ of the following data: 20, 30, 25, 23, 22, 32, 36, 18
Solution:
Arrange the data in ascending order, with n = 8:

18, 20, 22, 23, 25, 30, 32, 36

$Q_1 = \left( \frac{1}{4}(8+1) \right)^{th}$ item = 2.25th item = 2nd item + 0.25(3rd item − 2nd item)
= 20 + 0.25(22 − 20)
= 20 + 0.5 = 20.5

$Q_2 = \left( \frac{2}{4}(8+1) \right)^{th}$ item = 4.5th item = 4th item + 0.5(5th item − 4th item)
= 23 + 0.5(25 − 23) = 24

$Q_3$ = ?
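The position rule, with linear interpolation between neighbouring items, can be sketched as one general function. `quantile_item` is a hypothetical helper: `parts` is 4 for quartiles, 10 for deciles and 100 for percentiles. Falling back to the first or last item when the computed position lies outside the data is an assumption made here.

```python
import math

def quantile_item(data, j, parts):
    """Value at the (j/parts)*(n+1)-th position, interpolating between items."""
    xs = sorted(data)
    n = len(xs)
    pos = j * (n + 1) / parts   # 1-based position in the ordered data
    k = math.floor(pos)         # whole part: the k-th item
    frac = pos - k              # fractional part used for interpolation
    if k < 1:                   # position before the first item (assumption: use first item)
        return xs[0]
    if k >= n:                  # position at or beyond the last item (assumption: use last item)
        return xs[-1]
    return xs[k - 1] + frac * (xs[k] - xs[k - 1])

data = [20, 30, 25, 23, 22, 32, 36, 18]
q1, q2, q3 = (quantile_item(data, j, 4) for j in (1, 2, 3))
print(q1, q2, q3)  # 20.5 24.0 31.5
```

Here $Q_3$ is the 6.75th item, i.e. 30 + 0.75(32 − 30) = 31.5.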

- For grouped data

We can apply the following formula:

\[ Q_j = L_{Q_j} + \left( \frac{\frac{j \cdot n}{4} - F_{Q_j}}{f_{Q_j}} \right) W; \quad j = 1, 2, 3. \]

where
$Q_j$ = the $j$th quartile we are going to calculate,
$L_{Q_j}$ = lower class boundary of the $j$th quartile class,
$F_{Q_j}$ = sum of the frequencies of all classes lower than the $j$th quartile class,
$f_{Q_j}$ = frequency of the $j$th quartile class, and $W$ = class width.

The $j$th quartile class is the class with the smallest cumulative frequency greater than or equal to
$\frac{j \cdot n}{4}$.

Example: Find $Q_1$, $Q_2$ and $Q_3$ of the following data.

Class      Class boundaries   Frequency   Cumulative frequency (less than)
50-69      49.5-69.5          3           3
70-89      69.5-89.5          7           10
90-109     89.5-109.5         4           14
110-129    109.5-129.5        4           18
130-149    129.5-149.5        9           27

Here n = 27, so for $Q_1$ we have $\frac{j \cdot n}{4} = \frac{1(27)}{4} = 6.75$, which falls in the second class.

$L_{Q_1} = 69.5$, $f_{Q_1} = 7$, $F_{Q_1} = 3$, $W = 20$

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{j \cdot n}{4} - F_{Q_1}}{f_{Q_1}} \right) W = 69.5 + \left( \frac{6.75 - 3}{7} \right) 20 = 80.2 \]

Find $Q_2$ and $Q_3$ in the same way.
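The grouped formula has the same shape for quartiles, deciles and percentiles; only the divisor changes. A minimal Python sketch under that observation follows (`grouped_quantile` is a hypothetical helper; `parts` is 4, 10 or 100, and equal class widths are assumed).

```python
def grouped_quantile(lower_boundaries, freqs, width, j, parts):
    """Grouped quantile by interpolation: L + ((j*n/parts - F) / f) * W."""
    n = sum(freqs)
    target = j * n / parts
    cum = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if cum + f >= target:   # first class whose cumulative frequency reaches the target
            return lower + (target - cum) / f * width
        cum += f

# Example table: lower class boundaries and frequencies, class width 20.
lowers = [49.5, 69.5, 89.5, 109.5, 129.5]
freqs = [3, 7, 4, 4, 9]
q1, q2, q3 = (grouped_quantile(lowers, freqs, 20, j, 4) for j in (1, 2, 3))
print(round(q1, 1), q2, q3)  # 80.2 107.0 134.5
```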

II. Deciles: are values dividing the data into ten equal parts, denoted by $D_1, D_2, \ldots, D_9$. The fifth decile
is the median.
- For ungrouped data
Let $D_j$ be the $j$th decile value for $j = 1, 2, \ldots, 9$. Then

\[ D_j = \left( \frac{j}{10}(n+1) \right)^{th} \text{ item}; \quad j = 1, 2, \ldots, 9 \]
Example: Find the deciles $D_2$, $D_6$ and $D_8$ of the following data: 20, 30, 25, 23, 22, 32, 36, 18
Solution:
Arrange the data in ascending order, with n = 8:

18, 20, 22, 23, 25, 30, 32, 36

$D_2 = \left( \frac{2}{10}(8+1) \right)^{th}$ item = 1.8th item = 1st item + 0.8(2nd item − 1st item) = 18 + 0.8(20 − 18) = 19.6

$D_8 = \left( \frac{8}{10}(8+1) \right)^{th}$ item = 7.2th item = 7th item + 0.2(8th item − 7th item) = 32 + 0.2(4) = 32.8

$D_6$ = ?

- For grouped data

We can apply the following formula:

\[ D_j = L_{D_j} + \left( \frac{\frac{j \cdot n}{10} - F_{D_j}}{f_{D_j}} \right) W; \quad j = 1, 2, \ldots, 9 \]

Define the symbols in the same way as in the case of quartiles.

The $j$th decile class is the class with the smallest cumulative frequency greater than or equal to
$\frac{j \cdot n}{10}$.

Example:

Find $D_5$ and $D_7$.

Values     Frequency   Cf
140-150    17          17
150-160    29          46
160-170    42          88
170-180    72          160
180-190    84          244
190-200    107         351
200-210    49          400
210-220    34          434
220-230    31          465
230-240    16          481
240-250    12          493

Solutions:
• First find the less-than cumulative frequency.
• Use the formula to calculate the required deciles.

$D_5$: determine the class containing the 5th decile.

\[ \frac{5 \cdot N}{10} = \frac{5(493)}{10} = 246.5, \quad N = 493 \]

190-200 is the class containing the fifth decile.

$L_{D_5} = 189.5$, $W = 10$, $N = 493$, $F_{D_5} = 244$, $f_{D_5} = 107$

\[ D_5 = 189.5 + \left( \frac{246.5 - 244}{107} \right) 10 = 189.7 \approx 190 \]

$D_7$: determine the class containing the 7th decile.

\[ \frac{7 \cdot N}{10} = \frac{7(493)}{10} = 345.1 \]

190-200 is the class containing the seventh decile.

$L_{D_7} = 189.5$, $W = 10$, $N = 493$, $F_{D_7} = 244$, $f_{D_7} = 107$

\[ D_7 = 189.5 + \left( \frac{345.1 - 244}{107} \right) 10 = 198.95 \approx 199 \]
III. Percentiles: are values which divide the data into one hundred equal parts, denoted by $P_1, P_2, \ldots, P_{99}$.
The fiftieth percentile is the median.

- For ungrouped data

Let $P_j$ be the $j$th percentile value for $j = 1, 2, 3, \ldots, 99$. Then

\[ P_j = \left( \frac{j}{100}(n+1) \right)^{th} \text{ item}; \quad j = 1, 2, 3, \ldots, 99 \]
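Example: Find the percentiles $P_8$, $P_{50}$ and $P_{85}$ of the following data: 20, 30, 25, 23, 22, 32, 36, 18
Solution:
Arrange the data in ascending order, with n = 8:

18, 20, 22, 23, 25, 30, 32, 36

$P_8 = \left( \frac{8}{100}(8+1) \right)^{th}$ item = 0.72th item. Since this position lies before the 1st item, we take the 1st item: $P_8 = 18$.

$P_{50} = 24$

$P_{85} = \left( \frac{85}{100}(8+1) \right)^{th}$ item = 7.65th item = 32 + 0.65(36 − 32) = 34.6 ≈ 35

- For grouped data

We can use the following formula:

\[ P_j = L_{P_j} + \left( \frac{\frac{j \cdot n}{100} - F_{P_j}}{f_{P_j}} \right) W; \quad j = 1, 2, 3, \ldots, 99 \]

Define the symbols in the same way as in the case of quartiles.

The $j$th percentile class is the class with the smallest cumulative frequency greater than or equal to
$\frac{j \cdot n}{100}$.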

Example:

Find $P_{90}$ and $P_{99}$.

Values     Frequency   Cf
140-150    17          17
150-160    29          46
160-170    42          88
170-180    72          160
180-190    84          244
190-200    107         351
200-210    49          400
210-220    34          434
220-230    31          465
230-240    16          481
240-250    12          493

$P_{90}$: determine the class containing the 90th percentile.

\[ \frac{90 \cdot N}{100} = \frac{90(493)}{100} = 443.7 \]

220-230 is the class containing the 90th percentile.

$L_{P_{90}} = 219.5$, $W = 10$, $N = 493$, $F_{P_{90}} = 434$, $f_{P_{90}} = 31$

\[ P_{90} = 219.5 + \left( \frac{443.7 - 434}{31} \right) 10 = 222.629 \approx 223 \]

$P_{99}$ = ?

Interpretations

1. $Q_j$ is the value below which $(j \times 25)$ percent of the observations in the series are found (where
$j = 1, 2, 3$). For instance, $Q_3$ is the value below which 75 percent of the observations in the given
series are found.
2. $D_j$ is the value below which $(j \times 10)$ percent of the observations in the series are found (where
$j = 1, 2, \ldots, 9$). For instance, $D_4$ is the value below which 40 percent of the values in the
series are found.
3. $P_j$ is the value below which $j$ percent of the total observations are found (where $j = 1, 2, 3, \ldots, 99$).
For example, 73 percent of the observations in a given series are below $P_{73}$.

Exercise: The following table presents the male population of a certain region in Ethiopia.
Find a) all quartiles,
b) the 9th and 5th deciles, and
c) the 65th and 75th percentiles.

Age group (in years)   0 – 5   5 – 10   10 – 15   15 – 20   20 – 25   25 – 30   30 – 35   35 – 40
Male population        2580    3737     4620      5200      7250      620       297       355
