
Mode and Quartiles in Statistics

This document defines and provides examples for calculating measures of central tendency, including the mode, median, and mean. It discusses: 1) How to calculate the mode for both ungrouped and grouped data, including using formulas to determine the modal class and value. 2) How to calculate the median using the cumulative frequency approach. 3) How to calculate the mean by determining the class midpoints, deviations, and weighted deviations and summing them.

Uploaded by

Martin Kobimbo
Copyright
© All Rights Reserved

MEASURES OF CENTRAL TENDENCY

MODE

Meaning
The mode is the value in a distribution that occurs most frequently. It is an actual
value, which has the highest concentration of items in and around it.

Computation of the Mode

1. Ungrouped or Raw Data


For ungrouped data or a series of individual observations, mode is often found by mere
inspection.

Example 1:
2, 7, 10, 15, 10, 17, 8, 10, 2

∴ Mode = M0 = 10

In some cases the mode may be absent while in some cases there may be more than one mode.

Example 2:
1) 12, 10, 15, 24, 30 (no mode)

2) 7, 10, 15, 12, 7, 14, 24, 10, 7, 20, 10

∴ The modes are 7 and 10

2. Grouped Data

a) Discrete Distribution
For a discrete distribution, the mode is the value of X that corresponds to the highest frequency.
A discrete variable is one whose outcomes are measured in fixed numbers.

b) Continuous Distribution
Locate the highest frequency; the corresponding class interval is called the
modal class. Then apply the following formula:

$M_0 = l_1 + \dfrac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i$

Where: $l_1$ = the lower limit of the class in which the mode lies
$f_1$ = the frequency of the class in which the mode lies
$f_0$ = the frequency of the class preceding the modal class
$f_2$ = the frequency of the class succeeding the modal class
$i$ = the class interval of the modal class

NOTE: While applying the above formula, we should ensure that the class-intervals are
uniform throughout. If the class-intervals are not uniform, then they should be made uniform
on the assumption that the frequencies are evenly distributed throughout the class.

Example 3:
Let us take the following frequency distribution:

Class Intervals Frequency


30−40 4
40−50 6
50−60 8
60−70 12
70−80 9
80−90 7
90−100 4

Required:
Calculate the mode in respect of this series.

Solution

$M_0 = 60 + \dfrac{12 - 8}{(12 - 8) + (12 - 9)} \times 10 = 60 + \dfrac{4}{4 + 3} \times 10 = 65.7$ approx.
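The modal-class formula can be checked with a few lines of Python. This is only a sketch: the function name `grouped_mode` and its parameter names are illustrative choices rather than anything defined in these notes, and uniform class widths are assumed, as the note above requires.

```python
def grouped_mode(lower, width, f1, f0, f2):
    """Mode of a continuous distribution, given its modal class.

    lower: lower limit of the modal class (l1)
    width: class interval of the modal class (i)
    f1, f0, f2: frequencies of the modal, preceding and succeeding classes
    """
    denominator = (f1 - f0) + (f1 - f2)
    if denominator == 0:
        raise ValueError("formula is undefined when f1 equals both neighbours")
    return lower + (f1 - f0) / denominator * width


# Example 3: modal class 60-70 with f1 = 12, f0 = 8, f2 = 9
print(round(grouped_mode(lower=60, width=10, f1=12, f0=8, f2=9), 1))  # 65.7
```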

3. Determination of Modal Class


For a frequency distribution, the modal class corresponds to the maximum frequency. But it is not
possible to identify the modal class by inspection in any one (or more) of the following cases:
i. If the maximum frequency is repeated.
ii. If the maximum frequency occurs at the beginning or at the end of the distribution.
iii. If there are irregularities in the distribution.

In such cases, the modal class is determined by the method of grouping.

Steps for Calculation


1. Prepare a grouping table with 6 columns.
2. In column I, write down the given frequencies.
3. Column II is obtained by combining the frequencies two by two.
4. Leave the 1st frequency and combine the remaining frequencies two by two and write in
column III.
5. Column IV is obtained by combining the frequencies three by three.
6. Leave the 1st frequency and combine the remaining frequencies three by three and write
in column V.
7. Leave the 1st and 2nd frequencies and combine the remaining frequencies three by three
and write in column VI.
8. Mark the highest frequency in each column.
9. Form an analysis table to find the modal class.
10. After finding the modal class use the formula to calculate the modal value.

Example 4
Calculate the mode for the following frequency distribution.

Class Interval 0−5 5−10 10−15 15−20 20−25 25−30 30−35 35−40
Frequency 9 12 15 16 17 15 10 13

Solution
Grouping Table
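(Each sum is entered on the row of the first class in its group; "-" marks an empty cell.)

Class Interval    I    II   III   IV    V    VI
0−5               9    21    -    36    -    -
5−10             12     -   27     -   43    -
10−15            15    31    -     -    -   48
15−20            16     -   33    48    -    -
20−25            17    32    -     -   42    -
25−30            15     -   25     -    -   38
30−35            10    23    -     -    -    -
35−40            13     -    -     -    -    -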

Analysis Table

Column   0−5   5−10   10−15   15−20   20−25   25−30   30−35   35−40
I         -     -       -       -       1       -       -       -
II        -     -       -       -       1       1       -       -
III       -     -       -       1       1       -       -       -
IV        -     -       -       1       1       1       -       -
V         -     1       1       1       -       -       -       -
VI        -     -       1       1       1       -       -       -
Total     0     1       2       4       5       2       0       0

The maximum total occurs for the class 20−25, and hence it is the modal class.

$M_0 = 20 + \dfrac{17 - 16}{(17 - 16) + (17 - 15)} \times 5 = 20 + \dfrac{1}{1 + 2} \times 5 = 21.67$ approx.
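The grouping and analysis tables can also be built mechanically. The sketch below is one possible reading of the ten steps: the name `grouping_totals` is hypothetical, columns I to VI are encoded as (group size, starting offset) pairs, and when a column's maximum is tied every tied group is marked.

```python
def grouping_totals(freqs):
    """Return, for each class, how many of the six grouping columns mark it."""
    n = len(freqs)
    totals = [0] * n
    # (group size, starting offset) for columns I, II, III, IV, V, VI
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for size, start in columns:
        # Only complete groups are formed; leftover classes are skipped.
        groups = [
            (sum(freqs[i:i + size]), range(i, i + size))
            for i in range(start, n - size + 1, size)
        ]
        if not groups:
            continue
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:  # a tie marks every tied group
                for k in members:
                    totals[k] += 1
    return totals


# Example 4: classes 0-5, 5-10, ..., 35-40
freqs = [9, 12, 15, 16, 17, 15, 10, 13]
print(grouping_totals(freqs))  # [0, 1, 2, 4, 5, 2, 0, 0]
```

The class with the largest total (here the fifth class, 20−25) is taken as the modal class; if two classes share the largest total, the method does not give a unique answer.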

Example 5
The following table gives some frequency data:

Size of Item Frequency Cumulative Frequency


10−20 10 10
20−30 18 28
30−40 25 53
40−50 26 79
50−60 17 96
60−70 4 100
Total 100

Required:
Calculate the mode

Solution
Grouping Table
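(Each sum is entered on the row of the first class in its group; "-" marks an empty cell.)

Class Interval    I    II   III   IV    V    VI
10−20            10    28    -    53    -    -
20−30            18     -   43     -   69    -
30−40            25    51    -     -    -   68
40−50            26     -   43    47    -    -
50−60            17    21    -     -    -    -
60−70             4     -    -     -    -    -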

Analysis Table

Column   10−20   20−30   30−40   40−50   50−60   60−70
I          -       -       -       1       -       -
II         -       -       1       1       -       -
III        -       1       1       1       1       -
IV         1       1       1       -       -       -
V          -       1       1       1       -       -
VI         -       -       1       1       1       -
Total      1       3       5       5       2       0

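The totals for the classes 30−40 and 40−50 are tied at 5, so grouping does not single out one
modal class. The mode is therefore estimated from the empirical relationship:

Mode = 3 median − 2 mean

$\text{Median position} = \dfrac{n + 1}{2} = \dfrac{100 + 1}{2} = 50.5\text{th item}$

This lies in the class 30−40.

$\text{Median} = l_1 + \dfrac{l_2 - l_1}{f}(m - c) = 30 + \dfrac{40 - 30}{25}(50.5 - 28) = 30 + 9 = 39$

Here m is the position of the median item and c is the cumulative frequency of the class
preceding the median class.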

Calculation of Arithmetic Mean

Class- Interval Frequency Mid- Points d d'=d/10 fd’


10−20 10 15 −20 −2 −20
20−30 18 25 −10 −1 −18
30−40 25 35 0 0 0
40−50 26 45 10 1 26
50−60 17 55 20 2 34
60−70 4 65 30 3 12
Total 100 34

Assumed mean A = 35

$\text{Mean} = A + \dfrac{\sum fd'}{n} \times i$

$\text{Mean} = 35 + \dfrac{34}{100} \times 10 = 38.4$

Mode = 3 median − 2 mean = 3(39) − 2(38.4) = 117 − 76.8 = 40.2
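The whole of Example 5 can be reproduced in a short Python sketch. The helper names are hypothetical; the median position follows the (n + 1)/2 convention used in the solution above, and the mean is computed directly from the class mid-points, which gives the same value as the assumed-mean shortcut.

```python
# (lower limit, upper limit, frequency) for each class in Example 5
classes = [(10, 20, 10), (20, 30, 18), (30, 40, 25),
           (40, 50, 26), (50, 60, 17), (60, 70, 4)]


def grouped_median(classes, position):
    """Interpolate the item at `position` inside the class that contains it."""
    cumulative = 0
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            return lower + (upper - lower) / freq * (position - cumulative)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


def grouped_mean(classes):
    """Arithmetic mean using class mid-points weighted by frequency."""
    n = sum(freq for _, _, freq in classes)
    return sum((lower + upper) / 2 * freq for lower, upper, freq in classes) / n


n = sum(freq for _, _, freq in classes)
median = grouped_median(classes, (n + 1) / 2)
mean = grouped_mean(classes)
mode = 3 * median - 2 * mean  # empirical relationship

print(round(median, 2), round(mean, 2), round(mode, 2))  # 39.0 38.4 40.2
```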

Merits of Mode
1. It is easy to calculate and in some cases it can be located by mere inspection.
2. Mode is not at all affected by extreme values.
3. It can be calculated for open-end classes.
4. It is usually an actual value of an important part of the series.
5. In some circumstances it is the best representative of data.

Demerits of Mode
1. It is not based on all observations.
2. It is not capable of further mathematical treatment.
3. Mode is generally ill-defined; in some cases it is not possible to find the mode.
4. As compared with the mean, the mode is affected to a great extent by sampling fluctuations.
5. It is unsuitable in cases where relative importance of items has to be considered.

QUARTILES

Meaning
The quartiles divide the distribution in four parts. There are three quartiles. The second
quartile (Q2) divides the distribution into two halves and therefore is the same as the median.
The first (lower) quartile (Q1) marks off the first one-fourth, the third (upper) quartile (Q3)
marks off the three-fourth. In other words, the three quartiles Q1, Q2 and Q3 are such that 25
percent of the data fall below Q1, 25 percent fall between Q1 and Q2, 25 percent fall between
Q2 and Q3 and 25 percent fall above Q3.

Computation of Quartiles

1. Raw or Ungrouped Data


First arrange the given data in increasing order and then use the formulas for Q1 and Q3.

$Q_1 = \left(\dfrac{n+1}{4}\right)\text{th item}$

$Q_3 = 3\left(\dfrac{n+1}{4}\right)\text{th item}$

Example 1
Compute quartiles for the data given below:
25,18,30, 8, 15, 5, 10, 35, 40, 45

Solution
5, 8, 10, 15, 18, 25, 30, 35, 40, 45

$Q_1 = \left(\dfrac{n+1}{4}\right)\text{th item} = \left(\dfrac{10+1}{4}\right)\text{th item} = 2.75\text{th item}$

$Q_1 = \text{2nd item} + \dfrac{3}{4}(\text{3rd item} - \text{2nd item}) = 8 + \dfrac{3}{4}(10 - 8) = 9.5$

$Q_3 = 3\left(\dfrac{n+1}{4}\right)\text{th item} = 3(2.75)\text{th item} = 8.25\text{th item}$

$Q_3 = \text{8th item} + \dfrac{1}{4}(\text{9th item} - \text{8th item}) = 35 + \dfrac{1}{4}(40 - 35) = 36.25$
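The positional rule used in this example (take the whole-numbered item, then add the fractional part of the gap to the next item) can be sketched in Python as follows. The helper names `value_at_position` and `quartiles` are illustrative, not part of the notes, and positions are 1-based as in the text.

```python
def value_at_position(sorted_values, position):
    """Value of the `position`-th item (1-based), interpolating linearly."""
    whole = int(position)
    fraction = position - whole
    if whole < 1 or whole > len(sorted_values):
        raise ValueError("position falls outside the data")
    lower = sorted_values[whole - 1]
    if fraction == 0 or whole == len(sorted_values):
        return lower
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Q1 and Q3 of raw data using the (n + 1)/4 positions."""
    data = sorted(values)
    q = (len(data) + 1) / 4
    return value_at_position(data, q), value_at_position(data, 3 * q)


print(quartiles([25, 18, 30, 8, 15, 5, 10, 35, 40, 45]))  # (9.5, 36.25)
```

Statistical libraries offer several other interpolation conventions for quantiles, so their results may differ slightly from the (n + 1)/4 rule used here.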

2. Discrete Series
Step 1: Find the cumulative frequencies.
Step 2: Find $\dfrac{n+1}{4}$.
Step 3: In the cumulative frequencies, locate the value just greater than $\dfrac{n+1}{4}$; the
corresponding value of x is Q1.
Step 4: Find $3\left(\dfrac{n+1}{4}\right)$.
Step 5: In the cumulative frequencies, locate the value just greater than $3\left(\dfrac{n+1}{4}\right)$; the
corresponding value of x is Q3.

Example 2
Compute quartiles for the data given below:

X 5 8 12 15 19 24 30
F 4 3 2 4 5 2 4

Solution :

X F CF
5 4 4
8 3 7
12 2 9
15 4 13
19 5 18
24 2 20
30 4 24
Total 24

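$Q_1 = \left(\dfrac{N+1}{4}\right)\text{th item} = \left(\dfrac{24+1}{4}\right)\text{th item} = 6.25\text{th item}$

$Q_3 = 3\left(\dfrac{N+1}{4}\right)\text{th item} = 3\left(\dfrac{24+1}{4}\right)\text{th item} = 18.75\text{th item}$

The cumulative frequency just greater than 6.25 is 7, and the one just greater than 18.75 is 20.

Q1 = 8; Q3 = 24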
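A minimal Python sketch of the five steps for a discrete series is given below. The function name `discrete_quartiles` is hypothetical. It picks the first cumulative frequency that reaches the required position, which coincides with "just greater than" whenever the position is fractional, as it is in Example 2.

```python
from itertools import accumulate


def discrete_quartiles(xs, fs):
    """Q1 and Q3 of a discrete frequency distribution via cumulative frequencies."""
    cumulative = list(accumulate(fs))
    n = cumulative[-1]

    def value_at(position):
        # First value of x whose cumulative frequency reaches the position
        for x, cf in zip(xs, cumulative):
            if cf >= position:
                return x
        return xs[-1]

    return value_at((n + 1) / 4), value_at(3 * (n + 1) / 4)


xs = [5, 8, 12, 15, 19, 24, 30]
fs = [4, 3, 2, 4, 5, 2, 4]
print(discrete_quartiles(xs, fs))  # (8, 24)
```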

3. Continuous Series
Step 1: Find the cumulative frequencies.
Step 2: Find $\dfrac{N}{4}$.
Step 3: In the cumulative frequencies, locate the value just greater than $\dfrac{N}{4}$; the
corresponding class interval is called the first quartile class.
Step 4: Find $3\left(\dfrac{N}{4}\right)$.
Step 5: In the cumulative frequencies, locate the value just greater than $3\left(\dfrac{N}{4}\right)$; the
corresponding class interval is called the third quartile class.
Step 6: Apply the respective formulae.

$Q_1 = l_1 + \left(\dfrac{\frac{N}{4} - m_1}{f_1}\right) \times c_1$

$Q_3 = l_3 + \left(\dfrac{3\left(\frac{N}{4}\right) - m_3}{f_3}\right) \times c_3$

Where: l1 = lower limit of the first quartile class


f1 = frequency of the first quartile class
c1 = width of the first quartile class
m1 = cf preceding the first quartile class
l3 = lower limit of the third quartile class
f3 = frequency of the third quartile class
c3 = width of the third quartile class
m3 = cf preceding the third quartile class

Example 3
The following series relates to the marks secured by students in an examination.

Marks Number of Students


0−10 11
10−20 18
20−30 25
30−40 28
40−50 30
50−60 33
60−70 22
70−80 15
80−90 12
90−100 10

Required:
Find the quartiles.

Solution:

Marks Number of Students Cumulative Frequency


0−10 11 11
10−20 18 29
20−30 25 54
30−40 28 82
40−50 30 112
50−60 33 145
60−70 22 167
70−80 15 182
80−90 12 194
90−100 10 204
Total 204

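$\dfrac{N}{4} = \dfrac{204}{4} = 51; \quad 3\left(\dfrac{N}{4}\right) = 153$

The cumulative frequency just greater than 51 is 54, so the first quartile class is 20−30. The
cumulative frequency just greater than 153 is 167, so the third quartile class is 60−70.

$Q_1 = 20 + \left(\dfrac{51 - 29}{25}\right) \times 10 = 28.8$

$Q_3 = 60 + \left(\dfrac{153 - 145}{22}\right) \times 10 = 63.64$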
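The quartile formulae for a continuous series can be sketched in Python as below. The name `grouped_quantile` and the (lower, upper, frequency) tuple layout are assumptions made for this illustration; the function finds the class containing a given position and interpolates inside it.

```python
def grouped_quantile(classes, position):
    """Value at `position` in a continuous series.

    classes: list of (lower limit, upper limit, frequency) tuples
    position: N/4 for Q1, 3N/4 for Q3
    """
    cumulative = 0  # cumulative frequency before the current class (m)
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


# Example 3: marks of 204 students
marks = [(0, 10, 11), (10, 20, 18), (20, 30, 25), (30, 40, 28), (40, 50, 30),
         (50, 60, 33), (60, 70, 22), (70, 80, 15), (80, 90, 12), (90, 100, 10)]
n = sum(freq for _, _, freq in marks)

print(round(grouped_quantile(marks, n / 4), 2))      # 28.8
print(round(grouped_quantile(marks, 3 * n / 4), 2))  # 63.64
```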

PERCENTILES
The percentile values divide the distribution into 100 parts each containing 1 percent of the
cases. The percentile (Pk) is that value of the variable up to which lie exactly k% of the total
number of observations.

1. Percentile for Raw Data or Ungrouped Data


Relationship :
P25 = Q1 ; P50 = Q2 = Median and P75 = Q3

Example 4
Calculate P15 for the data given below:
5, 24, 36, 12, 20, 8

Solution:
Arranging the given values in increasing order:

5, 8, 12, 20, 24, 36

$P_{15} = \left(\dfrac{15(n + 1)}{100}\right)\text{th item} = \left(\dfrac{15(6 + 1)}{100}\right)\text{th item} = \left(\dfrac{15 \times 7}{100}\right)\text{th item} = 1.05\text{th item}$

P15 = 1st item + 0.05(2nd item − 1st item)
P15 = 5 + 0.05(8 − 5) = 5.15

2. Percentile for Grouped Data

Example 5
Find P53 for the following frequency distribution:

Class Interval 0−5 5−10 10−15 15−20 20−25 25−30 30−35 35−40
Frequency 5 8 12 16 20 10 4 3

Solution :
Class Interval Frequency Cumulative Frequency
0−5 5 5
5−10 8 13
10−15 12 25
15−20 16 41
20−25 20 61
25−30 10 71
30−35 4 75
35−40 3 78
Total 78

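$P_{53} = l + \dfrac{\frac{53N}{100} - m}{f} \times c$

$\dfrac{53N}{100} = \dfrac{53(78)}{100} = 41.34$. The cumulative frequency just greater than 41.34 is 61,
so P53 lies in the class 20−25, for which l = 20, m = 41, f = 20 and c = 5.

$P_{53} = 20 + \dfrac{41.34 - 41}{20} \times 5 = 20.085$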
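The same interpolation works for any percentile of grouped data. The sketch below uses a hypothetical helper, `grouped_percentile`, that locates the class containing the position kN/100 and interpolates inside it, reproducing the P53 of Example 5.

```python
def grouped_percentile(classes, k):
    """k-th percentile of a continuous series of (lower, upper, frequency) classes."""
    n = sum(freq for _, _, freq in classes)
    position = k * n / 100
    cumulative = 0  # cumulative frequency before the current class (m)
    for lower, upper, freq in classes:
        if cumulative + freq >= position:
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("k must not exceed 100")


distribution = [(0, 5, 5), (5, 10, 8), (10, 15, 12), (15, 20, 16),
                (20, 25, 20), (25, 30, 10), (30, 35, 4), (35, 40, 3)]
print(round(grouped_percentile(distribution, 53), 3))  # 20.085
```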

MEASURES OF DISPERSION

MEANING
Dispersion (also known as scatter, spread or variation) measures the extent to which the items
vary from some central value.

SIGNIFICANCE OF MEASURING VARIATION


1. Measures of variation point out as to how far an average is representative of the mass.
2. Measures of dispersion determine nature and cause of variation in order to control the
variation itself.
3. Measures of dispersion enable a comparison to be made of two or more series with regard
to their variability.
4. Measures of dispersion are the basis of many powerful analytical tools in statistics such as
correlation analysis, testing of hypothesis, analysis of variance, the statistical quality
control and regression analysis.

Characteristics/Properties of a Good Measure of Dispersion


1. It should be simple to understand.
2. It should be easy to compute.
3. It should be rigidly defined.
4. It should be based on each and every item of the distribution.
5. It should be amenable to further algebraic treatment.
6. It should have sampling stability.
7. Extreme items should not unduly affect it.

ABSOLUTE AND RELATIVE MEASURES OF DISPERSION


There are two kinds of measures of dispersion, namely:
1. Absolute measure of dispersion.
2. Relative measure of dispersion.

Absolute measure of dispersion indicates the amount of variation in a set of values in terms of
units of observations. For example, when rainfalls on different days are available in mm, any
absolute measure of dispersion gives the variation in rainfall in mm. On the other hand
relative measures of dispersion are free from the units of measurements of the observations.

They are pure numbers. They are used to compare the variation in two or more sets, which are
having different units of measurements of observations.

Absolute measure Relative measure


1. Range 1. Co-efficient of Range
2. Quartile deviation 2. Co-efficient of Quartile deviation
3. Mean deviation 3. Co-efficient of Mean deviation
4. Standard deviation 4. Co-efficient of variation

RANGE AND COEFFICIENT OF RANGE

1. Range
This is the simplest possible measure of dispersion and is defined as the difference between
the largest and smallest values of the variable.

Range = L − S
Where: L = Largest Value
S = Smallest Value

In individual observations and discrete series, L and S are easily identified. In continuous
series, the following two methods are followed.

Method 1:
L = Upper boundary of the highest class
S = Lower boundary of the lowest class

Method 2:
L = Mid value of the highest class
S = Mid value of the lowest class

2. Co-efficient of Range

$\text{Coefficient of Range} = \dfrac{L - S}{L + S}$

Example 1
Find the value of range and its co-efficient for the following data.
7, 9, 6, 8, 11, 10, 4

Solution:
Range = L − S = 11 − 4 = 7

$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{11 - 4}{11 + 4} = 0.4667$

Example 2:
Calculate range and its co efficient from the following distribution.
Size : 60−63 63−66 66−69 69−72 72−75
Number : 5 18 42 27 8

Solution:
Range = L − S = 75 − 60 = 15

$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{75 - 60}{75 + 60} = 0.1111$

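Both quantities follow directly from the largest and smallest values, as this small Python sketch shows. The function name `range_and_coefficient` is an illustrative choice; for a continuous series the class boundaries of Method 1 are passed in as L and S.

```python
def range_and_coefficient(largest, smallest):
    """Return (range, coefficient of range) for given extreme values L and S."""
    spread = largest - smallest
    return spread, spread / (largest + smallest)


# Example 2: upper boundary of the highest class and lower boundary of the lowest
spread, coefficient = range_and_coefficient(largest=75, smallest=60)
print(spread, round(coefficient, 4))  # 15 0.1111
```
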
Merits
1. It is simple to understand.
2. It is easy to calculate.
3. In certain types of problems like quality control, weather forecasts, share price analysis, etc.,
range is most widely used.

Demerits:
1. It is very much affected by the extreme items.
2. It is based on only two extreme observations.
3. It cannot be calculated from open-end class intervals.
4. It is not suitable for mathematical treatment.
5. It is a very rarely used measure.

QUARTILE DEVIATION AND CO-EFFICIENT OF QUARTILE DEVIATION

1. Quartile Deviation (Q.D)


Definition: Quartile Deviation is half of the difference between the first and third quartiles.
Hence, it is called the Semi-Interquartile Range.

$Q.D. = \dfrac{Q_3 - Q_1}{2}$

Among the quartiles Q1, Q2 and Q3, the range Q3 − Q1 is called the interquartile range and
$\dfrac{Q_3 - Q_1}{2}$ the semi-interquartile range.

2. Co-efficient of Quartile Deviation

$\text{Co-efficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Example 3
Find the Quartile Deviation for the following data:
391, 384, 591, 407, 672, 522, 777, 733, 1490, 2488

Solution:
Arrange the given values in ascending order.
384, 391, 407, 522, 591, 672, 733, 777, 1490, 2488.

$\text{Position of } Q_1 = \dfrac{N + 1}{4} = \dfrac{10 + 1}{4} = 2.75\text{th item}$

Q1 = 2nd item + 0.75(3rd item − 2nd item)

Q1 = 391 + 0.75(407 − 391) = 403

$\text{Position of } Q_3 = 3\left(\dfrac{N + 1}{4}\right) = 3(2.75) = 8.25\text{th item}$

Q3 = 8th item + 0.25(9th item − 8th item)

Q3 = 777 + 0.25(1490 − 777) = 955.25

$Q.D. = \dfrac{955.25 - 403}{2} = 276.125$

Example 4
Weekly wages of labourers are given below. Calculate the Q.D. and the coefficient of Q.D.

Weekly Wage (Kshs.) 100 200 400 500 600


No. of Weeks 5 8 21 12 6

Solution :

Weekly Wage (Kshs.) No. of Weeks Cum. No. of Weeks


100 5 5
200 8 13
400 21 34
500 12 46
600 6 52
Total 52

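$\text{Position of } Q_1 = \dfrac{N + 1}{4} = \dfrac{52 + 1}{4} = 13.25\text{th item}$

Q1 = 13th item + 0.25(14th item − 13th item)

Q1 = 200 + 0.25(400 − 200) = 250

$\text{Position of } Q_3 = 3\left(\dfrac{N + 1}{4}\right) = 3(13.25) = 39.75\text{th item}$

Q3 = 39th item + 0.75(40th item − 39th item)

The cumulative counts show that the 35th to 46th items all equal 500, so both the 39th and the
40th items are 500.

Q3 = 500 + 0.75(500 − 500) = 500

$Q.D. = \dfrac{500 - 250}{2} = 125$

$\text{Co-efficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{500 - 250}{500 + 250} = \dfrac{250}{750} = 0.333$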

Example 5
For the data given below, give the quartile deviation and coefficient of quartile deviation.
X 351−500 501−650 651−800 801−950 951−1100
F 48 189 88 47 28

Solution:
X True Class Intervals F Cumulative Frequency
351−500 350.5−500.5 48 48
501−650 500.5−650.5 189 237
651−800 650.5−800.5 88 325
801−950 800.5−950.5 47 372
951−1100 950.5−1100.5 28 400
Total 400

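$\dfrac{N}{4} = \dfrac{400}{4} = 100; \quad 3\left(\dfrac{N}{4}\right) = 3(100) = 300$

The first quartile class is 500.5−650.5 and the third quartile class is 650.5−800.5.

$Q_1 = l_1 + \left(\dfrac{\frac{N}{4} - m_1}{f_1}\right) \times c_1 = 500.5 + \left(\dfrac{100 - 48}{189}\right) \times 150 = 541.77$

$Q_3 = l_3 + \left(\dfrac{3\left(\frac{N}{4}\right) - m_3}{f_3}\right) \times c_3 = 650.5 + \left(\dfrac{300 - 237}{88}\right) \times 150 = 757.89$

$Q.D. = \dfrac{Q_3 - Q_1}{2} = \dfrac{757.89 - 541.77}{2} = 108.06$

$\text{Co-efficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{757.89 - 541.77}{757.89 + 541.77} = 0.1663$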
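Once Q1 and Q3 are known, the quartile deviation and its coefficient are one-line calculations. The Python sketch below uses the hypothetical name `quartile_deviation` and takes the already-computed quartiles of Example 5 as inputs.

```python
def quartile_deviation(q1, q3):
    """Return (quartile deviation, coefficient of quartile deviation)."""
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


qd, coefficient = quartile_deviation(q1=541.77, q3=757.89)
print(round(qd, 2), round(coefficient, 4))  # 108.06 0.1663
```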

Merits of Quartile Deviation


1. It is simple to understand and easy to calculate.
2. It is not affected by extreme values.
3. It can be calculated for data with open end classes also.

Demerits of Quartile Deviation


1. It is not based on all the items. It is based on two positional values, Q1 and Q3, and ignores
the extreme 50% of the items.
2. It is not amenable to further mathematical treatment.

3. It is affected by sampling fluctuations.

MEAN DEVIATION AND COEFFICIENT OF MEAN DEVIATION

1. Mean Deviation
The mean deviation is a measure of dispersion based on all items in a distribution. It is the
arithmetic mean of the deviations of a series computed from any measure of central tendency
(the mean, median or mode), with all deviations taken as positive, i.e., signs are ignored.
In practice, because the mean is so widely applied, the mean deviation is generally computed
from the mean. M.D. is used to denote mean deviation.

2. Coefficient of mean deviation:


Mean deviation calculated by any measure of central tendency is an absolute measure. For the
purpose of comparing variation among different series, a relative mean deviation is required.
The relative mean deviation is obtained by dividing the mean deviation by the average used
for calculating mean deviation.

$\text{Co-efficient of Mean Deviation} = \dfrac{\text{Mean Deviation}}{\text{Mean or Median or Mode}}$

If the result is desired as a percentage, the coefficient of mean deviation is multiplied by 100:

$\text{Co-efficient of Mean Deviation} = \dfrac{\text{Mean Deviation}}{\text{Mean or Median or Mode}} \times 100$

COMPUTATION OF MEAN DEVIATION

1. Individual Series
a. Calculate the average (mean, median or mode) of the series.
b. Take the deviations of items from average ignoring signs and denote these deviations
by |D|.
c. Compute the total of these deviations, i.e., Σ |D|
d. Divide this total by the number of items.

$M.D. = \dfrac{\sum |D|}{n}$

Example 6
Calculate mean deviation from mean and median for the following data: 100, 150, 200, 250,
360, 490, 500, 600, 671 also calculate coefficients of M.D.

Solution:

$\text{Mean} = \dfrac{\sum X}{N} = \dfrac{3321}{9} = 369$

Now arrange the data in ascending order:

100, 150, 200, 250, 360, 490, 500, 600, 671

$\text{Median} = \text{value of } \left(\dfrac{n+1}{2}\right)\text{th item} = \text{value of } \left(\dfrac{9+1}{2}\right)\text{th item} = \text{value of 5th item} = 360$

X |D| = |X − Mean| |D| = |X − Median|
100 269 260
150 219 210
200 169 160
250 119 110
360 9 0
490 121 130
500 131 140
600 231 240
671 302 311
Total 3321 1570 1561

$\text{M.D. from mean} = \dfrac{\sum |D|}{n} = \dfrac{1570}{9} = 174.44$

$\text{Co-efficient of M.D.} = \dfrac{\text{M.D.}}{\text{Mean}} = \dfrac{174.44}{369} = 0.47$

$\text{M.D. from median} = \dfrac{\sum |D|}{n} = \dfrac{1561}{9} = 173.44$

$\text{Co-efficient of M.D.} = \dfrac{\text{M.D.}}{\text{Median}} = \dfrac{173.44}{360} = 0.48$
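The calculation in Example 6 can be verified with Python's standard `statistics` module. The helper `mean_deviation` is a hypothetical name introduced for this sketch; it averages the absolute deviations from whichever central value is supplied.

```python
from statistics import mean, median


def mean_deviation(values, centre):
    """Average of the absolute deviations of `values` from `centre`."""
    return sum(abs(x - centre) for x in values) / len(values)


data = [100, 150, 200, 250, 360, 490, 500, 600, 671]

md_mean = mean_deviation(data, mean(data))
md_median = mean_deviation(data, median(data))

print(round(md_mean, 2), round(md_mean / mean(data), 2))        # 174.44 0.47
print(round(md_median, 2), round(md_median / median(data), 2))  # 173.44 0.48
```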

2. Mean Deviation −Discrete Series
Step 1: Find out an average (mean, median or mode).
Step 2: Find out the deviation of the variable values from the average, ignoring signs and
denote them by |D|
Step 3: Multiply the deviation of each value by its respective frequency and find out the total
Σf | D|
Step 4: Divide Σf | D| by the total frequencies N

Example 7
Compute Mean deviation from mean and median from the following data:

Height in cms 158 159 160 161 162 163 164 165 166
No. of persons 15 20 32 35 33 22 20 10 8

Also compute coefficient of mean deviation.

Solution:

Height (X) No. of persons (f) d = X − A (A = 162) fd |D| = |X − mean| f|D|
158 15 −4 −60 3.51 52.65
159 20 −3 −60 2.51 50.20
160 32 −2 −64 1.51 48.32
161 35 −1 −35 0.51 17.85
162 33 0 0 0.49 16.17
163 22 1 22 1.49 32.78
164 20 2 40 2.49 49.80
165 10 3 30 3.49 34.90
166 8 4 32 4.49 35.92
Total 195 −95 338.59

$\text{Mean} = A + \dfrac{\sum fd}{N} = 162 + \dfrac{-95}{195} = 161.51$

$M.D. = \dfrac{\sum f|D|}{N} = \dfrac{338.59}{195} = 1.74$

$\text{Co-efficient of M.D.} = \dfrac{\text{M.D.}}{\text{Mean}} = \dfrac{1.74}{161.51} = 0.0108$

Height (X) No. of persons (f) c.f. |D| = |X − median| f|D|


158 15 15 3 45
159 20 35 2 40
160 32 67 1 32
161 35 102 0 0
162 33 135 1 33
163 22 157 2 44
164 20 177 3 60
165 10 187 4 40
166 8 195 5 40
195 334

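$\text{Median} = \text{size of } \left(\dfrac{N+1}{2}\right)\text{th item} = \text{size of } \left(\dfrac{195+1}{2}\right)\text{th item} = \text{size of 98th item} = 161$

$M.D. = \dfrac{\sum f|D|}{N} = \dfrac{334}{195} = 1.71$

$\text{Co-efficient of M.D.} = \dfrac{\text{M.D.}}{\text{Median}} = \dfrac{1.71}{161} = 0.0106$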
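For a frequency distribution, each absolute deviation is weighted by its frequency. The sketch below (function name `weighted_mean_deviation` is an assumption for illustration) reproduces Example 7. It uses the unrounded mean, so intermediate figures can differ in the last decimal from a hand calculation that rounds the mean to 161.51.

```python
def weighted_mean_deviation(points, freqs, centre):
    """Frequency-weighted average of absolute deviations from `centre`."""
    n = sum(freqs)
    return sum(f * abs(x - centre) for x, f in zip(points, freqs)) / n


heights = [158, 159, 160, 161, 162, 163, 164, 165, 166]
persons = [15, 20, 32, 35, 33, 22, 20, 10, 8]

mean_height = sum(x * f for x, f in zip(heights, persons)) / sum(persons)
md_from_mean = weighted_mean_deviation(heights, persons, mean_height)
md_from_median = weighted_mean_deviation(heights, persons, 161)  # median found above

print(round(mean_height, 2), round(md_from_mean, 2), round(md_from_median, 2))
# 161.51 1.74 1.71
```

The same function applies to a continuous series if the class mid-points are passed as `points`.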

3. Mean Deviation-Continuous Series


The method of calculating mean deviation in a continuous series is the same as for a discrete series.
In a continuous series we find the mid-points of the various classes and take the
deviations of these points from the average selected. Thus

$M.D. = \dfrac{\sum f|D|}{N}$

Where: |D| = |m − Average|; m = mid-point

Example 8:
Find out the mean deviation from mean and median from the following series.

Age in years No. of persons
0−10 20
10−20 25
20−30 32
30−40 40
40−50 42
50−60 35
60−70 10
70−80 8

Also compute co-efficient of mean deviation.

Solution:
x m f d = (m − A)/c (A = 35; c = 10) fd |D| = |m − mean| f|D|
0−10 5 20 −3 −60 31.5 630.0
10−20 15 25 −2 −50 21.5 537.5
20−30 25 32 −1 −32 11.5 368.0
30−40 35 40 0 0 1.5 60.0
40−50 45 42 1 42 8.5 357.0
50−60 55 35 2 70 18.5 647.5
60−70 65 10 3 30 28.5 285.0
70−80 75 8 4 32 38.5 308.0
Total 212 32 3193.0

$\text{Mean} = A + \dfrac{\sum fd}{N} \times c = 35 + \dfrac{32}{212} \times 10 = 36.5$

$M.D. = \dfrac{\sum f|D|}{N} = \dfrac{3193}{212} = 15.06$

Calculation of Median and M.D. from Median
x m f c.f. |D| = |m − Md| f|D|
0−10 5 20 20 32.25 645.00
10−20 15 25 45 22.25 556.25
20−30 25 32 77 12.25 392.00
30−40 35 40 117 2.25 90.00
40−50 45 42 159 7.75 325.50
50−60 55 35 194 17.75 621.25
60−70 65 10 204 27.75 277.50
70−80 75 8 212 37.75 302.00
Total 212 3209.50

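$\dfrac{N}{2} = \dfrac{212}{2} = 106$, so the median is the 106th item, which lies in the class 30−40.

$\text{Median} = l + \dfrac{\frac{N}{2} - m}{f} \times c = 30 + \dfrac{106 - 77}{40} \times 10 = 37.25$

$M.D. = \dfrac{\sum f|D|}{N} = \dfrac{3209.5}{212} = 15.14$

$\text{Co-efficient of M.D.} = \dfrac{\text{M.D.}}{\text{Median}} = \dfrac{15.14}{37.25} = 0.41$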

Merits of M.D.
1. It is simple to understand and easy to compute.
2. It is rigidly defined.
3. It is based on all items of the series.
4. It is not much affected by the fluctuations of sampling.
5. It is less affected by the extreme items.
6. It is flexible, because it can be calculated from any average.
7. It is a better measure of comparison.

Demerits of M.D.
1. It is not a very accurate measure of dispersion.
2. It is not suitable for further mathematical calculation.
3. It is rarely used. It is not as popular as standard deviation.

Page 23 of 34
4. Algebraic positive and negative signs are ignored. It is mathematically unsound and
illogical.

STANDARD DEVIATION AND COEFFICIENT OF VARIATION

1. Definition
Standard deviation is defined as the positive square root of the arithmetic mean of the squares of
the deviations of the given observations from their arithmetic mean. The square of the standard
deviation is called the variance.

2. Calculation of Standard Deviation-Individual Series


There are two methods of calculating Standard deviation in an individual series.
a) Deviations taken from Actual mean
b) Deviation taken from Assumed mean

(a) Deviation taken from Actual mean


This method is adopted when the mean is a whole number.

Steps:
1. Find the actual mean of the series ($\bar{X}$).
2. Find the deviation of each value from the mean ($x = X - \bar{X}$).
3. Square the deviations and take the total of the squared deviations, $\sum x^2$.
4. Divide the total $\sum x^2$ by the number of observations, giving $\dfrac{\sum x^2}{n}$.

Formulae:

$\text{Standard Deviation } (\sigma) = \sqrt{\dfrac{\sum x^2}{n}} \text{ or } \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$

(b) Deviations Taken from Assumed Mean


This method is adopted when the arithmetic mean is fractional value. Taking deviations from
fractional value would be a very difficult and tedious task. To save time and labour, the short–
cut method is applied. In this method, the deviations are taken from an assumed mean.

The formula is:

$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$

Where: d stands for the deviations from the assumed mean = (X − A)

Steps:
1. Assume any one of the items in the series as an average (A).
2. Find the deviations from the assumed mean, i.e., X − A, denoted by d, and also the total
of the deviations, Σd.
3. Square the deviations, i.e., d², and add up the squares of the deviations, i.e., Σd².
4. Then substitute the values in the following formula:

$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$

Note: We can also use the simplified formula for standard deviation.

$\sigma = \dfrac{1}{N}\sqrt{N \sum d^2 - \left(\sum d\right)^2}$

For a frequency distribution with class interval c:

$\sigma = \dfrac{c}{N}\sqrt{N \sum fd^2 - \left(\sum fd\right)^2}$

Example 9
Calculate the standard deviation from the following data.
14, 22, 9, 15, 20, 17, 12, 11

Solution:
Deviations from actual mean.

Values (X) X − X̄ (X − X̄)²
14 −1 1
22 7 49
9 −6 36
15 0 0
20 5 25
17 2 4
12 −3 9
11 −4 16
Total 120 140

$\bar{X} = \dfrac{120}{8} = 15$

$\sigma = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}} = \sqrt{\dfrac{140}{8}} = 4.18$

Example 10
The table below gives the marks obtained by 10 students in statistics. Calculate standard
deviation.

Student Nos : 1 2 3 4 5 6 7 8 9 10
Marks 43 48 65 57 31 60 37 48 78 59

Solution
Deviations from assumed mean

Student Nos : Marks (X) d = X − A (A = 57) d²
1 43 −14 196
2 48 −9 81
3 65 8 64
4 57 0 0
5 31 −26 676
6 60 3 9
7 37 −20 400
8 48 −9 81
9 78 21 441
10 59 2 4
N = 10 Σd = −44 Σd² = 1952

$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$

$\sigma = \sqrt{\dfrac{1952}{10} - \left(\dfrac{-44}{10}\right)^2} = 13.26$
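The assumed-mean shortcut gives the same answer whichever value is chosen for A, which the following Python sketch demonstrates for Example 10. The function name `sd_assumed_mean` is hypothetical; `statistics.pstdev` from the standard library computes the population standard deviation (divisor N), which is the definition used in these notes.

```python
from math import sqrt
from statistics import pstdev


def sd_assumed_mean(values, assumed):
    """Standard deviation via deviations d = X - A from an assumed mean A."""
    n = len(values)
    deviations = [x - assumed for x in values]
    sum_d = sum(deviations)
    sum_d2 = sum(d * d for d in deviations)
    return sqrt(sum_d2 / n - (sum_d / n) ** 2)


marks = [43, 48, 65, 57, 31, 60, 37, 48, 78, 59]

print(round(sd_assumed_mean(marks, assumed=57), 2))  # 13.26
print(round(sd_assumed_mean(marks, assumed=50), 2))  # 13.26, A does not matter
print(round(pstdev(marks), 2))                       # 13.26
```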

3. Calculation of Standard Deviation for Discrete Series


There are three methods for calculating standard deviation in discrete series:
(a) Actual mean method
(b) Assumed mean method
(c) Step-deviation method.

(a) Actual mean method

Steps:
1. Calculate the mean of the series.
2. Find the deviations of the various items from the mean, i.e., $d = X - \bar{X}$.
3. Square the deviations (d²) and multiply by the respective frequencies (f) to get fd².
4. Total the products (Σfd²). Then apply the formula:

$\sigma = \sqrt{\dfrac{\sum fd^2}{\sum f}}$

If the actual mean is a fraction, the calculation takes a lot of time and labour; as such, this
method is rarely used in practice.

(b) Assumed Mean Method
Here deviations are taken not from the actual mean but from an assumed mean. This
method is also used if the given variable values are not at equal intervals.

Steps:
1. Assume any one of the items in the series as an assumed mean and denoted by A.
2. Find out the deviations from assumed mean, i.e, X-A and denote it by d.
3. Multiply these deviations by the respective frequencies and get the Σfd.
4. Square the deviations (d2).
5. Multiply the squared deviations (d2) by the respective frequencies (f) and get Σfd2.
6. Substitute the values in the following formula:

$\sigma = \sqrt{\dfrac{\sum fd^2}{\sum f} - \left(\dfrac{\sum fd}{\sum f}\right)^2}$

Where: d = X − A, N = Σf

Example 11:
Calculate Standard deviation from the following data.

X 20 22 25 31 35 40 42 45
f 5 12 15 20 25 14 10 6

Solution :
Deviations from assumed mean

X f d = X − A (A = 31) d2 fd fd2
20 5 −11 121 −55 605
22 12 −9 81 −108 972
25 15 −6 36 −90 540
31 20 0 0 0 0
35 25 4 16 100 400
40 14 9 81 126 1134
42 10 11 121 110 1210
45 6 14 196 84 1176
Total N = 107 Σfd = 167 Σfd² = 6037

$\sigma = \sqrt{\dfrac{\sum fd^2}{\sum f} - \left(\dfrac{\sum fd}{\sum f}\right)^2}$

$\sigma = \sqrt{\dfrac{6037}{107} - \left(\dfrac{167}{107}\right)^2} = 7.35$

(c) Step-deviation method:


If the variable values are in equal intervals, then we adopt this method.

Steps:
1. Assume the centre value of the series as the assumed mean A.
2. Find $d' = \dfrac{X - A}{C}$, where C is the interval between each value.
3. Multiply these deviations d′ by the respective frequencies and get Σfd′.
4. Square the deviations and get d′².
5. Multiply the squared deviations (d′²) by the respective frequencies (f) and obtain the total
Σfd′².
6. Substitute the values in the following formula to get the standard deviation.

$\sigma = \sqrt{\dfrac{\sum fd'^2}{\sum f} - \left(\dfrac{\sum fd'}{\sum f}\right)^2} \times C$

Example 12
Compute Standard deviation from the following data.

Marks 10 20 30 40 50 60
No. of students 8 12 20 10 7 3

Solution:

Marks (X) No. of students (f) d′ = (X − 30)/10 d′² fd′ fd′²
10 8 −2 4 −16 32
20 12 −1 1 −12 12
30 20 0 0 0 0
40 10 1 1 10 10
50 7 2 4 14 28
60 3 3 9 9 27
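Total N = 60 Σfd′ = 5 Σfd′² = 109

$\sigma = \sqrt{\dfrac{\sum fd'^2}{\sum f} - \left(\dfrac{\sum fd'}{\sum f}\right)^2} \times C$

$\sigma = \sqrt{\dfrac{109}{60} - \left(\dfrac{5}{60}\right)^2} \times 10 = 13.45$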

4. Calculation of Standard Deviation for Continuous series


In the continuous series the method of calculating standard deviation is almost the same as in
a discrete series. But in a continuous series, mid-values of the class intervals are to be found
out. The step- deviation method is widely used.

The formula is:

$\sigma = \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2} \times C$

Where $d' = \dfrac{m - A}{C}$; C = class interval

Steps:
1. Find the mid-value of each class.
2. Assume the centre value as the assumed mean and denote it by A.
3. Find $d' = \dfrac{m - A}{C}$.
4. Multiply the deviations d′ by the respective frequencies and get Σfd′.
5. Square the deviations and get d′².
6. Multiply the squared deviations d′² by the respective frequencies and get Σfd′².
7. Substitute the values in the formula above to get the standard deviation.

Example 13:
The daily temperature recorded in a city in Russia in a year is given below.

Temperature (°C) No. of days

−40 to −30 10
−30 to −20 18
−20 to −10 30
−10 to 0 42
0 to 10 65
10 to 20 180
20 to 30 20

Required:
Calculate Standard Deviation.

Solution :

Temperature (X) Mid-Point (m) No. of days (f) d′ = (m − (−5))/10 d′² fd′ fd′²
−40 to −30 −35 10 −3 9 −30 90
−30 to −20 −25 18 −2 4 −36 72
−20 to −10 −15 30 −1 1 −30 30
−10 to 0 −5 42 0 0 0 0
0 to 10 5 65 1 1 65 65
10 to 20 15 180 2 4 360 720
20 to 30 25 20 3 9 60 180
Total N = 365 Σfd′ = 389 Σfd′² = 1157

$\sigma = \sqrt{\dfrac{\sum fd'^2}{N} - \left(\dfrac{\sum fd'}{N}\right)^2} \times C$

$\sigma = \sqrt{\dfrac{1157}{365} - \left(\dfrac{389}{365}\right)^2} \times 10 = 14.26\ ^{\circ}\text{C}$

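The step-deviation method can be sketched in Python as follows. The function name `sd_step_deviation` and its parameters are assumptions made for this illustration; `points` holds the variable values of a discrete series or the class mid-points of a continuous one.

```python
from math import sqrt


def sd_step_deviation(points, freqs, assumed, width):
    """Standard deviation using step deviations d' = (point - A) / C."""
    n = sum(freqs)
    steps = [(p - assumed) / width for p in points]
    sum_fd = sum(f * d for f, d in zip(freqs, steps))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width


# Example 13: class mid-points of the temperature distribution, A = -5, C = 10
midpoints = [-35, -25, -15, -5, 5, 15, 25]
days = [10, 18, 30, 42, 65, 180, 20]
print(round(sd_step_deviation(midpoints, days, assumed=-5, width=10), 2))  # 14.26

# Example 12: marks as a discrete series, A = 30, C = 10
marks = [10, 20, 30, 40, 50, 60]
students = [8, 12, 20, 10, 7, 3]
print(round(sd_step_deviation(marks, students, assumed=30, width=10), 2))  # 13.45
```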
Merits of Standard Deviation


1. It is rigidly defined and its value is always definite and based on all the observations and
the actual signs of deviations are used.
2. As it is based on arithmetic mean, it has all the merits of arithmetic mean.
3. It is the most important and widely used measure of dispersion.
4. It is amenable to further algebraic treatment.
5. It is less affected by the fluctuations of sampling and hence stable.
6. It is the basis for measuring the coefficient of correlation and sampling.

Demerits of Standard Deviation


1. It is not easy to understand and it is difficult to calculate.
2. It gives more weight to extreme values because the values are squared up.
3. As it is an absolute measure of variability, it cannot be used for the purpose of
comparison.

Coefficient of Variation
The standard deviation is an absolute measure of dispersion. It is expressed in terms of units
in which the original figures are collected and stated. The standard deviation of heights of
students cannot be compared with the standard deviation of weights of students, as both are
expressed in different units, i.e heights in centimeter and weights in kilograms. Therefore the
standard deviation must be converted into a relative measure of dispersion for the purpose of
comparison. The relative measure is known as the coefficient of variation.

The coefficient of variation is obtained by dividing the standard deviation by the mean and
multiplying the result by 100. Symbolically,

$\text{Coefficient of Variation (C.V.)} = \dfrac{\sigma}{\bar{X}} \times 100$

If we want to compare the variability of two or more series, we can use C.V. The series or
groups of data for which the C.V. is greater indicate that the group is more variable, less
stable, less uniform, less consistent or less homogeneous. If the C.V. is less, it indicates that
the group is less variable, more stable, more uniform, more consistent or more homogeneous.

Example 15
In two factories A and B located in the same industrial area, the average weekly wages (in
Kshs.) and the standard deviations are as follows:

Factory Average Standard Deviation No. of workers


A 34.5 5 476
B 28.5 4.5 524

Required:
(a) Which factory A or B pays out a larger amount as weekly wages?
(b) Which factory A or B has greater variability in individual wages?

Solution:
(a) Total wages paid by factory A = 34.5 × 476 = Kshs. 16,422

Total wages paid by factory B = 28.5 × 524 = Kshs. 14,934

Therefore factory A pays out the larger amount as weekly wages.

(b) The C.V. of the distribution of weekly wages of factories A and B are:

$CV(A) = \dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{5}{34.5} \times 100 = 14.49\%$

$CV(B) = \dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{4.5}{28.5} \times 100 = 15.79\%$

Factory B has greater variability in individual wages, since C.V. of factory B is greater than
C.V of factory A.

Example 16
Prices of a particular commodity in five years in two cities are given below:

Price in City A Price in City B


20 10
22 20
19 18
23 12
16 15

Which city has more stable prices?

Solution:
Actual mean method
City A: Prices (X), dx = X − 20, dx² | City B: Prices (Y), dy = Y − 15, dy²

Prices (X) dx dx² Prices (Y) dy dy²
20 0 0 10 −5 25
22 2 4 20 5 25
19 −1 1 18 3 9
23 3 9 12 −3 9
16 −4 16 15 0 0
ΣX = 100 Σdx = 0 Σdx² = 30 ΣY = 75 Σdy = 0 Σdy² = 68

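$\text{City A: } \bar{X} = \dfrac{\sum X}{n} = \dfrac{100}{5} = 20$

$\sigma = \sqrt{\dfrac{\sum dx^2}{n}} = \sqrt{\dfrac{30}{5}} = 2.45$

$CV(A) = \dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{2.45}{20} \times 100 = 12.25\%$

$\text{City B: } \bar{Y} = \dfrac{\sum Y}{n} = \dfrac{75}{5} = 15$

$\sigma = \sqrt{\dfrac{\sum dy^2}{n}} = \sqrt{\dfrac{68}{5}} = 3.69$

$CV(B) = \dfrac{\sigma}{\bar{Y}} \times 100 = \dfrac{3.69}{15} \times 100 = 24.6\%$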

City A had more stable prices than City B, because the coefficient of variation is less in City
A.
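The comparison in Example 16 can be checked with Python's standard `statistics` module. The function name `coefficient_of_variation` is a hypothetical helper; it uses `pstdev` (divisor n) to match the definition of standard deviation in these notes. Because it does not round σ before dividing, the unrounded results are about 12.247 and 24.585, which agree with the hand calculation to one decimal place.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent: (sigma / mean) * 100."""
    return pstdev(values) / mean(values) * 100


city_a = [20, 22, 19, 23, 16]
city_b = [10, 20, 18, 12, 15]

cv_a = coefficient_of_variation(city_a)
cv_b = coefficient_of_variation(city_b)
print(round(cv_a, 1), round(cv_b, 1))  # 12.2 24.6

# The series with the smaller C.V. is the more stable one
print("City A" if cv_a < cv_b else "City B")  # City A
```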
