A-level maths paper 2 - Statistics
This is a branch of mathematics dealing with the collection, presentation, analysis and interpretation of
data.
Types of data
(a) Discrete data
It is information collected by counting; it usually takes whole-number (integral) values rather than
every value within a given range
(b) Continuous data
It is information that takes values within a given range
Discrete or ungrouped data
Terms Used
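(i) Mean or average of a sample
It is denoted by $\bar{x}$ and defined as $\bar{x} = \frac{\sum x}{n}$; where x is the variable given and n is the number
of values
If an assumed mean (working mean) A is given then
$\bar{x} = A + \frac{\sum d}{n}$; where d = x – A
If the frequency, f, is given then $\bar{x} = \frac{\sum fx}{\sum f}$
or $\bar{x} = A + \frac{\sum fd}{\sum f}$
(ii) Variance or Var(X) $= \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$
(iii) Standard deviation $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$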
(iv) Mode
This is the value of the distribution that appears most often
(v) Median
This is the middle value of the distribution obtained after the values have been arranged
in either ascending or descending order.
Median = $\left(\frac{N}{2}\right)^{th}$ value, where N is the total number of values (the sum of the frequencies).
(vi) Range
It is the difference between the largest value and the smallest value.
(vii) Quartiles
Quartiles are the values that divide the ordered values into four equal parts
q1 is the lower quartile and is defined by
q1 = $\left(\frac{N}{4}\right)^{th}$ value
q3 is the upper quartile and is defined by
q3 = $\left(\frac{3N}{4}\right)^{th}$ value
(viii) Percentiles
Percentiles are the values that divide the ordered values into 100 equal parts.
P10 is the 10th percentile and is defined as
P10 = $\left(\frac{10}{100}N\right)^{th}$ value
P90 is the 90th percentile and is defined as
P90 = $\left(\frac{90}{100}N\right)^{th}$ value
(ix) Deciles
Deciles are the values that divide the ordered values into 10 equal parts.
D5 is the 5th decile and is defined as
D5 = $\left(\frac{5}{10}N\right)^{th}$ value
Note that in (v), (vii), (viii) and (ix) the values must first be arranged in order, N is the total number
of values, and positions are counted from the smallest value.
Example 1
Given the following sets of values
2, 1, 3, 4, 5, 6, 7, 8, 9, 10, 3, 4, 6, 8, 9, 6, 3, 2
(a) Form a frequency table of ungrouped data
x f fx cf x2 fx2
1 1 1 1 1 1
2 2 4 3 4 8
3 3 9 6 9 27
4 2 8 8 16 32
5 1 5 9 25 25
6 4 24 13 36 144
7 2 14 15 49 98
8 2 16 17 64 128
9 2 18 19 81 162
10 1 10 20 100 100
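∑f = 20 ∑fx = 109 ∑fx2 = 725
(b) Use the table to find the following
(i) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{109}{20} = 5.45$
(ii) Standard deviation $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{725}{20} - \left(\frac{109}{20}\right)^2} = \sqrt{6.5475}$ = 2.5588 (4D)
(iii) Mode = 6 (the value that appears most often)
(iv) Find the median value
Median = $\left(\frac{N}{2}\right)^{th}$ value = $\left(\frac{20}{2}\right)^{th}$ = 10th value; from the cumulative frequency, cf,
median = 6
(v) Range = 10 – 1 = 9
(vi) Lower quartile, q1 = $\left(\frac{N}{4}\right)^{th}$ = $\left(\frac{20}{4}\right)^{th}$ = 5th value; from cf, q1 = 3
(vii) Upper quartile, q3 = $\left(\frac{3N}{4}\right)^{th}$ = $\left(\frac{3 \times 20}{4}\right)^{th}$ = 15th value; from cf, q3 = 7
(viii) Ninetieth percentile, P90 = $\left(\frac{90}{100}N\right)^{th}$ value, where N is the total number of values
= $\left(\frac{90}{100} \times 20\right)^{th}$ = 18th value
From cf, P90 = 9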
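The working in Example 1 can be checked with a short Python sketch. The names `freq_summary` and `kth_value` are illustrative assumptions, not part of these notes; `kth_value` simply reads the k-th ordered value off the running cumulative frequency, which is how the cf column is used above.

```python
from math import sqrt

def freq_summary(xs, fs):
    """Return (N, mean, variance, standard deviation) for a frequency table."""
    n = sum(fs)                                        # N = sum of f
    sum_fx = sum(f * x for x, f in zip(xs, fs))        # sum of fx
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))   # sum of fx^2
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2
    return n, mean, variance, sqrt(variance)

def kth_value(xs, fs, k):
    """Return the k-th ordered value, read from the cumulative frequency."""
    cf = 0
    for x, f in zip(xs, fs):
        cf += f
        if cf >= k:
            return x
    raise ValueError("k is larger than the total frequency")

# Frequency table of Example 1
xs = list(range(1, 11))
fs = [1, 2, 3, 2, 1, 4, 2, 2, 2, 1]

n, mean, variance, sd = freq_summary(xs, fs)
median = kth_value(xs, fs, n / 2)
q1 = kth_value(xs, fs, n / 4)
q3 = kth_value(xs, fs, 3 * n / 4)
p90 = kth_value(xs, fs, 90 * n / 100)
print(n, mean, round(sd, 4), median, q1, q3, p90)
```

The sketch follows the positional convention of these notes (N/2, N/4, 3N/4); other textbooks use (N + 1)/2 for the median position, which can give a slightly different value.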
Example 2
The ages of ten students in a class are: 12, 13, 14, 15, 12, 17, 12, 13, 16, 12.
(a) Form a frequency table of ungrouped data
x f cf fx x2 fx2
12 4 4 48 144 576
13 2 6 26 169 338
14 1 7 14 196 196
15 1 8 15 225 225
16 1 9 16 256 256
17 1 10 17 289 289
∑f =10 ∑fx = 136 ∑fx2=1,880
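(b) Use the table to find
a. mean age
mean = $\frac{\sum fx}{\sum f} = \frac{136}{10} = 13.6$
b. variance
Var (X) = $\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$
= $\frac{1880}{10} - 13.6^2$ = 188 – 184.96
= 3.04
c. Standard deviation = $\sqrt{3.04}$ = 1.7436 (4D)
d. Mode = 12 (most frequent figure)
e. Median = $\left(\frac{10}{2}\right)^{th}$ = 5th value = 13
f. Range = 17 – 12 = 5
g. Lower quartile, q1 = $\left(\frac{N}{4}\right)^{th}$ = $\left(\frac{10}{4}\right)^{th}$ = 2.5th value; from cf, q1 = 12
h. Upper quartile, q3 = $\left(\frac{3N}{4}\right)^{th}$ = $\left(\frac{3 \times 10}{4}\right)^{th}$ = 7.5th value; from cf, q3 = 15
i. Ninetieth percentile, P90 = $\left(\frac{90}{100}N\right)^{th}$ value, where N is the total number of values
= $\left(\frac{90}{100} \times 10\right)^{th}$ = 9th value
From cf, P90 = 16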
Example 3
The frequency distribution table shows the marks of some students from a certain school
x 45 63 65 66 70 72 75 80 88
f 3 5 6 4 6 2 1 2 1
Calculate standard deviation
Solution
x f fx fx2
45 3 135 6075
63 5 315 19845
65 6 390 25350
66 4 264 17424
70 6 420 29400
72 2 144 10368
75 1 75 5625
80 2 160 12800
88 1 88 7744
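∑f = 30 ∑fx = 1991 ∑fx2 = 134631
Using the totals from the table to get the variance and standard deviation
Var (X) = $\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{134631}{30} - \left(\frac{1991}{30}\right)^2$
= 4487.7 – 4404.54
= 83.16
s.d = $\sqrt{\mathrm{Var}(X)} = \sqrt{83.16}$
= 9.12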
Revision exercise 1
1. The data below represents the lengths of leaves in cm: 4.5, 4.4, 6.2, 9.2, 8.2, 12.0, 10.0, 8.8, 3.8 and
13.6. Find the:
(a) Mean
(b) Standard deviation
2. The concentration in m per litre of a trace element in 7 randomly chosen samples of water from
spring wells were: 240.8, 237.3, 236.7, 234.2, 236.6, 233.9 and 232.5. Determine the mean and the variance
of the concentration of the trace element per litre.
3. The table below shows the length of flowers from a certain plant to the nearest 0.5cm.
Length (cm) 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0
Number of flowers 4 9 11 8 10 7 2 3
Find the:
(a) Mean
(b) Mode
(c) The median
(d) Standard deviation
4. The marks scored by 11 students in a test are:52, 61, 78, 49, 47, 79, 54, 58, 62, 73, 72
Find;
(a) Median,
(b) Mean,
(c) Interquartile range
(d) Semi-quartile range
5. The frequency distribution table shows the heights of some students at a certain school
Height 154 155 160 164 171 180
Frequency 4 6 8 5 4 3
Determine the variance and standard deviation of the data using a working mean of 160
6. The table below shows the marks obtained by 20 students in a mathematics test marked out of
20
Marks 10 11 12 13 14 15 16 17 18 19 20
Number of students 1 2 2 2 2 4 2 1 2 1 1
Find:
(a) Mean mark
(b) Standard deviation
(c) 60th percentile
(d) Interquartile range
7. Given the following scores
8, 6, 8, 9, 10, 6, 4, 5, 6, 4, 4, 6, 8, 7, 10, 8, 6, 11, 12, 8
(a) Form a frequency distribution table of ungrouped data.
(b) Find the standard deviation
(c) Calculate semi-quartile range
(d) Determine the range of 45th and 90th percentile.
Solutions to revision exercise 1
1. The data below represents the lengths of leaves in cm: 4.5, 4.4, 6.2, 9.2, 8.2, 12.0, 10.0, 8.8, 3.8 and
13.6. Find the:
(a) Mean
(b) Standard deviation
Solution
x x2
4.5 20.25
4.4 19.36
6.2 38.44
9.2 84.64
8.2 67.24
12.0 144
10.0 100
8.8 77.44
3.8 14.44
13.6 184.96
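∑x = 80.7 ∑x2 = 750.77
(a) Mean, $\bar{x} = \frac{\sum x}{n} = \frac{80.7}{10} = 8.07$
(b) s.d $= \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{750.77}{10} - (8.07)^2} = \sqrt{9.9521}$ = 3.155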
2. The concentration in m per litre of a trace element in 7 randomly chosen samples of water from
spring wells were: 240.8, 237.3, 236.7, 234.2, 236.6, 233.9 and 232.5. Determine the mean and the variance
of the concentration of the trace element per litre.
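Solution
x       x2
240.8   57984.64
237.3   56311.29
236.7   56026.89
234.2   54849.64
236.6   55979.56
233.9   54709.21
232.5   54056.25
∑x = 1652 ∑x2 = 389917.5
Mean, $\bar{x} = \frac{\sum x}{n} = \frac{1652}{7} = 236$
Var(x) $= \frac{\sum x^2}{n} - \bar{x}^2 = \frac{389917.5}{7} - 236^2$
= 55,702.5 – 55,696
= 6.5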
3. The table below shows the length of flowers from a certain plant to the nearest 0.5cm.
Length (cm) 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0
Number of flowers 4 9 11 8 10 7 2 3
Find the:
(a) Mean
(b) Mode
(c) The median
(d) Standard deviation
Solution
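x      f    fx     fx2      cf
7.5    4    30     225      4
8.0    9    72     576      13
8.5    11   93.5   794.75   24
9.0    8    72     648      32
9.5    10   95     902.5    42
10.0   7    70     700      49
10.5   2    21     220.5    51
11.0   3    33     363      54
∑f = 54 ∑fx = 486.5 ∑fx2 = 4429.75
(a) Mean = $\frac{\sum fx}{\sum f} = \frac{486.5}{54} = 9.009$
(b) Mode = 8.5
(c) Median = $\left(\frac{54}{2}\right)^{th}$ value = 27th value
From cf the 27th value = 9.0
(d) s.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{4429.75}{54} - \left(\frac{486.5}{54}\right)^2} = \sqrt{0.8657}$ = 0.93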
4. The marks scored by 11 students in a test are:52, 61, 78, 49, 47, 79, 54, 58, 62, 73, 72
Find;
(a) Median
Arrange the values in ascending order
47, 49, 52, 54, 58, 61, 62, 72, 73, 78, 79
Median = 6th (middle) value = 61
(b) Mean
∑x = 52 + 61 + 78 + 49 + 47 + 79 + 54 + 58 + 62 + 73 + 72 = 685
Mean = $\frac{\sum x}{n} = \frac{685}{11}$ = 62.273
(c) Interquartile range
q1 = $\left(\frac{11}{4}\right)^{th}$ value = 2.75th value = 52
q3 = $\left(\frac{3 \times 11}{4}\right)^{th}$ value = 8.25th value = 73
Interquartile range = 73 – 52 = 21
(d) Semi-interquartile range = $\frac{21}{2}$ = 10.5
5. The frequency distribution table shows the heights of some students at a certain school
Height 154 155 160 164 171 180
Frequency 4 6 8 5 4 3
Determine the variance and standard deviation of the data using a working mean of 160
Solution
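x     f    d = x – A   fd    fd2
154   4    -6          -24   144
155   6    -5          -30   150
160   8    0           0     0
164   5    4           20    80
171   4    11          44    484
180   3    20          60    1200
∑f = 30 ∑fd = 70 ∑fd2 = 2058
Var (X) $= \frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2 = \frac{2058}{30} - \left(\frac{70}{30}\right)^2$
= 63.156
s.d = $\sqrt{\mathrm{Var}(X)} = \sqrt{63.156}$ = 7.95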
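The working-mean method can be sketched in Python as follows. The function name `assumed_mean_stats` is an illustrative assumption. The sketch also shows that the variance does not depend on which working mean A is chosen, which is why any convenient value may be used.

```python
from math import sqrt

def assumed_mean_stats(xs, fs, A):
    """Mean, variance and standard deviation using deviations d = x - A."""
    n = sum(fs)
    sum_fd = sum(f * (x - A) for x, f in zip(xs, fs))
    sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(xs, fs))
    mean = A + sum_fd / n
    variance = sum_fd2 / n - (sum_fd / n) ** 2
    return mean, variance, sqrt(variance)

# Heights and frequencies from question 5, working mean A = 160
heights = [154, 155, 160, 164, 171, 180]
freqs = [4, 6, 8, 5, 4, 3]

mean, variance, sd = assumed_mean_stats(heights, freqs, 160)
# A = 0 reduces the method to the direct formula with fx and fx^2
mean0, variance0, sd0 = assumed_mean_stats(heights, freqs, 0)
print(round(mean, 3), round(variance, 3), round(sd, 2))
```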
6. The table below shows the marks obtained by 20 students in a mathematics test marked out of
20
Marks 10 11 12 13 14 15 16 17 18 19 20
Number of students 1 2 2 2 2 4 2 1 2 1 1
Find:
(a) Mean mark
(b) Standard deviation
(c) 60th percentile
(d) Interquartile range
Solution
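x    f   cf   fx   fx2
10   1   1    10   100
11   2   3    22   242
12   2   5    24   288
13   2   7    26   338
14   2   9    28   392
15   4   13   60   900
16   2   15   32   512
17   1   16   17   289
18   2   18   36   648
19   1   19   19   361
20   1   20   20   400
∑f = 20 ∑fx = 294 ∑fx2 = 4470
(a) Mean $= \frac{\sum fx}{\sum f} = \frac{294}{20} = 14.7$
(b) s.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{4470}{20} - \left(\frac{294}{20}\right)^2} = \sqrt{7.41}$
= 2.722
(c) 60th percentile = $\left(\frac{60}{100} \times 20\right)^{th}$ value
= 12th value; from cf = 15
(d) q1 = $\left(\frac{20}{4}\right)^{th}$ value = 5th value; from cf = 12
q3 = $\left(\frac{3 \times 20}{4}\right)^{th}$ value = 15th value; from cf = 16
Interquartile range = 16 – 12 = 4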
7. Given the following scores
8, 6, 8, 9, 10, 6, 4, 5, 6, 4, 4, 6, 8, 7, 10, 8, 6, 11, 12, 8
(a) Form a frequency distribution table of ungrouped data.
(b) Find the standard deviation
(c) Calculate semi-quartile range
(d) Determine the range of 45th and 90th percentile.
Solution
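x    f   cf   fx   fx2
4    3   3    12   48
5    1   4    5    25
6    5   9    30   180
7    1   10   7    49
8    5   15   40   320
9    1   16   9    81
10   2   18   20   200
11   1   19   11   121
12   1   20   12   144
∑f = 20 ∑fx = 146 ∑fx2 = 1168
(b) s.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{1168}{20} - \left(\frac{146}{20}\right)^2} = \sqrt{5.11}$ = 2.26
(c) q1 = $\left(\frac{1}{4} \times 20\right)^{th}$ value
= 5th value; from cf = 6
q3 = $\left(\frac{3}{4} \times 20\right)^{th}$ value
= 15th value; from cf = 8
Semi-interquartile range = $\frac{8 - 6}{2}$ = 1
(d) 45th percentile = $\left(\frac{45}{100} \times 20\right)^{th}$ value = 9th value; from cf = 6
90th percentile = $\left(\frac{90}{100} \times 20\right)^{th}$ value = 18th value; from cf = 10
The range between the 90th percentile and 45th percentile = 10 – 6 = 4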
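Forming the frequency table by hand is the step where slips in the cumulative frequency column are most likely, so the table for the scores in question 7 can be rebuilt with a minimal Python sketch. The function name `frequency_table` is an assumption used for illustration only.

```python
from collections import Counter

def frequency_table(values):
    """Return rows (x, f, cf) for ungrouped data, in ascending order of x."""
    counts = Counter(values)
    rows, cf = [], 0
    for x in sorted(counts):
        cf += counts[x]              # running total of the frequencies
        rows.append((x, counts[x], cf))
    return rows

scores = [8, 6, 8, 9, 10, 6, 4, 5, 6, 4, 4, 6, 8, 7, 10, 8, 6, 11, 12, 8]
table = frequency_table(scores)
for x, f, cf in table:
    print(x, f, cf)
```

The last cumulative frequency must equal the number of scores (20 here), which is a quick check on the whole column.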
Continuous or grouped data.
This is data whose values are continuous and are recorded in class intervals
Example
Class                20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   70 – 79   80 – 89
Number of students   4         5         7         3         6         4         1
Terms used
(a) Classes: these are the intervals into which the distribution is divided. In the table above, the
classes are: (20 – 29), (30 – 39), (40 – 49), (50 – 59), (60 – 69), (70 – 79), (80 – 89).
(b) Class mark (x)
This is the mid-point value of the class. It is normally denoted by x. In the table above, the
class marks are 24.5, 34.5, 44.5, ...
(c) Frequency (f) = number of items in a class
(d) Class boundary
These are continuous class limits. In the above table the first class boundary is (20 – 0.5) –
(29 + 0.5). In this case, the lower class boundary is 19.5 and the upper class boundary is 29.5
For class interval 2.0 – 2.9, the class boundary is (2.0 – 0.05) – (2.9 + 0.05) =
1.95 – 2.95.
(e) Class width or class interval
This is the width of each class, measured between its class boundaries.
It is given by;
Class width = upper class boundary – lower class boundary
In the table above, class width = 29.5 – 19.5 = 10
(f) Mean, $\bar{x} = \frac{\sum fx}{\sum f}$, where x is the class mark
(g) Variance, Var(X) $= \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$
(h) Standard deviation $= \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$
(i) Median of grouped data
Median of grouped data is defined by
Median = $L_b + \left(\frac{\frac{N}{2} - CF_b}{f}\right)C$
Where N = ∑f = total frequency
Lb = lower class boundary of the median class
C = class width of the median class
f = frequency of the median class
CFb = cumulative frequency before that of the median class
(j) Mode of grouped data
Mode of grouped data with equal class width is defined as
Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$
Where
Lb = lower class boundary of modal class
C = Class width of the modal class
Δ1 = modal frequency – pre-modal frequency
Δ2 = modal frequency – post-modal frequency
(k) Histogram
This is a graph consisting of vertical bars. It is a graph of frequency against class boundary.
The area of each bar is proportional to the frequency. A histogram is used to obtain the mode
(l) Percentile
Percentiles are values that divide a given distribution into 100 equal parts
The 60th percentile for instance is defined as
P60 = $L_b + \left(\frac{\frac{60}{100}N - CF_b}{f}\right)C$
Where
Lb = lower class boundary of the class containing the 60th percentile
C = class width
f = frequency of that class
CFb = cumulative frequency before that class
(m) Quartiles
Quartiles are values that divide a given distribution into 4 equal parts
The lower quartile, denoted q1, is defined as
q1 = $L_b + \left(\frac{\frac{N}{4} - CF_b}{f}\right)C$
Where
Lb = lower class boundary of the q1 class
C = class width
f = frequency of the q1 class
CFb = cumulative frequency before that of the q1 class
The upper quartile, denoted q3, is defined as
q3 = $L_b + \left(\frac{\frac{3N}{4} - CF_b}{f}\right)C$
Where
Lb = lower class boundary of the q3 class
C = class width
f = frequency of the q3 class
CFb = cumulative frequency before that of the q3 class
Interquartile range = q3 – q1
Semi-interquartile range = $\frac{q_3 - q_1}{2}$
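The interpolation formulas for the median, quartiles, percentiles and mode of grouped data can be sketched in Python as below. The function names `grouped_quantile` and `grouped_mode` are illustrative assumptions; the data is the class table at the start of this section (classes 20 – 29 to 80 – 89), and `grouped_mode` assumes equal class widths.

```python
def grouped_quantile(boundaries, freqs, fraction):
    """Lb + ((fraction*N - CFb) / f) * C for the class containing the target."""
    target = fraction * sum(freqs)      # N/2 for the median, N/4 for q1, ...
    cf_before = 0                       # cumulative frequency before the class
    for (lb, ub), f in zip(boundaries, freqs):
        if f > 0 and cf_before + f >= target:
            return lb + (target - cf_before) / f * (ub - lb)
        cf_before += f
    raise ValueError("fraction must be between 0 and 1")

def grouped_mode(boundaries, freqs):
    """Lb + (d1 / (d1 + d2)) * C for the modal class (equal class widths)."""
    i = freqs.index(max(freqs))
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1, d2 = freqs[i] - before, freqs[i] - after
    lb, ub = boundaries[i]
    return lb + d1 / (d1 + d2) * (ub - lb)

# Class boundaries 19.5-29.5, 29.5-39.5, ..., 79.5-89.5 and their frequencies
boundaries = [(19.5 + 10 * i, 29.5 + 10 * i) for i in range(7)]
freqs = [4, 5, 7, 3, 6, 4, 1]

median = grouped_quantile(boundaries, freqs, 1 / 2)
q1 = grouped_quantile(boundaries, freqs, 1 / 4)
mode = grouped_mode(boundaries, freqs)
print(round(median, 2), round(q1, 2), round(mode, 2))
```

Passing `fraction = 60 / 100` gives the 60th percentile and `3 / 4` gives q3, so one function covers the median, quartile and percentile formulas.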
Example 4
The table below shows the weight of 250 students at The Science Foundation College
Weight (kg)   44.0 – 47.9   48.0 – 51.9   52.0 – 55.9   56.0 – 59.9   60.0 – 63.9   64.0 – 67.9   68.0 – 71.9   72.0 – 75.9
Frequency     3             17            50            45            46            57            23            9
Find
(a) Average weight
(b) Standard deviation
(c) Median weight
(d) Modal weight
(e) Draw an Ogive and use it to find
(i) Upper quartile
(ii) Median
(iii) 90th percentile
(f) Construct a histogram and use it to determine the mode
Solution
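Class         Class boundary   x       f    fx        fx2         cf
44.0 – 47.9   43.95 – 47.95    45.95   3    137.85    6334.21     3
48.0 – 51.9   47.95 – 51.95    49.95   17   849.15    42415.04    20
52.0 – 55.9   51.95 – 55.95    53.95   50   2697.50   145530.13   70
56.0 – 59.9   55.95 – 59.95    57.95   45   2607.75   151119.11   115
60.0 – 63.9   59.95 – 63.95    61.95   46   2849.70   176538.92   161
64.0 – 67.9   63.95 – 67.95    65.95   57   3759.15   247915.94   218
68.0 – 71.9   67.95 – 71.95    69.95   23   1608.85   112539.06   241
72.0 – 75.9   71.95 – 75.95    73.95   9    665.55    49217.42    250
∑f = 250 ∑fx = 15175.5 ∑fx2 = 931609.83
(a) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{15175.5}{250}$ = 60.702kg
(b) Standard deviation $= \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{931609.83}{250} - (60.702)^2} = \sqrt{41.71}$ = 6.46kg
(c) Median = $L_b + \left(\frac{\frac{N}{2} - CF_b}{f}\right)C$
$\frac{N}{2} = \frac{250}{2} = 125$
Median class boundary is 59.95 – 63.95, f = 46, CFb = 115 and C = 4
Median = 59.95 + $\left(\frac{125 - 115}{46}\right) \times 4$ = 60.82kg
(d) Modal class boundary is 63.95 – 67.95, since 57 is the highest frequency and C = 4
Δ1 = 57 – 46 = 11 and Δ2 = 57 – 23 = 34
Mode = 63.95 + $\left(\frac{11}{11 + 34}\right) \times 4$ = 64.93kg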
(e) Draw an Ogive and use it to find
(i) Upper quartile = $\left(\frac{3}{4} \times 250\right)^{th}$ = 187.5th value; from the graph, q3 = 65.55
(ii) Median = $\left(\frac{1}{2} \times 250\right)^{th}$ = 125th value; from the graph, median ≈ 60.8 (compare 60.82kg calculated in (c))
(iii) 90th percentile = $\left(\frac{90}{100} \times 250\right)^{th}$ = 225th value; from the graph, P90 = 69.15
(f) Construct a histogram and use it to determine the mode
From the graph, the mode = 63.95 + 0.8 = 64.75
Example 5
The table below shows the number of students and the mark scored in a test.
MARKS NUMBER OF STUDENTS
0–4 10
5–9 7
10 – 14 5
15 – 19 3
20 – 24 7
25 – 29 11
30 – 34 37
35 – 39 20
(a) (i) Draw a cumulative frequency curve (Ogive) for data
MARKS     Class boundaries   NUMBER OF STUDENTS (f)   Cf    x    fx     fx2
0 – 4     0 – 4.5            10                       10    2    20     40
5 – 9     4.5 – 9.5          7                        17    7    49     343
10 – 14   9.5 – 14.5         5                        22    12   60     720
15 – 19   14.5 – 19.5        3                        25    17   51     867
20 – 24   19.5 – 24.5        7                        32    22   154    3388
25 – 29   24.5 – 29.5        11                       43    27   297    8019
30 – 34   29.5 – 34.5        37                       80    32   1184   37888
35 – 39   34.5 – 39.5        20                       100   37   740    27380
∑f = 100 ∑fx = 2555 ∑fx2 = 78645
(ii) Use the Ogive to estimate the median mark (06marks)
Note that Cf is plotted against the upper class boundary of each class
Median = $\left(\frac{N}{2}\right)^{th}$ = $\left(\frac{100}{2}\right)^{th}$ = 50th value = 30.5
(b) Calculate the
(i) Mean mark
Mean = $\frac{\sum fx}{\sum f} = \frac{2555}{100}$ = 25.55
(ii) Standard deviation (09 marks)
S.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{78645}{100} - (25.55)^2} = \sqrt{133.65}$ = 11.56
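As a check on Example 5, the mean and standard deviation of grouped data can be computed directly from the class limits, taking the class mark as the mid-point of each class. This is a minimal sketch; the function name `grouped_mean_sd` is an assumption made for illustration.

```python
from math import sqrt

def grouped_mean_sd(limits, freqs):
    """Mean and standard deviation using class marks x = (lower + upper) / 2."""
    marks = [(lo + hi) / 2 for lo, hi in limits]
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(marks, freqs)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(marks, freqs)) / n
    return mean, sqrt(mean_of_squares - mean ** 2)

# Class limits and frequencies of Example 5
limits = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
freqs = [10, 7, 5, 3, 7, 11, 37, 20]

mean, sd = grouped_mean_sd(limits, freqs)
print(round(mean, 2), round(sd, 2))
```

Because every value in a class is replaced by its class mark, the results are estimates of the mean and standard deviation of the original marks.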
Example 6
The frequency distribution table below shows the marks scored by 50 students in a test
Marks Number of Students
50 – 52 3
53 – 55 16
56 – 58 14
59 – 61 13
62 – 64 2
65 – 67 2
(a) Calculate the:
Solution
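Marks     Class boundaries   Number of Students (f)   fx    fx2     CF
50 – 52   49.5 – 52.5        3                        153   7803    3
53 – 55   52.5 – 55.5        16                       864   46656   19
56 – 58   55.5 – 58.5        14                       798   45486   33
59 – 61   58.5 – 61.5        13                       780   46800   46
62 – 64   61.5 – 64.5        2                        126   7938    48
65 – 67   64.5 – 67.5        2                        132   8712    50
∑f = 50 ∑fx = 2853 ∑fx2 = 163395
Here x is the class mark of each class: 51, 54, 57, 60, 63 and 66.
(i) Mean mark (04 marks)
Solution
Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{2853}{50}$ = 57.06
(ii) Standard deviation. (05 marks)
s.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{163395}{50} - (57.06)^2}$
$= \sqrt{3267.9 - 3255.84} = \sqrt{12.06}$
= 3.47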
(b) (i) Plot a cumulative frequency curve (Ogive) for the given data. (04 marks)
Note that CF is plotted against the upper limit of each class
(ii) Use the Ogive to estimate the median mark. (02 marks)
56.5
Example 7
The table below shows the age in years of mothers at the time they had their first child.
Age in years        15–<20   20–<25   25–<30   30–<35   35–<40   40–<45
Number of mothers   2        14       29       43       33       9
Calculate the modal age of the mothers. (05 marks)
Using the formula
Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$
Modal class (30 – 35)
∆1 = 43 – 29 = 14
∆2 = 43 – 33 = 10
Lb = 30, C = 5
Mode = 30 + $\left(\frac{14}{14 + 10}\right) \times 5$ = 32.92 (2D)
Example 8
The table below shows a frequency distribution of marks scored by s5 students in a test.
Marks                10–<20   20–<30   30–<40   40–<50   50–<60   60–<70   70–<80   80–≤90
Number of students   2        6        12       15       10       6        3        1
(a) Draw a histogram for the data and use it to estimate the modal mark. (05marks)
From the graph modal mark is 44
(b) Calculate the
Marks x f fx fx2
10 - 20 15 2 30 450
20 - 30 25 6 150 3750
30 - 40 35 12 420 14700
40 - 50 45 15 675 30375
50 - 60 55 10 550 30250
60 - 70 65 6 390 25350
70 - 80 75 3 225 16875
80 - 90 85 1 85 7225
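∑f = 55 ∑fx = 2525 ∑fx2 = 128975
(i) mean mark
Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{2525}{55}$ = 45.91
(ii) standard deviation (10marks)
S.D $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{128975}{55} - \left(\frac{2525}{55}\right)^2} = \sqrt{237.36}$ = 15.4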
Example 9
The table below shows the weights in kg of 50 cattle on a farm
60 81 76 68 84 112 76 102 86 67
65 98 107 110 72 99 87 92 76 77
94 102 87 86 73 118 98 120 62 87
65 92 104 116 91 93 78 122 102 92
80 111 73 120 106 123 94 109 80 96
(a) Form a grouped frequency table for the data with classes of equal intervals, starting with
the class 60 – 69. (06 marks)
Classes Class boundaries Frequency, f Cumulative frequency, CF
60 – 69 59.5 – 69.5 6 6
70 – 79 69.5 – 79.5 8 14
80 – 89 79.5 – 89.5 9 23
90 – 99 89.5 – 99.5 11 34
100 – 109 99.5 – 109.5 7 41
110 – 119 109.5 – 119.5 5 46
120 – 129 119.5 – 129.5 4 50
(b) Draw a cumulative frequency curve (Ogive) for the given data. (04 marks)
Note that CF is plotted against the upper limit value of the class
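(c) Use your Ogive to estimate the;
(i) lower and upper quartile
Lower quartile, q1 = $\left(\frac{1}{4} \times 50\right)^{th}$ = 12.5th value, read from the Ogive
Upper quartile, q3 = $\left(\frac{3}{4} \times 50\right)^{th}$ = 37.5th value, read from the Ogive
(ii) median weight
Median = $\left(\frac{1}{2} \times 50\right)^{th}$ = 25th value, read from the Ogive
(iii) number of cattle which weigh 118kg and above. (05 marks)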
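Readings from the Ogive in Example 9 can be approximated by joining the plotted points with straight lines. The sketch below makes that assumption, so values read from a smooth hand-drawn curve may differ slightly; the function names `value_at` and `cf_at` are illustrative.

```python
def value_at(points, k):
    """Estimate the k-th value from straight lines joining the Ogive points."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= k <= c1:
            return x0 + (k - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("k lies outside the cumulative frequencies")

def cf_at(points, x):
    """Estimate the cumulative frequency below the value x."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return c0 + (x - x0) / (x1 - x0) * (c1 - c0)
    raise ValueError("x lies outside the class boundaries")

# (upper class boundary, cumulative frequency), starting from (59.5, 0)
points = [(59.5, 0), (69.5, 6), (79.5, 14), (89.5, 23),
          (99.5, 34), (109.5, 41), (119.5, 46), (129.5, 50)]
n = points[-1][1]

q1 = value_at(points, n / 4)
median = value_at(points, n / 2)
q3 = value_at(points, 3 * n / 4)
# "118 kg and above" starts at the boundary 117.5 for weights to the nearest kg
heavy = n - cf_at(points, 117.5)
print(round(q1, 1), round(median, 1), round(q3, 1), round(heavy))
```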
Example 10
Given the data below
Marks (x) 10-19 20-24 25-34 35-39 40-54 55-64 65-79
Frequency (f) 4 6 7 3 8 6 6
Find the mode
Solution
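Class boundary   Class width   f   Frequency density
9.5 – 19.5       10            4   0.4
19.5 – 24.5      5             6   1.2
24.5 – 34.5      10            7   0.7
34.5 – 39.5      5             3   0.6
39.5 – 54.5      15            8   0.53
54.5 – 64.5      10            6   0.6
64.5 – 79.5      15            6   0.4
∑f = 40
The modal class is 19.5 – 24.5, since it has the highest frequency density (1.2).
Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$, where Δ1 and Δ2 are the differences in frequency density
= 19.5 + $\left(\frac{1.2 - 0.4}{(1.2 - 0.4) + (1.2 - 0.7)}\right) \times 5$
= 22.58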
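When class widths are unequal, the modal class is picked by frequency density rather than by frequency. A minimal Python sketch of this, using the data of Example 10, is given below; the function name `mode_from_density` is an assumption made for illustration.

```python
def mode_from_density(boundaries, freqs):
    """Mode = Lb + (d1 / (d1 + d2)) * C, with d1 and d2 taken from frequency densities."""
    density = [f / (ub - lb) for (lb, ub), f in zip(boundaries, freqs)]
    i = density.index(max(density))          # modal class = highest density
    before = density[i - 1] if i > 0 else 0.0
    after = density[i + 1] if i + 1 < len(density) else 0.0
    d1, d2 = density[i] - before, density[i] - after
    lb, ub = boundaries[i]
    return lb + d1 / (d1 + d2) * (ub - lb)

# Class boundaries and frequencies of Example 10
boundaries = [(9.5, 19.5), (19.5, 24.5), (24.5, 34.5), (34.5, 39.5),
              (39.5, 54.5), (54.5, 64.5), (64.5, 79.5)]
freqs = [4, 6, 7, 3, 8, 6, 6]

mode = mode_from_density(boundaries, freqs)
print(round(mode, 2))
```

Note that the class 39.5 – 54.5 has the highest frequency (8) but not the highest frequency density, so choosing the modal class by frequency alone would give the wrong class.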
Example 11
The table shows the weights (kg) of 150 patients who visited a certain health centre.
Weight (kg) 0–9 10 – 19 20 – 29 30 – 39 40 – 49 50 – 59 60 - 69
Frequency (f) 30 16 24 32 28 12 8
Calculate
(a) Mean
(b) Mode
(c) Median
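Class     Class boundary   Class mark (x)   f    fx     cf
0 – 9     0 – 9.5          4.5              30   135    30
10 – 19   9.5 – 19.5       14.5             16   232    46
20 – 29   19.5 – 29.5      24.5             24   588    70
30 – 39   29.5 – 39.5      34.5             32   1104   102
40 – 49   39.5 – 49.5      44.5             28   1246   130
50 – 59   49.5 – 59.5      54.5             12   654    142
60 – 69   59.5 – 69.5      64.5             8    516    150
∑f = 150 ∑fx = 4475
(a) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{4475}{150}$
= 29.83kg
(b) Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$
Modal class boundary is 29.5 – 39.5, since 32 is the highest frequency and C = 10
Δ1 = 32 – 24 = 8 and Δ2 = 32 – 28 = 4
Mode = 29.5 + $\left(\frac{8}{8 + 4}\right) \times 10$ = 36.17kg
(c) Median = $L_b + \left(\frac{\frac{N}{2} - CF_b}{f}\right)C$
$\frac{N}{2} = \frac{150}{2} = 75$
Median class boundary is 29.5 – 39.5, f = 32, CFb = 70 and C = 10
Median = 29.5 + $\left(\frac{75 - 70}{32}\right) \times 10$ = 31.06kg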
Example 12
The table below shows the number of crimes committed by students
Number of crimes     5-<10   10-<20   20-<30   30-<50   50-<100
Number of students   10      15       25       40       25
Calculate the variance and standard deviation for the number of crimes committed
Solution
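Number of crimes   x     f    fx     fx2
5-<10              7.5   10   75     562.5
10-<20             15    15   225    3375
20-<30             25    25   625    15625
30-<50             40    40   1600   64000
50-<100            75    25   1875   140625
∑f = 115 ∑fx = 4400 ∑fx2 = 224187.5
Var(x) $= \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{224187.5}{115} - \left(\frac{4400}{115}\right)^2$ = 1949.46 – 1463.89 = 485.56
s.d = $\sqrt{\mathrm{Var}(x)} = \sqrt{485.56}$ = 22.04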
Example 13
The table below shows the weight of 250 students at a certain day school
Weight (kg)   44.0 – 47.9   48.0 – 51.9   52.0 – 55.9   56.0 – 59.9   60.0 – 63.9   64.0 – 67.9   68.0 – 71.9   72.0 – 75.9
Frequency     3             17            50            45            46            57            23            9
Using assumed mean of 57.95, find
(a) average weight
(b) variance
(c) standard deviation
Solution
weight x f d = x - A fd fd2
43.95-47.95 45.95 3 -12 -36 432
47.95-51.95 49.95 17 -8 -136 1088
51.95-55.95 53.95 50 -4 -200 800
55.95-59.95 57.95 45 0 0 0
59.95-63.95 61.95 46 4 184 736
63.95-67.95 65.95 57 8 456 3648
67.95-71.95 69.95 23 12 276 3312
71.95-75.95 73.95 9 16 144 2304
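∑f = 250 ∑fd = 688 ∑fd2 = 12320
(a) $\bar{x} = A + \frac{\sum fd}{\sum f} = 57.95 + \frac{688}{250}$ = 60.702kg
(b) Var(x) $= \frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2$
$= \frac{12320}{250} - \left(\frac{688}{250}\right)^2$ = 49.28 – 7.57 = 41.71
(c) S.d = $\sqrt{\mathrm{Var}(x)} = \sqrt{41.71}$ = 6.46kg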
Example 14
The following table shows the marks obtained by 40 students in a physics test marked out of 100
Marks (%)            20-29   30-39   40-49   50-59   60-69   70-79   80-89   90-100
Number of students   4       6       2       5       7       8       5       3
Find
(a) Mean
(b) Standard deviation
(c) Median and mode
(d) Semi-interquartile range
(e) 40th and 85th percentile range
Solution
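Class boundary   x      f   fx      fx2        cf
19.5-29.5        24.5   4   98      2401       4
29.5-39.5        34.5   6   207     7141.5     10
39.5-49.5        44.5   2   89      3960.5     12
49.5-59.5        54.5   5   272.5   14851.25   17
59.5-69.5        64.5   7   451.5   29121.75   24
69.5-79.5        74.5   8   596     44402      32
79.5-89.5        84.5   5   422.5   35701.25   37
89.5-99.5        94.5   3   283.5   26790.75   40
∑f = 40 ∑fx = 2420 ∑fx2 = 164370
(a) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{2420}{40}$ = 60.5%
(b) S.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{164370}{40} - (60.5)^2} = \sqrt{449}$ = 21.19%
(c) Median = $L_b + \left(\frac{\frac{N}{2} - CF_b}{f}\right)C$
$\frac{N}{2} = \frac{40}{2} = 20$
Median class boundary is 59.5-69.5, f = 7, CFb = 17 and C = 10
Median = 59.5 + $\left(\frac{20 - 17}{7}\right) \times 10$ = 63.786%
Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$
Modal class boundary is 69.5-79.5, since 8 is the highest frequency and C = 10
Δ1 = 8 – 7 = 1 and Δ2 = 8 – 5 = 3
Mode = 69.5 + $\left(\frac{1}{1 + 3}\right) \times 10$ = 72%
(d) q1 = $L_b + \left(\frac{\frac{N}{4} - CF_b}{f}\right)C$
$\frac{N}{4} = 10$, Lb = 29.5, f = 6, CFb = 4, C = 10
q1 = 29.5 + $\left(\frac{10 - 4}{6}\right) \times 10$ = 39.5%
q3 = $L_b + \left(\frac{\frac{3N}{4} - CF_b}{f}\right)C$
$\frac{3N}{4} = 30$, Lb = 69.5, f = 8, CFb = 24, C = 10
q3 = 69.5 + $\left(\frac{30 - 24}{8}\right) \times 10$ = 77%
Semi-interquartile range = $\frac{77 - 39.5}{2}$ = 18.75%
(e) P40 = $L_b + \left(\frac{\frac{40}{100}N - CF_b}{f}\right)C$
$\frac{40}{100} \times 40 = 16$, Lb = 49.5, f = 5, CFb = 12, C = 10
P40 = 49.5 + $\left(\frac{16 - 12}{5}\right) \times 10$ = 57.5%
P85 = $L_b + \left(\frac{\frac{85}{100}N - CF_b}{f}\right)C$
$\frac{85}{100} \times 40 = 34$, Lb = 79.5, f = 5, CFb = 32, C = 10
P85 = 79.5 + $\left(\frac{34 - 32}{5}\right) \times 10$ = 83.5%
Range between the 40th and 85th percentiles = 83.5 – 57.5 = 26%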
Example 15
Given the information in the table
Class 20-29 30-34 35-44 45-64 65-74 75-84
Frequency 5 5 12 20 10 8
Find
(a) Mean value
(b) Standard deviation
(c) Mode
(d) Median
(e) Interquartile range
(f) 90th percentile
Solution
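Class boundary   Class width   x      f    f.d   fx      fx2       cf
19.5-29.5        10            24.5   5    0.5   122.5   3001.25   5
29.5-34.5        5             32     5    1     160     5120      10
34.5-44.5        10            39.5   12   1.2   474     18723     22
44.5-64.5        20            54.5   20   1     1090    59405     42
64.5-74.5        10            69.5   10   1     695     48302.5   52
74.5-84.5        10            79.5   8    0.8   636     50562     60
∑f = 60 ∑fx = 3177.5 ∑fx2 = 185113.75
Here f.d is the frequency density (frequency divided by class width).
(a) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{3177.5}{60}$ = 52.96
(b) S.d $= \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2} = \sqrt{\frac{185113.75}{60} - \left(\frac{3177.5}{60}\right)^2} = \sqrt{280.64}$ = 16.75
(c) Mode = $L_b + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right)C$, using the frequency densities of the modal class 34.5-44.5
= 34.5 + $\left(\frac{1.2 - 1}{(1.2 - 1) + (1.2 - 1)}\right) \times 10$ = 39.5
(d) Median = $L_b + \left(\frac{\frac{N}{2} - CF_b}{f}\right)C$
Median = 44.5 + $\left(\frac{30 - 22}{20}\right) \times 20$ = 52.5
(e) q1 = $L_b + \left(\frac{\frac{N}{4} - CF_b}{f}\right)C$
= 34.5 + $\left(\frac{15 - 10}{12}\right) \times 10$ = 38.67
q3 = $L_b + \left(\frac{\frac{3N}{4} - CF_b}{f}\right)C$
= 64.5 + $\left(\frac{45 - 42}{10}\right) \times 10$ = 67.5
Interquartile range = 67.5 – 38.67 = 28.83
(f) P90 = $L_b + \left(\frac{\frac{90}{100}N - CF_b}{f}\right)C$
= 74.5 + $\left(\frac{54 - 52}{8}\right) \times 10$ = 77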
Example 17
Given the data below
Marks 5-14 15-24 25-34 35-44 45-54 55-64 65-74 75-84 85-94
Frequency 3 7 12 20 30 15 8 3 2
Draw a histogram and use it to determine the mode
Example 18
Given the data below
Marks 20-29 30-39 40-49 50-59 60-69 70-79
Frequency 4 6 12 8 7 3
Draw an Ogive and use it to determine
(a) Median
(b) Interquartile range
(c) 10th percentile
Solution
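Class boundary   19.5 – 29.5   29.5 – 39.5   39.5 – 49.5   49.5 – 59.5   59.5 – 69.5   69.5 – 79.5
cf               4             10            22            30            37            40
(a) The median = $\left(\frac{N}{2}\right)^{th}$ = $\left(\frac{40}{2}\right)^{th}$ = 20th value; from the graph, median = 48.5
(b) q1 = $\left(\frac{N}{4}\right)^{th}$ = $\left(\frac{40}{4}\right)^{th}$ = 10th value; from the graph q1 = 39.5
q3 = $\left(\frac{3N}{4}\right)^{th}$ = $\left(\frac{3 \times 40}{4}\right)^{th}$ = 30th value; from the graph q3 = 59.5
Interquartile range = 59.5 – 39.5 = 20
(c) P10 = $\left(\frac{10}{100}N\right)^{th}$ = $\left(\frac{10}{100} \times 40\right)^{th}$ = 4th value; from the graph P10 = 29.5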
Grouped data with unequal class width
(i) Histogram
This is a graph of frequency density against class boundary
Note that frequency density = $\frac{\text{frequency}}{\text{class width}}$
(ii) Ogive
This is a graph of cumulative frequency against the upper class boundary
Example 19
The data shows the length in centimetres for different calendars produced by a printing press. A
cumulative frequency distribution was formed
Length (cm) <20 <30 <35 <40 <50 <60
Cumulative frequency 4 20 32 42 48 50
(a) Construct a frequency table.
(b) Find the mean length of the calendars
(c) Draw a histogram and use it to estimate the modal length
(d) Draw an Ogive and use it to estimate the median length.
Solution
(a) Frequency table
Class boundary   x      f    fx    Class width   Frequency density   cf
0 - 20           10     4    40    20            0.2                 4
20 - 30          25     16   400   10            1.6                 20
30 - 35          32.5   12   390   5             2.4                 32
35 - 40          37.5   10   375   5             2                   42
40 - 50          45     6    270   10            0.6                 48
50 - 60          55     2    110   10            0.2                 50
∑f = 50 ∑fx = 1585
(b) Mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{1585}{50}$ = 31.7cm
(d) Median length is the $\left(\frac{N}{2}\right)^{th}$ = $\left(\frac{50}{2}\right)^{th}$ = 25th value; from the graph, median = 32
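The step from a cumulative frequency table to class frequencies and frequency densities in Example 19 can be sketched in Python as follows. The function name `from_cumulative` is illustrative, and the lowest class boundary of 0 is an assumption that matches the first class (0 - 20) used in the solution table.

```python
def from_cumulative(upper_bounds, cum_freqs, lowest=0.0):
    """Recover class frequencies, class marks and frequency densities."""
    rows = []
    lower, previous_cf = lowest, 0
    for upper, cf in zip(upper_bounds, cum_freqs):
        f = cf - previous_cf                 # class frequency
        width = upper - lower                # class width
        rows.append({"lower": lower, "upper": upper, "f": f,
                     "x": (lower + upper) / 2, "density": f / width})
        lower, previous_cf = upper, cf
    return rows

# "Less than" boundaries and cumulative frequencies of Example 19
rows = from_cumulative([20, 30, 35, 40, 50, 60], [4, 20, 32, 42, 48, 50])

freqs = [r["f"] for r in rows]
densities = [r["density"] for r in rows]
mean = sum(r["f"] * r["x"] for r in rows) / sum(freqs)
print(freqs, densities, mean)
```

The class with the highest frequency density (30 - 35 here) is the modal class, even though 20 - 30 has the highest frequency.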
Revision Exercise 2 (answers are given in brackets besides the questions)
1. The table below shows the cumulative distribution of ages (in years) of 400 students
Age (years)            <12   <13   <14   <15   <16   <17   <18   <19
Cumulative frequency   0     27    85    215   320   370   395   400
(a) Construct a cumulative frequency curve
(b) Use the curve to estimate
(i) Median age (Answer 14.9)
(ii) 20th and 80th percentile range (Ans. 2.1)
2. The table below shows the time taken by students to solve a mathematics problem
Time (mins) 5-9 10-14 15-19 20-24 25-29 30-34
Frequency 5 14 30 17 11 3
(a) Draw a histogram and use it to estimate the modal time. (ans. 17.3)
(b) Find the mean and standard deviation of solving the problem (Mean = 18.5mins, s.d =
5.9896 (4D))
3. The frequency distribution table shows the heights of s.6 students measured to the nearest cm;
Height 149-152 153-156 157-160 161-164 165-168 169-172 173-176
Frequency 5 17 20 25 15 6 2
(a) Calculate
(i) Mean height (Ans. 160.9cm)
(ii) Standard deviation (Ans. 5.5873)
(b) Draw a cumulative frequency curve and use it to estimate the median (Ans. 161cm) and
range of height of the middle 60% of the candidates. (Ans. 10cm)
4. The table below shows the weights of some S.5 students from a certain school
Weight               50-53   54-57   58-61   62-65   66-69   70-73   74-77   78-81
Number of students   3       8       12      18      11      5       2       1
(a) Calculate
(i) Mean (63.1kg)
(ii) Standard deviation of students’ weight (6kg)
(b) Draw a cumulative frequency curve and use it to estimate
(i) Median weight (63.1kg)
(ii) Number of students with weight between 58.9kg and 66.7kg (29 students)
5. The table below is the distribution of weights of a group of animals
Mass (kg) Frequency
21-25 10
26-30 20
31-35 15
36-40 10
41-50 30
51-60 45
66-74 5
(a) Draw a cumulative frequency curve to estimate semi-quartile range (24kg)
(b) Find
(i) Mode (28.8333kg)
(ii) Standard deviation (11.772)
6. The table below shows the amount of money (in thousands of shillings) that was paid out as
allowances to participants during a certain workshop
Amount (shs’000s)        110-114   115-119   120-129   130-134   135-144   145-159
Number of participants   13        20        32        17        16        12
(a) Draw a histogram and use it to estimate the modal allowance (shs. 118,000)
(b) Calculate the:
(i) Median allowance (shs. 126,375/=)
(ii) Mean allowance (shs. 128,000/=)
7. The table below shows the income of 40 factory workers in millions of shillings per annum
1.0 1.1 1.0 1.2 5.4 1.6 2.0 2.5
2.1 2.2 1.3 1.7 1.8 2.4 3.0 2.2
2.7 3.5 4.0 4.4 3.9 5.0 5.4 5.3
4.4 3.7 3.6 3.9 5.2 5.1 5.7 1.5
1.6 1.9 3.4 4.3 2.6 3.8 5.3 4.0
(a) Form a frequency distribution table with class intervals of 0.5 million shillings, starting with the
lowest limit of 1 million shillings
(b) Calculate the
(i) Mean income (shs. 3,175,000)
(ii) Standard deviation (shs. 1,413,992.574)
(c) Draw a histogram to represent the above data. Use it to estimate the modal income
Modal income: (shs. 5,200,000)
8. The table below shows the marks obtained by students in a physics test
Marks (%) 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 65-69 70-74
Frequency 9 12 10 17 13 25 18 14 8 8
(a) Draw a histogram and use it to estimate the modal mark. (52.5)
(b) Find the
(i) Mean mark (49.4627)
(ii) Standard deviation (12.424)
9. The table below shows the marks obtained in an examination by 200 candidates
Marks(%) 10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89
Number of 18 34 58 42 24 10 6 8
candidates
(a) Calculate the
(i) Mean mark (40.2%)
(ii) Modal mark (35.5%)
(b) Draw a cumulative frequency curve for the data. Hence estimate the lowest mark for a
distinction one if the top 5% of the candidates qualify for the distinction. (75%)
10. A class performed on an experiment to estimate the diameter of a circular object. A sample of
five students had the following results in centimetre. 3.13, 3.16, 2.94, 3.33 and 3.0.
Determine the sample;
(i) Mean (3.11)
(ii) Standard deviation (0.1356)(05marks)
11. The times taken for 55 students to have their lunch to the nearest minute are given in the table
below
Time (minutes) 3 -4 5-9 10-19 20 – 29 30 – 44
Number of students 2 7 16 21 9
(a) Calculate the mean time for the student to have lunch. (mean=20.65) (04marks)
(b) (i) Draw a histogram for the given data
(ii) Use your histogram to estimate the modal time for the students to have lunch.
(08marks) (modal time = 22 minutes)
12. The frequency distribution below shows the age of 240 students admitted to a certain
University.
Age (years) Number of student
18 - < 19 24
19 -< 20 70
20 -< 24 76
24 -< 26 48
26 -< 30 16
30 -< 32 6
(a) Calculate the mean age of the students. (mean = )(04mark)
(b) (i) Draw a histogram for the given data
(ii) Use the histogram to estimate the modal age (modal age = 19.58) (08mark)
13. The table shows the masses of bolts bought by a carpenter.
Mass (grams) 98 99 100 101 102 103 104
Number of bolts 8 11 14 20 17 6 4
Calculate the:
(a)median mass (101g)
(b) mean mass of the bolts (100.7625g) (05 marks)
14. The table below shows the marks obtained in a mathematics test by a group of students
marks 5 -<15 15-<25 25-<35 35-<45 45-<55 55-<65 65-<75 75-<100
Number of 5 7 19 17 7 4 2 3
students
(a) Construct a cumulative frequency curve (Ogive) for the data (05 marks)
(b) Use your Ogive to find the
(i) Range between the 10th and 70th percentiles (26)
(ii) Probability that a student selected at random scored below 50 marks. (0.8125)
(07 marks)
15. The table below shows the marks obtained by 100 students in a mathematics test
Marks 20-<40 40-<50 50-<55 55-<60 60-<70 70-<90 90-<100
Number of 5 15 10 15 25 25 5
students
(a) Calculate the mean mark (63.125)
(b) Construct a cumulative frequency curve (Ogive) and use it to find the
(i) Median mark (61.5)
(ii) Range of the middle 40% of the mark (15)
Thank You
Dr. Bbosa Science