Single & Bivariate Statistics Solutions
Solutions to Pre-test
1
a –J
b –G
o –E
p –P
q –Q
r –N
b
i 12 + 7 = 19 scores were at least 60.
ii 2 + 3 + 6 + 12 = 23 scores were less than 80.
c (2 + 4 + 5 + 9) / 4 = 20 / 4 = 5
d (3 + 5 + 8 + 10 + 14) / 5 = 40 / 5 = 8
2
a Eight customers rented 3 DVDs.
b Forty customers were surveyed.
c 5 × 0 + 1 × 8 + 2 × 14 + 3 × 8 + 4 × 3 + 5 × 2 = 82
82 DVDs were rented during the survey.
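The totals in question 2 can be checked with a short sketch. The frequency table below is read off the products in part c (for example, 5 × 0 is taken to mean that five customers rented no DVDs), and the variable names are illustrative only.

```python
# Number of DVDs rented -> number of customers (assumed from the working in part c).
rentals = {0: 5, 1: 8, 2: 14, 3: 8, 4: 3, 5: 2}

# Total customers surveyed is the sum of the frequencies.
customers = sum(rentals.values())

# Total DVDs rented is the sum of (value x frequency).
dvds = sum(n * freq for n, freq in rentals.items())

print(customers, dvds)
```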
iv 58 – 38 = 20
b 2, 2, 2, 4, 6, 6, 7, 9, 10, 12
i (2 + 2 + 2 + 4 + 6 + 6 + 7 + 9 + 10 + 12) / 10 = 60 / 10 = 6
ii 2
iii 2, 2, 2, 4, 6, 6, 7, 9, 10, 12
(6 + 6) / 2 = 6
iv 12 – 2 = 10
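As a quick check of the four answers for the data set 2, 2, 2, 4, 6, 6, 7, 9, 10, 12, the sketch below uses Python's standard `statistics` module. The names `scores` and `summary` are illustrative.

```python
from statistics import mean, median, mode

scores = [2, 2, 2, 4, 6, 6, 7, 9, 10, 12]

summary = {
    "mean": mean(scores),                # sum of the scores divided by how many there are
    "mode": mode(scores),                # most frequent score
    "median": median(scores),            # average of the two middle scores (even count)
    "range": max(scores) - min(scores),  # highest score minus lowest score
}
print(summary)
```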
6
a Fifteen calculators are represented in
the plot.
d 145 – 98 = 47
d B
Survey: A tool used to collect statistical data
e H
Data: The factual information collected from a survey or other source
f D
Variable: An element or feature that can vary
B Categorical
E.g. Holden, Ford, Mercedes…
C Numerical
E.g. 12 mins, 2 hrs, 37 mins
Students will provide a number answer
D Categorical
E.g. Jack Russell, Labrador, Poodle…
b Numerical – discrete
Counting numbers (How many?)
c Categorical – nominal
d Numerical – continuous
Measuring
e Categorical – ordinal
The higher the category, the better your
ability
7 D
An election is a form of census: data is collected from the
entire population, not from a sample.
8 C
A representative sample of the population
should be chosen. The sample needs to
accurately portray the population of
households with TVs.
9 A
Students who arrive at school earliest are
often those who enjoy school and are therefore
perhaps more likely to have a higher
opinion of Mathematics.
2
a
Car colour   Tally   Frequency
red          III     3
white        IIII    5
green        II      2
silver       II      2
Total        12      12
b
i
b
Class interval   Frequency   Percentage frequency
80–84            8           16%
85–89            23          46%
90–94            13          26%
95–100           6           12%
Total            50          100%
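The percentage frequency column can be reproduced with a short sketch: each frequency is divided by the total and multiplied by 100. The dictionary name `frequencies` is illustrative.

```python
# Class interval -> frequency, copied from the table above.
frequencies = {"80–84": 8, "85–89": 23, "90–94": 13, "95–100": 6}

total = sum(frequencies.values())

# Percentage frequency = frequency / total x 100.
percentages = {interval: 100 * f / total for interval, f in frequencies.items()}

for interval, pct in percentages.items():
    print(interval, f"{pct:g}%")
```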
ii
3 Use the histogram to answer each of the
questions.
c (i) 1 + 3 + 4 = 8
(ii) 8 / 25 × 100 = 8 × 4 = 32%
b
b
i The frequency of people who travel by
train is 6.
ii The most popular form of transport is
the car.
iii 40% of people travel by car.
iv 12.5 + 5 = 17.5% of people walk or
cycle to work.
v The percentage of people who travel by
public transport is 15 + 20 + 7.5 = 42.5%.
c 70–79 has the highest frequency
d 5 + 3 = 8 students who scored 80 or 8
more.
8 a A skewed data set – more of the data is
100 40% on the left-hand side at 10 then trails off
25 towards 70.
b A symmetrical data set – the highest bar is in the centre and then the bars either side are approximately the same height.
6
a
Class interval   Tally       Frequency   Percentage frequency
0–4              IIII        5           25%
5–9              IIII IIII   9           45%
10–14            IIII        5           25%
15–19            I           1           5%
Total                        20          100%
b
9
a
Mass     Frequency   Percentage frequency
10–      3           6%
15–      6           12%
20–      16          32%
25–      21          42%
30–35    4           8%
Total    50          100%
1|5 = 15
ii Skewed
3
a 3|2 = 32, 3|5 = 35, 4|1 = 41
∴ 32, 35, 41, 43, 47, 54, 54, 56, 60, 62, 71, 71
b
i 12, 16, 21, 23, 25, 27, 31, 32, 35, 35, 36, 36, 38, 40, 40, 42, 44, 48, 51, 53, 55
2|0 = 2.0
ii Approximately symmetrical
b
i Set 1: 164, 166, 168, 171, 172, 175, 176, 180, 181, 185, 187, 187, 188, 192, 195, 199, 201, 208
Set 2: 160, 163, 163, 165, 167, 170, 171, 171, 174, 178, 182, 182, 186, 187, 190, 194
Set 1     Stem   Set 2
   864    16     03357
  6521    17     01148
877510    18     2267
   952    19     04
    81    20
16|0 = 160
10
a
Inner city   Stem   Outer suburb
      9643   0      349
      9420   1      2889
        41   2      134
             3      4
             4      1
0|3 = 3 km
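The grouping behind the Set 1 / Set 2 back-to-back plot can be sketched as follows. `stem_and_leaf` is a hypothetical helper, and it assumes whole-number data with the last digit as the leaf. In the printed plot the Set 1 leaves are written right to left, so they appear reversed there.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (all but the last digit), leaves in ascending order."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

set_1 = [164, 166, 168, 171, 172, 175, 176, 180, 181, 185,
         187, 187, 188, 192, 195, 199, 201, 208]
set_2 = [160, 163, 163, 165, 167, 170, 171, 171, 174, 178,
         182, 182, 186, 187, 190, 194]

for stem, leaves in stem_and_leaf(set_1).items():
    print(stem, "".join(str(leaf) for leaf in leaves))
```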
11
a Stem Leaf
1 24
2 369b
a 14
4 7c8
a = 3 {1, 2, 3, 4}
b=9
c = 7 or c = 8
b Stem Leaf
20 a14
21 229
22 0b57
23 14
a = 0 or a = 1
b is between 0 and 5 inclusive.
12
ii 5° is in the 0* stem
(5 is between 5–9)
b 1 2 2 4 4 7 9 8 21 23 27 32 38 39 39 44 46
9
c 6 10 11 11 13 13 14 309
9
d 56 62 64 73 75 77 77 78 79 34.3
a
e 2 4 4 5 6 | 8 8 10 12 22
(7)
f 1 2 2 3 | 7 12 12 18
(5)
g 27 30 31 | 36 38 40
(33.5)
i range = 32 – 4 = 28
Stem Leaf
0 44
1 0259
2 178
3 2
b 4 4 6 6 6 8 9 9 11
Median = 6 hours
c range = 11 – 4 = 7 hours
d mode = 6
Mean = (4 + 4 + 10 + 12 + 15 + 19 + 21 + 27 + 28 + 32) / 10 = 172 / 10 = 17.2
iv median = 17
7
a range = 50 – 8 = $42
b 8 12 12 15 | 20 24 25 50
(17.5)
Median value = $17.50
c (12 + 15 + 12 + 24 + 20 + 8 + 50 + 25) / 8 = 166 / 8 = $20.75
Stem Leaf
0 44
1 025|9
2 178
3 2
b
i range = 125 – 101 = 24
Stem Leaf
10 124
11 26
12 5
[Dot plot with a horizontal axis from 0 to 7]
ii Mean = 3.6 (decreased slightly)
(0 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5 + 5 + 7) / 11 = 40 / 11 = 3.6
c
i The score graph fluctuated wildly, with a range of 0–103.
ii The average graph is fairly constant, with small increases and decreases and a range of 20 to 40.
d The moving average graph follows the trend of the score (increases with good scores and decreases with poor scores) but the fluctuations are much less significant.
The moving average graph gives a
better overview of the batsman’s
ability.
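A minimal sketch of why a moving average smooths out the score graph. The innings scores and the window of 3 are hypothetical (the batsman's actual data is not reproduced here), and `moving_average` is an illustrative helper, not the method the question necessarily used.

```python
def moving_average(scores, window):
    """Return the mean of each run of `window` consecutive scores."""
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical innings scores, chosen only to show wild fluctuation.
scores = [0, 103, 20, 45, 12, 60]
averages = moving_average(scores, 3)

print("score range:  ", max(scores) - min(scores))
print("average range:", max(averages) - min(averages))
```

With these made-up scores the averages vary far less than the individual scores, which is the behaviour described in part d.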
a 0 1 1 1 1 | 2 2 2 3 3
(1.5)
b
i 0 1 1 1 1 | 2 2 2 3 3
Q1 (1.5)
ii 0 1 1 1 1 | 2 2 2 3 3
Q1 (1.5) Q3
3
a Q3 – Q1 = 8 – 3 = 5
b Q1 – 1.5 × IQR = 3 – 1.5 × 5 = –4.5
Q3 + 1.5 × IQR = 8 + 1.5 × 5 = 15.5
c 18 is considered to be an outlier as it is above the upper fence value of 15.5.
4
a 3 4 6 | 8 8 10
Q1 7 Q3
i Q1 = 4, Q3 = 8
ii IQR = Q3 – Q1
= 8 – 4
= 4
i Q1 = 1.8, Q3 = 2.7
ii IQR = Q3 – Q1
= 2.7 – 1.8
= 0.9
5
a
i 1 2 4 8 10 11 14
Q1 M Q3
ii IQR = Q3 – Q1
= 11 – 2
= 9
b
i 1 2 2 3 5 7 8 9 10 12 14
Q1 M Q3
ii IQR = Q3 – Q1
= 10 – 2
= 8
c
i 0.8 0.9 | 1.1 1.1 1.2 1.3 1.5 | 1.7 1.9
1 M 1.6
ii Rearranged in ascending order:
0, 8, 9, 10, 10, 11, 12, 13, 14, 14, 15, 15, 15, 15, 17
Median = 13
iii Q1 = 10 and Q3 = 15
c 2, 8, 10, 10, 11, 12, 12, 13, 13, 15, 24
i Max = 24, min = 2
ii Q2 = 12
iii Q1 = 10 and Q3 = 13
iv IQR = 13 – 10 = 3
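The two five-number summaries above can be checked with the sketch below. It assumes the quartiles are the medians of the lower and upper halves, leaving out the middle value when the count is odd. That convention reproduces the answers given here, but other quartile conventions exist and can give different values. `five_number_summary` is an illustrative name.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]
    # For an odd count, skip the middle value so it is in neither half.
    upper = values[half + 1:] if len(values) % 2 else values[half:]
    return min(values), median(lower), median(values), median(upper), max(values)

first = [0, 8, 9, 10, 10, 11, 12, 13, 14, 14, 15, 15, 15, 15, 17]
second = [2, 8, 10, 10, 11, 12, 12, 13, 13, 15, 24]

print(five_number_summary(first))
print(five_number_summary(second))
```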
9
1 4 5 5 6 | 7 7 7 8 8 | 10 10 10 13 15 | 16 17 19 30 31
(6.5) (9) (15.5)
IQR = Q3 – Q1
= 15.5 – 6.5
= 9
Q1 – 1.5 × IQR
= 6.5 – 1.5 × 9
= -7
b Q1 – 1.5 × IQR
= 17 – 1.5 × 10
= 2
Q3 + 1.5 × IQR
= 27 + 1.5 × 10
= 42
The value of 1 is an outlier as it is not within the interval (2–42).
[Box plot labelled: min, Q1, Median, Q3, max]
2
a Median = 15
b Minimum = 5
c Maximum = 25
d Range = 25 – 5 = 20
e Lower quartile = 10
f Upper quartile = 20
g IQR = 20 – 10 = 10
Min = 1
Q1 = 8
Median = 13.5
Q3 = 17
Max = 19
ii
b
i 0 1 1 | 2 3 3 | 3 4 4 | 4 4 5
(1.5) (3) (4)
Min = 0
Q1 = 1.5
Median = 3
Q3 = 4
Max = 5
3
a ii
b c
i 117 118 | 118 119 120 120
118 M
ii iii
7
b 1.1, 1.4, 1.6, 1.7, 1.8, 1.8, 1.8, 1.9, 1.9, a
2.0, 2.2 i 1.6 1.9 2.0 | 2.0 2.1 2.2 | 2.2
i, ii Q 2 1.8 2.0 2.2
iv Q3 = 2.6
iii
v Max = 3.9
Outliers are 3.8 and 3.9 kg as they do not fit in the interval (0.6–3.5) kg.
c
d The data points for A have a very narrow spread and are concentrated around a value of 6.5, while the data points for B are more widely and evenly spread.
8
a 8 14 15 15 16 16 16 17 19 19 24
Q1 M Q3
i Min = 8
ii Q1 = 15
iii Median = 16
iv Q3 = 19
v Max = 24
vi IQR
= Q3 – Q1
= 19 – 15
= 4
b Q1 – 1.5 × IQR
= 15 – 1.5 × 4
= 9
Q3 + 1.5 × IQR
= 19 + 1.5 × 4
= 25
8 is an outlier because it is less than the lower fence of 9 in the interval (9–25) days.
10
a Both boxplots share a median of 14 and an upper quartile of 17.
b The range of data set A is 6 and the range of data set B is 12.
Data set B has a wider range of values.
c i IQR = 17 – 13 = 4
ii IQR = 17 – 12 = 5
d Data set B is more spread out overall, although the portions of each data set between the median and the upper quartile are spread similarly.
11
a Set 1: 1, 2, 3, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 12
Q2 = 6
Q1 = 3 and Q3 = 7
IQR = 7 – 3 = 4
Q1 – 1.5 × IQR = 3 – 1.5 × 4 = -3
Q3 + 1.5 × IQR = 7 + 1.5 × 4 = 13
There are no outliers.
Set 2: 5, 7, 8, 9, 9, 11, 11, 12, 13, 13, 14, 14, 15, 15, 16
Q2 = 12
Q1 = 8 and Q3 = 14
IQR = 14 – 8 = 6
Q1 – 1.5 × IQR = 8 – 1.5 × 6 = -1
Q3 + 1.5 × IQR = 14 + 1.5 × 6 = 23
There are no outliers.
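The fence test used for Set 1 in question 11 can be sketched as below, taking Q1 = 3 and Q3 = 7 from the working rather than recomputing them. `outlier_fences` is an illustrative helper name.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

set_1 = [1, 2, 3, 3, 4, 4, 6, 6, 6, 7, 7, 7, 8, 8, 12]

lower, upper = outlier_fences(3, 7)

# A value is an outlier if it lies outside the interval [lower, upper].
outliers = [x for x in set_1 if x < lower or x > upper]

print(lower, upper, outliers)
```

The largest value, 12, sits just inside the upper fence of 13, so the list of outliers is empty.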
a days
e Range = 7 – 0 = 7 days
c (4 + 5) / 2 = 9 / 2 = 4.5 days is the median
c Range = 42 – 3 = 39.
a Order the dataset: 8, 10, | 11, 13, | 15, see that Q1 = 49, Q2 = 82, Q3 =
7 c
a Order the dataset: 0, 49, 75, 82, 97,
102, 110.
Q1 M Q3
c
i The temperature stayed the same from 12 noon to 1 p.m.
ii The temperature dropped from 3 p.m. to 4 p.m.
d The points from 8 a.m. to 3 p.m. form a nearly straight line, then the last point shows a noticeable decrease.
This means that the temperature increased in a linear way from 8 a.m. until 3 p.m., and at 3 p.m. the temperature began to drop.
3
a
d In 2014 the pass rate was 74%; by 2018 this had jumped to 85%.
85 – 74 = 11, an increase of 11 percentage points.
5
a Linear trends have points on or near a straight line. The house prices in an Adelaide suburb are linear, with a deviation in 2017.
b
i Every year house prices increase by approximately $50 000, so in 2021 a house would be worth approximately $650 000.
ii House prices would have increased by $150 000. Therefore a house should be worth approximately $750 000.
6
a
ii 1.7 km
2
a increases
4
b decreases
a
3
c a
11
a i There is a weak negative correlation.
ii There is no correlation.
5
a
3
a Estimating the y value when x = 6 is
an example of interpolation.
iii x ≈ 9 when y = 12
iv x ≈ 7 when y =15
c When x = 5, y ≈ 0.5
d When x = 5, y ≈ 50
7
a The number of people entering the
park increases with temperature
(positive correlation ↗)
10
a
4
a The mean will increase by 3.
E.g. Consider the data set: 5, 6, 7
Mean = (5 + 6 + 7) / 3 = 18 / 3 = 6
Add 3 to each: 8, 9, 10
Mean = (8 + 9 + 10) / 3 = 27 / 3 = 9
b The median will increase by 3.
E.g. 5, 6, 7 (M = 6)
Add 3 to each: 8, 9, 10 (M = 9)
c The range remains unchanged.
E.g. 5, 6, 7
Range = 7 – 5 = 2
Add 3 to each: 8, 9, 10
Range = 10 – 8 = 2
5 Q1 = 2.6 and Q3 = 3.7
IQR
= Q3 – Q1
= 3.7 – 2.6
= 1.1
6 Range of 8
Difference between the max and min is 8
Mode of 3
Most common value is 3
Experiment 2:
IQR
= Q3 – Q1
= 10 – 4
= 6
Q1 – 1.5 × IQR
= 4 – 1.5 × 6
= -5
Q3 + 1.5 × IQR
= 10 + 1.5 × 6
= 19
x must be less than or equal to 19 so as not to be considered an outlier.
8
a This is simply a horizontal shift in the dataset, which means the mean, median and mode will all increase by 10. You can imagine the dot plot, histogram or stem and leaf plot starting from 10 units further to the right than before. This has the effect of increasing the entire dataset, including the mean, median and mode, by 10.
b The mean will increase by a factor of 10.
E.g. Consider the data set: 5, 6, 7
Mean = (5 + 6 + 7) / 3 = 18 / 3 = 6
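The effect of adding a constant to every value, as in question 4, can be checked with the example set 5, 6, 7. The last part assumes that question 8 b refers to multiplying every value by 10; the helper `spread` and the variable names are illustrative.

```python
from statistics import mean, median

def spread(values):
    """Range of a data set: highest value minus lowest value."""
    return max(values) - min(values)

data = [5, 6, 7]
shifted = [x + 3 for x in data]   # add 3 to every value
scaled = [x * 10 for x in data]   # assumed: multiply every value by 10

print(mean(data), mean(shifted))      # mean moves up by 3
print(median(data), median(shifted))  # median moves up by 3
print(spread(data), spread(shifted))  # range is unchanged
print(mean(data), mean(scaled))       # mean is 10 times larger
```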
9 At the end of 2019, the total rainfall for the last ten years (2010 to 2019 inclusive) was 546 × 10 = 5460 mm.
At the end of 2020, the total rainfall for the last ten years (2011 to 2020 inclusive) was 562 × 10 = 5620 mm.
This represents a 5620 – 5460 = 160 mm increase.
The rainfall for the years 2011 to 2019 inclusive is included in both totals. The only change between the
two ten-year averages is that the rainfall in 2010 has been replaced by the rainfall in 2020: 5460 mm
includes the 2010 rainfall but excludes the 2020 rainfall, while 5620 mm includes the 2020 rainfall but
excludes the 2010 rainfall.
In other words, the rainfall in 2020 is 160 mm larger than that of 2010. Hence, the rainfall in
2010 was 654 – 160 = 494 mm.
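The same argument as a short calculation. The figures 546, 562 and 654 come from the working above; the variable names are illustrative.

```python
YEARS = 10

avg_2010_to_2019 = 546   # ten-year average at the end of 2019 (mm)
avg_2011_to_2020 = 562   # ten-year average at the end of 2020 (mm)
rain_2020 = 654          # rainfall in 2020 (mm)

# The two ten-year totals differ only by swapping 2010 for 2020.
difference = (avg_2011_to_2020 - avg_2010_to_2019) * YEARS
rain_2010 = rain_2020 - difference

print(difference, rain_2010)
```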
b
c It is positively skewed.
i (10 + 55 + 67 + 24 + 11 + 16) / 6 = 183 / 6 = 30.5
iii Median = 20
10, 11, 16, | 24, 55, 67
(20)
2
a 15, 19, 20, 24, 28, 29, 32, 34, 37, 38, 38, 42, 49, 50
Stem Leaf
1 59
2 0489
3 24788
4 29
5 0
1|5 = 15
b The data is symmetrical with a centre at 30–39.
c
i (1.7 + 1.2 + 1.4 + 1.6 + 2.4 + 1.3) / 6 = 9.6 / 6 = 1.6
ii 1.2, 1.3, 1.4, 1.6, 1.7, 2.4
Range = 2.4 – 1.2 = 1.2
iii Median = 1.5
1.2, 1.3, 1.4, | 1.6, 1.7, 2.4
(1.5)
3
a (40 + 41 + 37 + 32 + 48 + 43 + 32 + 76 + 29 + 33 + 26 + 38 + 87) / 13 = 562 / 13 = 43.2
b 26, 29, 32, 32, 33, 37, 38, 40, 41, 43, 48, 76, 87
M = 38
4
a
i (2 + 7 + 4 + 8 + 3 + 6 + 5) / 7 = 35 / 7 = 5
d TRUE
c
Q3 = 80
i 0.7, 1.9, 2.1, 2.2, 2.3, | 2.4, 2.6, 2.6, 2.8, 3.1
(2.35)
Q1 = 2.1
Q3 = 2.6
ii IQR
= Q3 – Q1
= 2.6 – 2.1
= 0.5
iii Q1 – 1.5 × IQR
= 2.1 – 1.5 × 0.5
= 1.35
Q3 + 1.5 × IQR
= 2.6 + 1.5 × 0.5
= 3.35
0.7 is an outlier as it is below the interval (1.35 – 3.35)
iv
9
a Mean = (4 + 5 + 7 + 9 + 10) / 5 = 35 / 5 = 7
Standard deviation = √[((4 – 7)² + (5 – 7)² + (7 – 7)² + (9 – 7)² + (10 – 7)²) / 5]
= √(26 / 5)
≈ 2.3 (to 1 decimal place)
b Mean = (1 + 1 + 3 + 5 + 5 + 9) / 6 = 24 / 6 = 4
Standard deviation = √[((1 – 4)² + (1 – 4)² + (3 – 4)² + (5 – 4)² + (5 – 4)² + (9 – 4)²) / 6]
= √(46 / 6)
≈ 2.8 (to 1 decimal place)
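The two standard deviations in question 9 can be checked with `statistics.pstdev`, which divides by the number of values (the population standard deviation) and so matches the 26 / 5 and 46 / 6 in the working. The variable names are illustrative.

```python
from statistics import mean, pstdev

data_a = [4, 5, 7, 9, 10]
data_b = [1, 1, 3, 5, 5, 9]

# pstdev: square root of the mean squared deviation from the mean.
sd_a = round(pstdev(data_a), 1)
sd_b = round(pstdev(data_b), 1)

print(mean(data_a), sd_a)
print(mean(data_b), sd_b)
```

Note that `statistics.stdev` (the sample version, dividing by n – 1) would give larger values, so it is not the function to use for this check.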
8
a FALSE
range A = 100 – 20 = 80
range B = 90 – 10 = 80
b TRUE
median A = 50
median B = 50
c TRUE
IQR (A)
= Q3 – Q1
= 80 – 40
= 40
10
a linear
b From 10 a.m. to 11 a.m. The increase is 4 °C (from 22 °C to 26 °C)
11
a Negative (↘)
b None
c Positive (↗)
c Strong
d (3, 5)
13
2 C
8 C
Stem Leaf
2 49
3 1178
4 246
5 04
9 D
3 B Weak negative (↘)
10 B
Stem Leaf
2 49
3 1178
4 246
5 04
4 A
2, 3, 4, 5, 6, 10
Range = 10 – 2 = 8
Mean = (2 + 4 + 3 + 5 + 10 + 6) / 6 = 30 / 6 = 5
5 C
12 15 18 | 22 26 26
(20)
6 A
Q1 = 5 and Q3 = 13
IQR
= Q3 – Q1
= 13 – 5
=8
a 15, 21, 24, 32 , 36, 39, 50, | 51, i 7 customers yields $50 profit
Q1 (50.5)
IQR
= Q3 – Q1
= 73 – 32
= 41
b Q1 – 1.5 × IQR
= 32 – 1.5 × 41
= -29.5
Q3 + 1.5 × IQR
= 73 + 1.5 × 41
= 134.5
No outliers
b
i 10 customers yield $80 profit
ii 20 customers yield $150 profit
iii 30 customers yield $240 profit