Review 1 - Descriptive Stats
The document provides an overview of descriptive statistics, including how to describe distributions by shape, center, and spread, as well as methods for identifying outliers. It discusses appropriate measures of center and spread based on data characteristics and includes multiple-choice questions for practice. Additionally, it covers various graphical representations of data such as histograms, boxplots, and stem-and-leaf plots.
Descriptive Statistics
Describing or comparing distributions:
Always give the shape, center, and spread in the context of the question.
Shape - Tell what the graph looks like.
* Symmetric
* Skewed and direction of skewness
* Uniform
* Peaks (modes) and the location of the peaks
* Gaps in data and the location of the gaps
* Unusual values in the data and the location of those values
Center - Tell (numerically) where the center of the data is.
* Mean - average; μ for population; x̄ for sample
* Median - middle value when data is listed from low to high
* Mode - peak in the distribution or most frequently occurring value
Spread or variability - Tell (numerically) how spread out the data is.
* Standard deviation - “average” distance between the data points and the mean
* Inter-Quartile Range (IQR) = range of the middle 50% of the data = Q3 - Q1
* Range = maximum - minimum
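As a rough illustration of the measures of center and spread listed above, the sketch below computes each one with Python's standard `statistics` module. The data set is made up for the example, and the quartiles follow the module's default convention, which can differ slightly from the one a calculator or textbook uses.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)        # arithmetic average
median = statistics.median(data)    # middle value of the sorted data
mode = statistics.mode(data)        # most frequently occurring value
sd = statistics.stdev(data)         # sample standard deviation (divides by n - 1)

# Quartiles under the module's default ("exclusive") convention.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                       # spread of the middle 50% of the data
data_range = max(data) - min(data)  # maximum minus minimum

print(mean, median, mode, round(sd, 2), iqr, data_range)
```

For population data, `statistics.pstdev` divides by n instead of n - 1.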
Identifying an outlier
* An unusual value in the data set which is too far from its quartiles
* Numerically, this is any value that exceeds Q3 + 1.5(IQR) or is less than
Q1 - 1.5(IQR)
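The 1.5(IQR) rule can be sketched in a few lines of Python. The helper names `quartiles` and `find_outliers` and the data are hypothetical. The quartiles are taken as the medians of the lower and upper halves of the sorted data, which is an assumption about the convention intended here (it is the one many textbooks and calculators use).

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the sorted data."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # excludes the median when n is odd
    upper = ordered[-half:]
    return statistics.median(lower), statistics.median(upper)

def find_outliers(values):
    """Return the values beyond the 1.5 * IQR fences, plus the fences themselves."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return outliers, low_fence, high_fence

# Hypothetical data with one suspiciously large value.
data = [10, 12, 13, 14, 15, 16, 18, 40]
outliers, low_fence, high_fence = find_outliers(data)
print(outliers, low_fence, high_fence)
```

Here Q1 = 12.5 and Q3 = 17, so the IQR is 4.5 and the fences sit at 5.75 and 23.75; only 40 falls outside them.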
Choosing appropriate measures of center and spread
* If the data is fairly symmetric, use the mean and standard deviation. These are
not resistant to the presence of skewness or outliers.
* If the data is skewed or outliers are present, use the median and IQR instead.
* Use the mode or range for additional information or if the other measures cannot
be calculated from the information given.
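To see what "resistant" means in practice, the sketch below summarizes two made-up data sets that differ only in their largest value. The `summarize` helper is hypothetical, and the quartiles use the default convention of Python's `statistics.quantiles`, so treat the exact numbers as illustrative.

```python
import statistics

def summarize(values):
    """Return (mean, sd, median, iqr) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (statistics.mean(values), statistics.stdev(values),
            statistics.median(values), q3 - q1)

# Hypothetical data: the second list replaces the largest value with an extreme one.
symmetric = [10, 12, 13, 14, 15, 16, 18]
with_outlier = [10, 12, 13, 14, 15, 16, 180]

mean_a, sd_a, median_a, iqr_a = summarize(symmetric)
mean_b, sd_b, median_b, iqr_b = summarize(with_outlier)

print("mean:  ", mean_a, "->", round(mean_b, 2))
print("sd:    ", round(sd_a, 2), "->", round(sd_b, 2))
print("median:", median_a, "->", median_b)
print("IQR:   ", iqr_a, "->", iqr_b)
```

The mean and standard deviation jump when the extreme value is introduced, while the median and IQR stay where they were.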
Drawing and/or interpreting graphs (by hand or on your calculator)
* Histogram (with either frequency or relative frequency on the vertical axis)
* Stemplot (also called stem-and-leaf plot)
* Dot plot (for small data sets)
* Boxplot and modified boxplot (which shows outliers)
* Cumulative frequency plot or ogive
* Normal quantile plot
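Of the graphs listed, a stemplot is the easiest to build by hand or in plain text. The following is a minimal sketch, assuming non-negative two-digit integers; the function name `stemplot` and the data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a simple stemplot (tens digit = stem, ones digit = leaf)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    lines = []
    # Include stems with no leaves so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines

# Hypothetical data, chosen only for illustration.
data = [23, 20, 35, 41, 27, 30, 62, 48, 20, 32]
for line in stemplot(data):
    print(line)
```

Keeping the empty stem for the 50s makes the gap in the distribution visible, and every data value can still be read off the plot.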
Copyright © 2008 Laying the Foundation, Inc., Dallas, Texas. All rights reserved.
These materials may be used for face-to-face teaching with students only.
Multiple Choice Questions on Descriptive Statistics
1. If the largest value of a data set is doubled, which of the following is false?
(A) The mean increases.
(B) The standard deviation increases.
(C) The interquartile range increases.
(D) The range increases.
(E) The median remains unchanged.
2. The five-number summary for scores on a statistics exam is 35, 68, 77, 83, 97. In all,
196 students took the test. About how many had scores between 77 and 83?
(A) 6
(B) 39
(C) 49
(D) 98
(E) It cannot be determined from the information given.
3. The following list is a set of data ordered from smallest to largest. All values are
integers. 2 12 y y y 15 18 18 19
I. The median and the first quartile cannot be equal.
II. The mode is 18.
III. 2 is an outlier.
(A) I only
(B) II only
(C) III only
(D) I and III only
(E) I, II, and III
4. A substitute teacher was asked to keep track of how long it took her to get to her
assigned school each morning. Here is a stem plot of the data. Would you expect the
mean to be higher or lower than the median?
2 | 0002344578
3 | 0257
4 | 12789
5 | 028
6 | 05
Key: 4|1 = 41 kilometers
(A) Lower, because the data are skewed to the left.
(B) Lower, because the data are skewed to the right.
(C) Higher, because the data are skewed to the left.
(D) Higher, because the data are skewed to the right.
(E) Neither, because the mean would equal the median.
5. A professor scaled (curved) the scores on an exam by multiplying the student’s raw
score by 1.2, then adding 15 points. If the mean and standard deviation of the scores
before the curve were 51 and 5, respectively, then the mean and standard deviation of the
scaled scores are respectively:
(A) 76.2 and 21
(B) 76.2 and 6
(C) 76.2 and
(D) 61 and 6
(E) cannot be determined without knowing if the scores are normally distributed
6. In the northern U.S., schools are often closed during severe snowstorms. These
missed days must be made up at the end of the school year. The following histogram
shows the number of days missed per year for a particular school district using data from
the past 75 years. Which of the following should be used to describe the center of the
distribution?
[Histogram: frequency versus days missed; horizontal axis from 0 to 14]
(A) Mean, because it uses information from all 75 years.
(B) Median, because the distribution is skewed.
(C) IQR, because it excludes outliers and includes the middle 50% of the data.
(D) Quartile 1, because the distribution is skewed to the right.
(E) Standard deviation, because it is unaffected by outliers.
7. Which boxplot was made from the same data as this histogram?
[Histogram: frequency on the vertical axis]
(A) [boxplot]
(B) [boxplot]
(C) [boxplot]
(D) [boxplot]
(E) None of the above.
8. One advantage of using a stem-and-leaf plot rather than a histogram is that the stem-
and-leaf plot
(A) shows the shape of the distribution more easily than the histogram.
(B) changes easily from frequency to relative frequency.
(C) shows all of the data on the graph.
(D) presents the percentage distribution of the data.
(E) shows the mean on the graph.
9. This histogram shows the closing price of a stock on 50 days.
[Histogram: closing price; horizontal axis from 0 to 100 in steps of 10]
In which range does the first quartile lie?
(A) 0 to 10
(B) 10 to 20
(C) 20 to 30
(D) 30 to 40
(E) 80 to 90
10. The scores of male (M) and female (F) students on a statistics exam are displayed
in the following boxplots. The pluses indicate the location of the means.
[Figure: side-by-side boxplots of exam scores for males (M) and females (F)]
Which of the following is correct?
(A) The mean grade of the females is about 72.
(B) About 75% of the males score above 82.
(C) The median of the male students is about 66.
(D) The scores of the males have a higher variability than the scores of the females.
(E) About 25% of the females scored above 72.
Free Response Questions on Descriptive Statistics
1. As a project in their physical education classes, elementary school students were
asked to kick a soccer ball into a goal from a fixed distance away. Each student was
given 8 chances to kick the ball, and the number of goals was recorded for each student.
The number of goals for 200 first graders is given in the table.
Number of goals scored | Number of first graders
0 | 14
1 | 37
2 | 51
3 | 33
4 | 30
5 | 14
6 | 11
7 | 7
8 | 3
In order to compare whether older children are better at kicking goals, the exercise was
repeated with 200 fourth graders.
Number of goals scored | Number of fourth graders
0 | 5
1 | 11
2 | 18
3 | 24
4 | 27
5 | 34
6 | 39
7 | 28
8 | 14
(a) Graph these two distributions so that the number of goals scored by the first graders
and the number of goals scored by the fourth graders can be easily compared.
(b) Based on your graphs, how do the results from the fourth graders differ from those of
the first graders? Write a few sentences to answer this question.
2. Students at a weekend retreat were asked to record their total amount of sleep on
Friday and Saturday nights. The results are shown in the cumulative frequency plot
below.
[Cumulative frequency plot: cumulative frequency (%) from 0 to 100 versus total amount of sleep (hours) from 0 to 16]
(a) The graph goes through the point (11, 70). Interpret this point in the context of the
problem.
(b) Find the interquartile range for the total hours of sleep. Show your work. (Work on
the graph counts as work shown.)
(c) Check the appropriate space below and explain your reasoning.
In this distribution,
____ the mean amount of sleep will be less than the median amount of sleep.
____ the mean amount of sleep will be equal to the median amount of sleep.
____ the mean amount of sleep will be greater than the median amount of sleep.