INTRODUCTION TO PROBABILITY
AND STATISTICS: STAT 166
By
Jonathan Kwaku Afriyie
Department of Statistics and Actuarial Science
KNUST
[Link]@[Link]
May 22, 2023
Summarizing Data
Summarizing Data Graphically
Frequency Distribution Table
Definition
A frequency distribution table organises raw data into
mutually exclusive classes (categories) and shows the number
of observations in each class.
Types of Frequency Distribution Table
Constructing Frequency Distribution Tables
Rules for constructing a frequency distribution
To construct a frequency distribution, follow these rules:
- The table should have between 5 and 20 classes.
- The classes must be mutually exclusive. This implies that
  the class limits must be nonoverlapping so that no
  observation can be placed into two classes.
- The classes must be continuous. This implies that there
  must be no gaps in a frequency distribution.
- The classes must be exhaustive. That is, there must be
  enough classes to accommodate all the data.
- The classes must be equal in width (size).
Definitions
Class Limit
Class limits are the starting and ending points of a
particular class. The starting value of each class is called
the lower limit of the class and the ending value of each
class is called the upper limit.
Class Boundary
A class boundary is the midpoint between the upper class
limit of one class and the lower class limit of the next
class in sequence. For data recorded as whole numbers, it is
obtained by subtracting 0.5 from each lower limit and adding
0.5 to each upper limit. Class boundaries are also used to
separate classes.
Class Width/Size
Class width is the difference between the upper and the lower
class boundaries of a class interval. All classes should have
the same width.
Class Midpoint
Class midpoint is a point that
divides a class into two equal
parts. That is, the average of the
lower and upper class limits.
Class Frequency (f)
Class frequency is the number of
observations in each class.
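The definitions of class limits, boundaries, width and midpoint can be illustrated with a minimal Python sketch. The class limits below are hypothetical (they are not taken from the lecture), and the 0.5 adjustment assumes data recorded as whole numbers.

```python
# Hypothetical class limits for whole-number data (not from the lecture).
classes = [(10, 19), (20, 29), (30, 39)]

def class_summary(lower, upper, half_unit=0.5):
    """Return (lower boundary, upper boundary, width, midpoint) for one class."""
    lower_boundary = lower - half_unit
    upper_boundary = upper + half_unit
    width = upper_boundary - lower_boundary   # difference of the class boundaries
    midpoint = (lower + upper) / 2            # average of the class limits
    return lower_boundary, upper_boundary, width, midpoint

summaries = [class_summary(lo, hi) for lo, hi in classes]
for (lo, hi), (lb, ub, w, m) in zip(classes, summaries):
    print(f"{lo}-{hi}: boundaries {lb} to {ub}, width {w}, midpoint {m}")
```

Every class in this sketch has the same width, and the upper boundary of one class equals the lower boundary of the next, so there are no gaps.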
Relative Frequency (RF)
It describes the proportion of
values falling into that class. It is
obtained by dividing the frequency
of the class by the total frequency.
Cumulative Frequency (CF)
Cumulative frequency is obtained
by summing the frequency of a
class and the frequencies of all the
classes below it.
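As a small illustration of these two definitions, the sketch below computes relative and cumulative frequencies in Python. The class frequencies are hypothetical and are assumed to be listed from the lowest class upward.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed from the lowest class upward.
frequencies = [4, 10, 16, 12, 8]

total = sum(frequencies)

# Relative frequency: class frequency divided by the total frequency.
relative = [f / total for f in frequencies]

# Cumulative frequency: running total of the frequencies up to each class.
cumulative = list(accumulate(frequencies))

for f, rf, cf in zip(frequencies, relative, cumulative):
    print(f"f = {f:2d}  RF = {rf:.2f}  CF = {cf}")
```

The relative frequencies sum to 1 and the last cumulative frequency equals the total number of observations, which gives two quick checks on a hand-built table.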
Constructing Grouped Frequency Distribution
Steps For Constructing Grouped Frequency Distribution
1. Decide the number of classes using the formula
$$2^k \ge n$$
where k is the number of classes and n is the number of
observations.
2. Determine the class width using the formula
$$w \ge \frac{\text{Range}}{k} = \frac{H - L}{k}$$
where H is the highest value and L is the lowest value.
NB: If the result is a decimal value, round it up to get
the class width.
3. Set the individual class limits. That is, set the lower limit
of the first class by starting from the lowest value in the
data, and then add the width (w) to get the lower limit of
the next class. Keep adding until there are k classes.
Subtract 1 from the lower limit of the second class to get
the upper limit of the first class and so on.
4. Tally and record the number of items in each class.
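The four steps above can be sketched in Python as follows. The data set and the function names `number_of_classes` and `grouped_frequency` are hypothetical, and the sketch assumes whole-number data. The extra widening check is an assumption added here, not part of the lecture's rule: it only applies when (H − L)/k is already a whole number, in which case the highest value would otherwise fall outside the last class.

```python
import math

def number_of_classes(n):
    """Step 1: smallest k such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

def grouped_frequency(data):
    """Return a list of (lower limit, upper limit, frequency) tuples."""
    k = number_of_classes(len(data))
    low, high = min(data), max(data)

    # Step 2: class width, rounded up.
    width = math.ceil((high - low) / k)
    # Assumption beyond the lecture: widen by 1 if the top class misses the maximum.
    if low + k * width - 1 < high:
        width += 1

    table = []
    for i in range(k):
        # Step 3: lower limits start at the lowest value and increase by the width;
        # each upper limit is 1 less than the next lower limit.
        lower = low + i * width
        upper = lower + width - 1
        # Step 4: tally the observations falling in this class.
        count = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, count))
    return table

# Hypothetical whole-number data (n = 20).
data = [12, 15, 21, 22, 25, 27, 30, 31, 33, 34,
        36, 38, 40, 41, 43, 45, 47, 50, 52, 58]

table = grouped_frequency(data)
for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

Here n = 20 gives k = 5, and the range 58 − 12 = 46 gives a width of 46/5 = 9.2, rounded up to 10.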
Example 1
Solution
1. First, we decide the number of classes using the formula
$$2^k \ge n$$
where k is the number of classes and n = 50 is the number
of people interviewed in Kotsah Island. Since $2^5 = 32 < 50$
and $2^6 = 64 \ge 50$, we have k = 6.
2. Obtain the class width:
$$w = \frac{H - L}{k} = \frac{78 - 2}{6} = 12.6667 \approx 13$$
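The arithmetic in this solution can be checked with a short Python sketch that uses only the values stated above (n = 50, H = 78, L = 2). The class limits it lists are what step 3 of the procedure would give under the whole-number assumption; they are derived here and are not copied from the lecture's table.

```python
import math

n, highest, lowest = 50, 78, 2

# Smallest k with 2**k >= n.
k = 1
while 2 ** k < n:
    k += 1

# Class width, rounded up to the next whole number.
width = math.ceil((highest - lowest) / k)

# Class limits: start at the lowest value and step by the width.
limits = [(lowest + i * width, lowest + (i + 1) * width - 1) for i in range(k)]

print(k, width)
print(limits)
```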
Graphs For Quantitative Data
1. Histogram
A histogram is a graph that displays the data by using
contiguous vertical bars of various heights to represent the
frequencies of the classes. For example, Figure 1 shows the
histogram of the travel times in the table below:
Graphical Presentation of Data
Figure: A histogram showing the number of travel times
2. Cumulative Frequency Curve (Ogive)
The cumulative frequency curve is a graph that represents
the cumulative frequencies for the classes in a frequency
distribution. For example, Figure 3 shows the ogive of the
travel times in the table below:
Graphical Presentation of Data
Figure: A cumulative frequency curve showing the number of travel
times
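As a rough sketch of how the points of an ogive can be prepared, the Python snippet below pairs each upper class boundary with its cumulative frequency. Plotting against upper class boundaries and starting the curve at zero are common conventions assumed here, and the classes and frequencies are hypothetical (they are not the lecture's travel-time table).

```python
from itertools import accumulate

# Hypothetical whole-number classes and their frequencies.
class_limits = [(12, 21), (22, 31), (32, 41), (42, 51), (52, 61)]
frequencies = [3, 5, 6, 4, 2]

# Upper class boundaries: add 0.5 to each upper limit.
upper_boundaries = [upper + 0.5 for _, upper in class_limits]
cumulative = list(accumulate(frequencies))

# The curve starts at the lower boundary of the first class with frequency 0.
start = (class_limits[0][0] - 0.5, 0)
ogive_points = [start] + list(zip(upper_boundaries, cumulative))

for x, cf in ogive_points:
    print(f"boundary {x}: cumulative frequency {cf}")
```

Joining these points in order gives the curve; the last point's cumulative frequency equals the total number of observations.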
4. Scatter plot
A scatter plot shows how one variable changes with another
by plotting the paired values as points. It is often used when
we have a bivariate dataset and we wish to examine the
relationship between the two variables.
Graphs For Categorical Data
1. Bar Graph
A bar graph is a graph of vertical or horizontal bars whose
heights (or lengths) represent the frequencies of the respective
categories. For instance, the figure shows a bar graph for
different types of floor tiles produced by a construction firm
in a given day.
Figure: A bar graph showing the number of floor tiles produced in a
given day
2. Pie Chart
A pie chart is a circle divided into sectors. Each sector
represents a category of data. The area of each sector is
proportional to the frequency of the category.
Example
Problem: The data presented in Table 6 represent the
educational attainment of residents of the United States 25
years or older in 2006, based on data obtained from the U.S.
Census Bureau. The data are in thousands. Construct a pie
chart of the data.
Approach:
- The pie chart will have seven parts, or sectors,
  corresponding to the seven categories of data. The area of
  each sector is proportional to the frequency of each
  category.
- For example, 11,742/191,885 ≈ 0.0612 of all U.S. residents
  25 years or older have less than a 9th-grade education. The
  category "less than 9th grade" will make up 6.12% of the
  area of the pie chart.
- Since a circle has 360 degrees, the degree measure of the
  sector for the category "less than 9th grade" will be
  (0.0612)(360°) ≈ 22°. Use a protractor to measure each
  angle.
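The sector calculation in the approach can be sketched in Python. The function name `sector_angle` is hypothetical; the only figures used are the two quoted above (11,742 out of 191,885, in thousands), since the remaining category counts are not reproduced here.

```python
def sector_angle(frequency, total):
    """Return (proportion, angle in degrees) for one pie-chart sector."""
    proportion = frequency / total
    return proportion, proportion * 360

# "Less than 9th grade" category: 11,742 out of 191,885 (in thousands).
proportion, angle = sector_angle(11_742, 191_885)
print(f"{proportion:.4f} of the pie, about {angle:.0f} degrees")
```

Applying the same function to every category gives angles that sum to 360 degrees, up to rounding.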
Solution:
We follow the approach presented for the remaining categories
of data to obtain Table 7.
To construct a pie chart by hand, we use a protractor to
approximate the angles for each sector. See Figure 6.