Tabular & Graphical Description of Data
Unit-3
Tabular and Graphical Procedures
Data
• Qualitative Data
– Tabular Methods: Frequency Distribution, Rel. Freq. Dist., % Freq. Dist., Crosstabulation
– Graphical Methods: Bar Graph, Pie Chart
• Quantitative Data
– Tabular Methods: Frequency Distribution, Rel. Freq. Dist., Cum. Freq. Dist., Cum. Rel. Freq. Distribution, Stem-and-Leaf Display, Crosstabulation
– Graphical Methods: Dot Plot, Histogram, Ogive, Scatter Diagram
Introduction
• A table is a display of data in numerical form
in the rows and columns of a matrix.
• A graph is a representation of data by spatial
relationship in a diagram.
• Graphs and tables help us summarize data
and understand relationships between
variables.
Introduction (Continued)
• Many graphs used in research have two axes
plotted at right angles to one another.
• The horizontal axis is called the x-axis (abscissa)
and the vertical axis is called the y-axis (ordinate).
• Typically the x-axis represents values of the
independent variable and the y-axis represents the
dependent variable.
• A graph may have two independent variables
or no independent variables.
Tables and graphs of frequency data of
one variable
• Frequency distribution – A table or graph that shows
the number of scores that fall into specific
bins, or divisions, of a variable
– Histogram: Frequencies are represented by
contiguous bars
– Polygon: Points marking the class frequencies are
connected by straight lines
– Normal & Skewed curves
Tabular and Graphical Presentations
Summarizing Qualitative Data
Summarizing Quantitative Data
Types of Data
Data
• Numerical (Quantitative)
– Discrete
– Continuous
• Categorical (Qualitative)
Construction of a Frequency Distribution
1. Question to be addressed
2. Collect data (raw data)
3. Organize data (frequency distribution)
4. Present data (graph)
5. Draw conclusion
Example: Marada Inn
Guests staying at Marada Inn were
asked to rate the quality of their
accommodations as being excellent,
above average, average, below average, or
poor. The ratings provided by a sample of 20
guests are:
Below Average Average Above Average
Above Average Above Average Above Average
Above Average Below Average Below Average
Average Poor Poor
Above Average Excellent Above Average
Average Above Average Average
Above Average Average
Frequency Distribution
Rating Frequency
Poor 2
Below Average 3
Average 5
Above Average 9
Excellent 1
Total 20
Relative Frequency Distribution
The relative frequency of a class is the fraction or
proportion of the total number of data items
belonging to the class.
A relative frequency distribution is a tabular
summary of a set of data showing the relative
frequency for each class.
Percent Frequency Distribution
The percent frequency of a class is the relative
frequency multiplied by 100.
A percent frequency distribution is a tabular
summary of a set of data showing the percent
frequency for each class.
Relative Frequency and
Percent Frequency Distributions
Rating           Relative Frequency   Percent Frequency
Poor             .10                  10
Below Average    .15                  15
Average          .25                  25
Above Average    .45                  45
Excellent        .05                  5
Total            1.00                 100
Callouts: 1/20 = .05 (relative frequency for Excellent); .10(100) = 10 (percent frequency for Poor).
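The two definitions can be checked with a few lines of Python. This is a minimal sketch that starts from the Marada Inn frequency counts; the variable names (`frequency`, `relative`, `percent`) are illustrative choices, not part of the slides.

```python
# Frequencies copied from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())  # sample size

# Relative frequency = class frequency / total number of data items.
relative = {rating: count / total for rating, count in frequency.items()}

# Percent frequency = relative frequency multiplied by 100.
percent = {rating: 100 * count / total for rating, count in frequency.items()}

for rating in frequency:
    print(f"{rating:<14} {relative[rating]:.2f} {percent[rating]:>4.0f}")
```

The relative frequencies should sum to 1 and the percent frequencies to 100, which is a quick sanity check on any such table.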
Bar Graph
A bar graph is a graphical device for presenting
qualitative data.
On one axis (usually the horizontal axis), we specify
the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the
vertical axis).
Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
The bars are separated to emphasize the fact that each
class is a separate category.
Bar Graph
[Figure: bar graph of Marada Inn Quality Ratings. Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency (1 to 10).]
Pie Chart
The pie chart is a commonly used graphical device
for presenting relative frequency distributions for
qualitative data.
First draw a circle; then use the relative
frequencies to subdivide the circle
into sectors that correspond to the
relative frequency for each class.
Since there are 360 degrees in a circle,
a class with a relative frequency of .25 would
consume .25(360) = 90 degrees of the circle.
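The sector arithmetic can be sketched in Python as follows. The relative frequencies are the Marada Inn values; the name `angles` is an illustrative choice.

```python
# Relative frequencies from the Marada Inn relative frequency distribution.
relative_frequency = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class receives its share of the 360 degrees in a circle.
angles = {rating: rf * 360 for rating, rf in relative_frequency.items()}

for rating, angle in angles.items():
    print(f"{rating:<14} {angle:6.1f} degrees")
```

Because the relative frequencies sum to 1, the sector angles sum to 360 degrees.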
Pie Chart
[Figure: pie chart of Marada Inn Quality Ratings. Sectors: Excellent 5%, Poor 10%, Below Average 15%, Average 25%, Above Average 45%.]
Example: Marada Inn
Insights Gained from the Preceding Pie Chart
• One-half of the customers surveyed gave Marada
a quality rating of “above average” or “excellent”
(looking at the left side of the pie). This might
please the manager.
• For each customer who gave an “excellent” rating,
there were two customers who gave a “poor”
rating (looking at the top of the pie). This should
displease the manager.
Summarizing Quantitative Data
Frequency Distribution
Relative Frequency and Percent Frequency
Distributions
Dot Plot
Histogram
Cumulative Distributions
Ogive
Numerical (Quantitative) Data Presentation
Numerical Data
• Ordered Array: Stem-&-Leaf Display
• Frequency Distributions: Histogram, Polygon, Ogive
Frequency Distribution Table
Steps
1- Determine range
2- Select number of classes
• Usually between 5 and 20 inclusive
3- Compute class intervals (width)
4- Determine class boundaries (limits)
5- Compute class midpoints
6- Count observations & assign to classes
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better
understanding of the cost of parts used in the engine
tune-ups performed in the shop. She examines 50
customer invoices for tune-ups. The costs of parts,
rounded to the nearest dollar, are listed on the next slide.
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Frequency Distribution
Guidelines for Selecting Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements
usually require a larger number of classes.
• Smaller data sets usually require fewer classes.
Frequency Distribution (Continued)
Guidelines for Selecting Width of Classes
•Use classes of equal width.
•Approximate Class Width =
(Largest Data Value - Smallest Data Value) / Number of Classes
Example: Frequency Distribution
For Hudson Auto Repair, if we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5, rounded up to 10
Parts Cost ($)   Frequency
50-59            2
60-69            13
70-79            16
80-89            7
90-99            7
100-109          5
Total            50
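The steps for building this table can be sketched in Python. This is a minimal illustration using the 50 parts costs listed earlier; starting the first class at 50 follows the slide's table, and the names `parts_cost`, `first_lower_limit`, and `frequency` are illustrative choices.

```python
import math

# Sample of parts cost for 50 tune-ups (from the Hudson Auto Repair slide).
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6
# Approximate width = (largest value - smallest value) / number of classes.
approx_width = (max(parts_cost) - min(parts_cost)) / num_classes
width = math.ceil(approx_width)  # round up to a convenient width

# Assumption: the first class starts at 50, as in the slide's table.
first_lower_limit = 50

frequency = {}
for k in range(num_classes):
    lower = first_lower_limit + k * width
    upper = lower + width - 1  # stated upper limit for whole-dollar data
    frequency[f"{lower}-{upper}"] = sum(lower <= cost <= upper for cost in parts_cost)

for label, count in frequency.items():
    print(f"{label:<8} {count}")
```

Counting with inclusive stated limits works here because the costs are rounded to whole dollars; data with decimals would need true class limits instead.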
Relative Frequency and
Percent Frequency Distributions
Parts Cost ($)   Relative Frequency   Percent Frequency
50-59            .04                  4
60-69            .26                  26
70-79            .32                  32
80-89            .14                  14
90-99            .14                  14
100-109          .10                  10
Total            1.00                 100
Callouts: 2/50 = .04 (relative frequency for 50-59); .04(100) = 4 (percent frequency for 50-59).
Relative Frequency and
Percent Frequency Distributions
Insights Gained from the Percent Frequency
Distribution
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32%, or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
Dot Plot
One of the simplest graphical summaries of
data is a dot plot.
A horizontal axis shows the range of data
values.
Then each data value is represented by a dot
placed above the axis.
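A rough text-only version of a dot plot can be sketched in Python by stacking one dot per occurrence of each value. This is only an illustration using the Hudson Auto Repair parts costs; the layout (one character column per dollar value) is an assumption made for simplicity.

```python
from collections import Counter

# Sample of parts cost for 50 tune-ups (Hudson Auto Repair).
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

counts = Counter(parts_cost)          # how many dots each value needs
lowest, highest = min(parts_cost), max(parts_cost)
tallest = max(counts.values())        # height of the tallest stack

# Build the plot from the top row down; a dot appears in a column
# when that value occurs at least `level` times.
rows = []
for level in range(tallest, 0, -1):
    row = "".join(
        "." if counts[value] >= level else " "
        for value in range(lowest, highest + 1)
    )
    rows.append(row)

print("\n".join(rows))
print(f"Cost ($): {lowest} to {highest}")
```

Each data value contributes exactly one dot, so the total number of dots equals the sample size.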
Dot Plot
[Figure: dot plot of Tune-up Parts Cost. Horizontal axis: Cost ($), 50 to 110.]
Histogram
Another common graphical presentation of
quantitative data is a histogram.
The variable of interest is placed on the horizontal
axis.
A rectangle is drawn above each class interval with
its height corresponding to the interval’s frequency,
relative frequency, or percent frequency.
Unlike a bar graph, a histogram has no natural
separation between rectangles of adjacent classes.
Histogram
[Figure: histogram of Tune-up Parts Cost. Horizontal axis: Parts Cost ($), classes 50-59 through 100-109. Vertical axis: Frequency (2 to 18).]
Histogram (Continued)
Symmetric
• Left tail is the mirror image of the right tail
• Example: heights and weights of people
[Figure: symmetric histogram. Vertical axis: Relative Frequency (0 to .35).]
Histogram (Continued)
Moderately Skewed Left
• A longer tail to the left
• Example: exam scores
[Figure: histogram moderately skewed to the left. Vertical axis: Relative Frequency (0 to .35).]
Histogram (Continued)
Moderately Skewed Right
• A longer tail to the right
• Example: housing values
[Figure: histogram moderately skewed to the right. Vertical axis: Relative Frequency (0 to .35).]
Histogram (Continued)
Highly Skewed Right
• A very long tail to the right
• Example: executive salaries
[Figure: histogram highly skewed to the right. Vertical axis: Relative Frequency (0 to .35).]
Tables and graphs of frequency data of one
variable
Cumulative frequency distribution – A
frequency distribution that shows the number
of scores that fall at or below a certain score.
• Polygons are generally used to show
cumulative frequencies.
• Normal & Skewed curves
Cumulative Distributions
Cumulative frequency distribution – shows the
number of items with values less than or equal to
the upper limit of each class.
Cumulative relative frequency distribution – shows
the proportion of items with values less than or
equal to the upper limit of each class.
Cumulative percent frequency distribution – shows
the percentage of items with values less than or
equal to the upper limit of each class.
Cumulative Distributions
Example: Hudson Auto Repair
Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 59       2                      .04                             4
≤ 69       15                     .30                             30
≤ 79       31                     .62                             62
≤ 89       38                     .76                             76
≤ 99       45                     .90                             90
≤ 109      50                     1.00                            100
Callouts: 2 + 13 = 15 (cumulative frequency for ≤ 69); 15/50 = .30; .30(100) = 30.
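The three cumulative columns are running totals of the class frequencies. A minimal Python sketch, assuming the Hudson Auto Repair class frequencies 2, 13, 16, 7, 7, 5 (variable names are illustrative):

```python
from itertools import accumulate

# Stated upper class limits and class frequencies (Hudson Auto Repair).
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

total = sum(frequency)

# Running total of frequencies up to and including each class.
cumulative = list(accumulate(frequency))

# Divide by the total for proportions; multiply by 100 for percentages.
cumulative_relative = [c / total for c in cumulative]
cumulative_percent = [100 * c / total for c in cumulative]

for limit, c, rel, pct in zip(upper_limits, cumulative, cumulative_relative, cumulative_percent):
    print(f"<= {limit:<4} {c:>3} {rel:.2f} {pct:>5.0f}")
```

The last cumulative frequency always equals the sample size, and the last cumulative percent frequency is always 100.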
Exercise
The heights in inches of 30 students are
as follows:
66, 68, 65, 70, 67, 64, 68, 64, 66,
64, 70, 72, 71, 69, 69, 64, 67,
63, 70, 71, 63, 68, 67, 65, 69, 65,
67, 66, 69, 67
Prepare a cumulative frequency
distribution table showing cumulative
relative frequencies and cumulative percent
frequencies.
Ogive
An ogive is a graph of a cumulative
distribution.
The data values are shown on the horizontal
axis.
Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
The frequency (one of the above) of each class
is plotted as a point.
The plotted points are connected by straight
lines.
Ogive
Example: Hudson Auto Repair
• Because the class limits for the parts-cost
data are 50-59, 60-69, and so on, there
appear to be one-unit gaps from 59 to 60,
69 to 70, and so on.
• These gaps are eliminated by plotting points
halfway between the class limits.
• Thus, 59.5 is used for the 50-59 class, 69.5
is used for the 60-69 class, and so on.
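The plotting points for the ogive can be sketched in Python. The cumulative percent frequencies are those of the Hudson Auto Repair example; the starting point (49.5, 0) is an assumption added here so the curve begins at zero below the first class, and is not stated on the slide.

```python
# Stated upper class limits and cumulative percent frequencies
# (Hudson Auto Repair).
stated_upper_limits = [59, 69, 79, 89, 99, 109]
cumulative_percent = [4, 30, 62, 76, 90, 100]

# Plot each point halfway between a class's upper limit and the next
# class's lower limit, i.e. 0.5 above the stated upper limit.
ogive_points = [
    (upper + 0.5, pct)
    for upper, pct in zip(stated_upper_limits, cumulative_percent)
]

# Assumption: start the ogive at zero, halfway below the first lower limit (50).
ogive_points.insert(0, (49.5, 0))

for x, y in ogive_points:
    print(f"({x}, {y})")
```

Connecting these points in order with straight lines gives the ogive.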
Ogive with
Cumulative Percent Frequencies
[Figure: ogive of Tune-up Parts Cost. Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. Labeled point: (89.5, 76).]
Frequency Distribution Table
Another Example
Raw Data: 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Class Frequency
15 but < 25 3
25 but < 35 5
35 but < 45 2
Frequency Distribution Table
Example (Continued)
Raw Data: 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Class          Midpoint   Frequency
15 but < 25    20         3
25 but < 35    30         5
35 but < 45    40         2
Callouts: the class width is 10; the class boundaries are 15, 25, 35, and 45; Midpoint = (Upper + Lower Boundaries) / 2.
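The midpoint formula and the class counts can be reproduced with a short Python sketch. It uses the raw data and boundaries from this example; the tuple layout of `table` is an illustrative choice.

```python
# Raw data and class boundaries from the example.
raw_data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
boundaries = [15, 25, 35, 45]

table = []
# Pair each lower boundary with the next boundary as its upper boundary.
for lower, upper in zip(boundaries, boundaries[1:]):
    midpoint = (upper + lower) / 2
    # "lower but < upper": the lower boundary is included, the upper is not.
    count = sum(lower <= x < upper for x in raw_data)
    table.append((lower, upper, midpoint, count))

for lower, upper, midpoint, count in table:
    print(f"{lower} but < {upper}   {midpoint:>4.0f}   {count}")
```

Using a half-open interval means a value that falls exactly on a boundary, such as 25, is counted only in the class that starts there.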
Stated and True (or Real) Class Limits
True classes are those classes for which the
upper true limit of a class is the same as the lower
true limit of the next class.
For comparison, the stated class limits and true
class limits are given in the following table (next
slide):
Stated and True (or Real) Class Limits
Stated       True
$600-$799    $599.50 up to but not including $799.50
$800-$999    $799.50 up to but not including $999.50
In the first column of the above table the data were
rounded to the nearest dollar. For example, $799.50
was rounded up to $800 and tallied in the second
class. Any amount over $799 but under $799.50 was
rounded down to $799 and included in the first class.
Thus, the $600-$799 class actually includes all data
from $599.50 inclusive up to but not including
$799.50.
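The conversion from stated to true limits amounts to moving each limit half a rounding unit outward. A minimal Python sketch follows; the function name `true_limits` and its `unit` parameter are hypothetical, introduced only for illustration.

```python
def true_limits(stated_lower, stated_upper, unit=1):
    """Return (true_lower, true_upper) for data rounded to the nearest `unit`.

    The true class runs from true_lower (inclusive) up to but not
    including true_upper.
    """
    half = unit / 2
    return stated_lower - half, stated_upper + half


first = true_limits(600, 799)    # stated class $600-$799
second = true_limits(800, 999)   # stated class $800-$999

print(first, second)
```

The upper true limit of the first class coincides with the lower true limit of the second, which is exactly the property that defines true classes.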
Relative Frequency & % Distribution Tables
Example (Continued)
The relative frequency of a class is obtained by dividing the
class frequency by the total frequency, which in the following
problem = 10.
Relative Frequency Distribution
Class          Prop.
15 but < 25    .3
25 but < 35    .5
35 but < 45    .2
Percentage Distribution
Class          %
15 but < 25    30.0
25 but < 35    50.0
35 but < 45    20.0
Cumulative Percentage Distribution Table
Example (Continued)
Raw Data: 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Cumulative percentage = percentage less than the lower class boundary
Class          Cumulative Percentage
15 but < 25    0.0
25 but < 35    30.0
35 but < 45    80.0 (30% + 50%)
45 but < 55    100.0 (80% + 20%)
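This "less than the lower boundary" convention can be sketched in Python: each class's cumulative percentage is the sum of the percentages of all classes below it, so the first entry is zero. The class percentages are those of this example; the variable names are illustrative.

```python
from itertools import accumulate

# Lower class boundaries, including the boundary that closes the last class.
lower_boundaries = [15, 25, 35, 45]
# Percentage of observations in 15-25, 25-35, and 35-45.
percent = [30.0, 50.0, 20.0]

# Nothing lies below the first boundary, so start at 0.0 and then
# add each class percentage in turn.
cumulative_below = [0.0] + list(accumulate(percent))

table = list(zip(lower_boundaries, cumulative_below))
for boundary, pct in table:
    print(f"less than {boundary}: {pct:.1f}%")
```

This differs from the "less than or equal to the upper limit" convention used for the Hudson Auto Repair table only in where each running total is reported.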
Tables and graphs that show the
relationship between two variables
• Scattergram: a graph showing the responses of a
number of individuals on two variables; a visual
display of correlation data.
Example: Panthers Football Team
Scatter Diagram
The Panthers football team is interested
in investigating the relationship, if any,
between interceptions made and points scored.
x = Number of Interceptions   y = Number of Points Scored
1                             14
3                             24
2                             18
1                             17
3                             27
Example: Panthers Football Team
Scatter Diagram
[Figure: scatter diagram. Horizontal axis: x = Number of Interceptions (0 to 3). Vertical axis: y = Number of Points Scored (0 to 30).]
Example: Panthers Football Team
The preceding scatter diagram indicates a
positive relationship between the number of
interceptions and the number of points scored.
Higher points scored are associated with a
higher number of interceptions.
The relationship is not perfect; not all plotted
points in the scatter diagram lie on a
straight line.
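The visual impression can be backed by a number. The sketch below computes the Pearson correlation coefficient for the five Panthers observations; the coefficient itself is an addition for illustration and does not appear on the slides.

```python
import math

# Panthers data: interceptions made (x) and points scored (y).
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 27]

n = len(interceptions)
mean_x = sum(interceptions) / n
mean_y = sum(points) / n

# Sums of cross-products and squared deviations from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(interceptions, points))
sxx = sum((x - mean_x) ** 2 for x in interceptions)
syy = sum((y - mean_y) ** 2 for y in points)

# Pearson correlation: positive means y tends to rise with x.
r = sxy / math.sqrt(sxx * syy)

print(f"r = {r:.3f}")
```

A value between 0 and 1 matches both observations above: the relationship is positive, but the points do not all fall on one straight line.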
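Scatter Diagram
A Positive Relationship
[Figure: scatter diagram of y against x showing a positive relationship]
Scatter Diagram
A Negative Relationship
[Figure: scatter diagram of y against x showing a negative relationship]
Scatter Diagram
No Apparent Relationship
[Figure: scatter diagram of y against x showing no apparent relationship]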
Tables and graphs that show the
relationship between two variables
• Tables with one independent and one dependent
variable:
– Median split – a division of the subjects in a study into
two groups of equal size on the basis of one of the
variables.
– Line graphs - A graphical representation using lines to
show relationships between quantitative variables.
– Bar graphs - A graphical representation of categorical data
in which the heights of separated bars, or columns, show
the relationship between variables
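A median split can be sketched in Python as follows. The subject labels and scores are hypothetical values made up for this illustration; they are not data from the slides.

```python
from statistics import median

# Hypothetical scores for eight subjects (illustrative only).
scores = {"S1": 12, "S2": 7, "S3": 15, "S4": 9,
          "S5": 20, "S6": 5, "S7": 11, "S8": 17}

# Rank subjects by score, then cut the ranking in half.
ranked = sorted(scores, key=scores.get)
half = len(ranked) // 2
low_group = ranked[:half]    # subjects below the median
high_group = ranked[half:]   # subjects above the median

print("median:", median(scores.values()))
print("low group:", low_group)
print("high group:", high_group)
```

With an even number of subjects and no ties at the median, the two groups are exactly equal in size; ties or an odd count would require a rule for where the middle subjects go.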
Bar Graph
• A bar graph can be used to display and compare data.
• The scale should include all the data values and
be easily divided into equal intervals.
[Figure: horizontal bar graph with bars for Spanish, Mandarin, Hindi, and English. Scale: 0 to 1000.]
Preparing data for analysis
• Data reduction:
– The process of transcribing data from individual data sheets to a
summary form (the process that takes raw data to statistical analysis)
• Proceeding with analysis
1. Put the data into matrix form in a summary data sheet
2. Do preliminary statistics and plots
3. Check for invalid data and make corrections
4. Check for missing data and replace with missing data code
5. Check for wild data and remove
6. Do descriptive statistics
7. Describe data numerically
8. Describe data graphically
9. Perform inferential statistics