
2 - Data Presentation

The document provides an overview of data presentation methods in biostatistics, including tabulation, frequency distribution tables, and graphical representations. It explains how to organize raw data into arrays and tables, and details various graphical methods such as pie charts, bar charts, and histograms. The document emphasizes the importance of presenting data clearly to facilitate analysis and interpretation.

Uploaded by

rahafalgrni1
Copyright © All Rights Reserved

Biostatistics - 2

Data presentation

Dr. Khalid Aboalbasher


Lecture Outline
• Introduction: What is data presentation
• Why different methods of data presentation
• Tabulation: Arrays & Tables
• How to construct frequency distribution tables
• Graphical data presentation: the how, why and when
Introduction
• Data collected in their original form are called raw data. Raw data
are usually unorganized because each participant's data are recorded
on a separate data collection form
• It is difficult to make comparisons and draw conclusions from
raw data
• Once data has been collected, it has to be classified and
organized so that it becomes easily readable and interpretable,
that is, converted to information
• Before analyzing the data, it should be presented as tables,
charts or graphs
Tabulation
• Individual observations can be presented as either arrays or
tables
Arrays:
• An array is a matrix of rows and columns of numbers which have
been arranged in some order (ascending or descending)
• It is the simplest way of tabulating data and is most useful when
the data set is small
• Without any calculations, the array shows the minimum
observation, the maximum observation and the number of
observations
• Arrays can be used to present both qualitative and quantitative
data
Example of an array:

• Minimum = 2
• Maximum = 68
• Number of observations = 25
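A minimal Python sketch of the same idea, using a small hypothetical data set (not the 25 observations of the example above): sorting the values gives the array, and the minimum, maximum and number of observations can then be read off without further calculation.

```python
# Hypothetical raw observations (illustrative only, not the lecture data).
raw = [12, 2, 45, 68, 30, 7, 19, 51]

# An array is the same data arranged in ascending (or descending) order.
array = sorted(raw)

minimum = array[0]    # first value of the ascending array
maximum = array[-1]   # last value of the ascending array
n = len(array)        # number of observations

print(array)
print("Minimum =", minimum, "Maximum =", maximum, "n =", n)
```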
Tables:
• The tables used to present raw data are called frequency
distribution tables
• Frequency is the number of times an observation or a class of
observations occurs in the data set
• Distribution means that the observations in the data set are
distributed among classes or categories
• The frequency distribution table is a two-column table: the first
column lists the classes or categories, the second column lists
the frequencies
• A third column can be added to show the percentage or the
relative frequency
• A frequency distribution table is either a categorical frequency
distribution table or a grouped frequency distribution table
• Frequency distribution tables are used to:
1- Organize data in a meaningful way
2- Enable readers to make comparisons among different classes in
the data set
3- Enable readers to derive initial conclusions
4- Enable researchers to draw charts and graphs for presentation of
data
Categorical Frequency Distribution Table
• It is used to present qualitative data
• The first column is called categories and the second column is
called frequencies
• For ordinal qualitative data, the categories are arranged in
ascending or descending order
• For nominal data, the categories do not need to be arranged in
any particular order
• The percentage or relative frequency can be represented in the
third column
Example: Show the frequency distribution table for the following
data (Nominal data)
A B B AB O O A O O B A B O AB
O AB B B A A O B B O O O A O

Answer:
Category   Frequency   Percentage %
A          6           21
B          8           29
O          11          39
AB         3           11
Total      28          100

Percentage = (Frequency / Total) × 100, where the total is the sample size (n)
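As a rough illustration, the blood group example can be tabulated with a few lines of Python. The function name `categorical_table` is hypothetical, and rounding the percentages to whole numbers is an assumption made to match the table above.

```python
from collections import Counter

# Blood group observations from the example above (n = 28).
data = ("A B B AB O O A O O B A B O AB "
        "O AB B B A A O B B O O O A O").split()

def categorical_table(observations):
    """Return (category, frequency, percentage) rows.

    Percentage = frequency / total * 100, rounded to a whole number.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return [(cat, freq, round(freq / total * 100))
            for cat, freq in counts.items()]

rows = categorical_table(data)
for cat, freq, pct in rows:
    print(f"{cat:<3}{freq:>4}{pct:>5}")
print("Total", len(data))
```

Because the data are nominal, the order in which the categories are printed (here, order of first appearance) carries no meaning.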


Example: Show the frequency distribution table for the following
data (Ordinal data)
good, excellent, excellent, very good, good, good, very good,
good, fail, pass, fail, excellent, good, pass,
very good, pass, good, very good, excellent, good, fail,
pass, very good, good, excellent, good, pass, very good

Answer:
Category    Frequency   Relative frequency
Excellent   5           0.18
Very good   6           0.21
Good        9           0.32
Pass        5           0.18
Fail        3           0.11
Total       28          1

Relative frequency = Frequency / Total
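The relative frequency column can be checked with a short Python sketch. It starts from the frequencies in the answer table and keeps the categories in descending order of grade; rounding to two decimals is an assumption chosen to match the table.

```python
# Frequencies from the ordinal example above, in descending order of grade.
frequencies = {"Excellent": 5, "Very good": 6, "Good": 9, "Pass": 5, "Fail": 3}

total = sum(frequencies.values())  # sample size n

# Relative frequency = frequency / total, rounded to two decimals.
relative = {grade: round(f / total, 2) for grade, f in frequencies.items()}

for grade, f in frequencies.items():
    print(f"{grade:<10}{f:>3}{relative[grade]:>6.2f}")
print(f"{'Total':<10}{total:>3}")
```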
Grouped Frequency Distribution Table
• It is used to present quantitative data, when the range of the
data is large
• The data is grouped into classes
• Each class has lower and upper class limits, class width and class
midpoint
• The class width is the difference between the upper and lower
class limits
• The class midpoint is the sum of the class limits divided by 2
Method of constructing grouped frequency distribution table

• Determine the highest (H) and lowest (L) values in the data set
• Calculate the range: R = H – L
• Select the number of classes, which is determined by the
researcher
• The number of classes should be between 5 and 15
• Calculate the class width by dividing the range by the number of
classes (round the result up if it is not a whole number)
• The lower limit of the first class is the lowest value in the data
set
• The upper limit of the first class is its lower limit plus the
class width
• The upper limit of the first class is also the lower limit of the
second class
• The upper limit of the second class is the lower limit of that class
plus the class width, and so on
• The frequency of each class is determined by counting the values
in the data set that fall within the class limits
• Each class is read: from the lower limit to less than the upper
limit
• The final class is read: from the lower limit to the upper limit
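The steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name `grouped_frequency` and the sample values are hypothetical, and rounding the class width up to a whole number is an assumption.

```python
import math

def grouped_frequency(values, num_classes):
    """Build (lower, upper, frequency) rows following the steps above."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest
    # Assumption: round the width up so the classes cover the whole range.
    width = math.ceil(data_range / num_classes)
    rows = []
    for i in range(num_classes):
        lower = lowest + i * width
        upper = lower + width
        last = i == num_classes - 1
        # Each class is read "lower to less than upper";
        # the final class also includes its upper limit.
        freq = sum(1 for v in values
                   if lower <= v < upper or (last and v == upper))
        rows.append((lower, upper, freq))
    return rows

# Hypothetical values chosen only to exercise the class boundaries.
sample = [85, 90, 103, 120, 121, 150, 157, 180, 193, 208]
for lower, upper, freq in grouped_frequency(sample, 7):
    print(f"{lower} - {upper}: {freq}")
```

With a lowest value of 85, a highest value of 208 and 7 classes, the width is 123/7 = 17.57, rounded up to 18, so the first class is 85 – 103 and the last is 193 – 211.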
Example: Present the following data in a frequency distribution
table using 7 classes
165 186 122 172 140
153 208 169 156 114
113 135 131 125 177
136 136 127 112 188
171 179 152 155 116
 90 187 136 159  97
141  85  91 170 111
147 165 163 159 150

Answer:
- Lowest value = 85
- Highest value = 208
- The range: 208 – 85 = 123
- Number of classes = 7
- Class width = 123/7 = 17.57, rounded up to 18
- Construct the table as follows

Classes (x)   Frequency (f)   Percentage %
85 – 103      4               10%
103 – 121     5               12.5%
121 – 139     8               20%
139 – 157     9               22.5%
157 – 175     9               22.5%
175 – 193     4               10%
193 – 211     1               2.5%
Total         40              100%
Graphical data presentation
• Most people find pictures (graphs) more helpful than numbers
(tables) because they present data in a more meaningful way
• The graphical presentation of data is used to describe the
concentration and dispersion of data
• The methods of graphical presentation vary according to the
type of data to be presented
• The main methods are: pie chart, bar chart, line chart, scatter
plot, histogram, polygon, curve and box plot
Pie chart:
• It is a circular graphical presentation of data divided into slices
that show the relative size of each class or category
• It can be used for both quantitative and qualitative data, but it
is mainly used with qualitative data
• Each slice in the pie chart is proportional to the relative
frequency of the class or category
• The size of the slice is determined by the size of its angle
• The angle is obtained by multiplying the relative frequency of
the class by 360
Advantages:
• It is a simple graph that is easy to understand
• Data are represented as fractions of a whole
• It provides a simple method for comparing data

Disadvantages:
• It is less effective when there are many classes to be represented
• It can represent only one data set and cannot be used with
multiple data sets
Example:
O O A B A O A A A O
B O B O O A O O A A
A A AB A B A A O O A
O O A A A O A O O AB

Category   Frequency   Percentage
O          16          40
A          18          45
B          4           10
AB         2           5
Total      40          100
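As a quick check of the angle rule (relative frequency multiplied by 360), the slice angles for this table can be computed with a short Python sketch; the variable names are illustrative only.

```python
# Frequencies from the blood group table above (n = 40).
frequencies = {"O": 16, "A": 18, "B": 4, "AB": 2}
total = sum(frequencies.values())

# Angle of each slice = relative frequency * 360 degrees.
angles = {cat: f * 360 / total for cat, f in frequencies.items()}

for cat, angle in angles.items():
    print(f"{cat:<3}{angle:>6.1f} degrees")
```

The angles always add up to 360 degrees, which is a convenient way to verify the calculation before drawing the chart.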
Bar chart:
• The bar chart is one of the most common methods of presenting
data
• Its main purpose is to display quantities in the form of bars
• It consists of a set of bars whose heights are proportional to the
frequencies that they represent
• The figure can be drawn horizontally or vertically
• It can be simple or multiple bar chart
Advantages of bar chart:
1- The quantities can be read in terms of heights of the bars
2- Comparison can be made between values of a variable
3- It can be used for quantitative and qualitative data

Disadvantages of bar chart:


1- The class intervals must be equal in the distribution
2- It cannot be used for continuous data
Example of simple bar chart:
Category   Frequency   Percentage
O          16          40
A          18          45
B          4           10
AB         2           5
Total      40          100
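To illustrate that bar length is proportional to frequency, the following Python sketch prints a simple horizontal bar chart as text. The function name `text_bar_chart` is hypothetical; a plotting library would normally be used for a real figure.

```python
# Frequencies from the table above.
frequencies = {"O": 16, "A": 18, "B": 4, "AB": 2}

def text_bar_chart(freqs, symbol="#"):
    """Return one line per category; the bar length equals the frequency."""
    return [f"{cat:<3}| {symbol * f} {f}" for cat, f in freqs.items()]

for line in text_bar_chart(frequencies):
    print(line)
```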
Multiple bar chart:
• The multiple bar chart is an extension of the simple bar chart,
used when quantities of several variables are to be displayed
• The bars representing the quantities for the different variables
are placed next to one another for each attribute
Advantages:
1- Comparison may be made among components of the same
variables
2- Comparison is also possible for the same component across all
variables
Disadvantages:
1- The figure becomes very complex when there are many variables
Example of multiple bar chart:
                 Cancer
Treatment        Breast   Prostate   Colon
Surgery          37       25         33
Chemotherapy     11       8          27
Radiotherapy     5        10         4
Total            53       43         64
Line chart
• It is a chart used to show trends or changes in data over time
• It is plotted using several points connected by straight lines
• It consists of an x-axis and a y-axis: time is represented on the
x-axis and the other factor is represented on the y-axis
• The values are plotted on the graph and joined with straight
lines
Example of line chart:
Number of COVID-19 cases in the Arab countries
Histogram:
• The histogram is one of the most commonly used graphs for
presenting quantitative data
• It is a set of vertical bars whose areas are proportional to the
frequencies of the classes that they represent
• The histogram is used to present quantitative continuous data
that have been organized in a frequency distribution table
• The class limits are presented on the x-axis while the
frequencies are on the y-axis
• Each class is represented by a distance on the scale that is
proportional to its class interval
• When the class widths are equal, the frequency of each class is
represented by the height of the bar
• The histogram differs from the bar chart in that there are no gaps
between the bars in the histogram
Classes       600 – 620   620 – 640   640 – 660   660 – 680   680 – 700   700 – 720   Sum
Frequencies   10          15          20          25          20          10          100
Frequency Polygon:
• A frequency polygon is a graph of a frequency distribution
• There are two ways of drawing a frequency polygon
1- From a histogram:
• Draw a histogram of the given data, then join the midpoints
of the tops of adjacent bars with straight lines
• Close the polygon at both ends of the distribution by extending
the lines to the base line (x-axis); this is done by adding a
hypothetical class with zero frequency at each end
2- Direct construction:
• The classes are represented by their midpoints on the x-axis,
while the frequencies are represented on the y-axis
• A hypothetical class with zero frequency is added at each end
• The points that represent the midpoint and the frequency of each
class are then plotted
• These points are then joined using straight lines
• When the points are joined using a smooth line instead of
straight lines, the figure is called the curve
Classes     Midpoint   Frequencies
580 – 600   590        0
600 – 620   610        10
620 – 640   630        15
640 – 660   650        20
660 – 680   670        25
680 – 700   690        20
700 – 720   710        10
720 – 740   730        0
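The direct construction can be sketched in Python as below. The function name `polygon_points` is hypothetical, and the sketch assumes equal class widths, as in this example; it returns the (midpoint, frequency) points that would be joined with straight lines.

```python
# Class limits and frequencies from the histogram example (class width 20).
classes = [(600, 620), (620, 640), (640, 660),
           (660, 680), (680, 700), (700, 720)]
frequencies = [10, 15, 20, 25, 20, 10]

def polygon_points(classes, frequencies):
    """Return (midpoint, frequency) points, closed at both ends.

    A hypothetical class with zero frequency is added before the first
    class and after the last one. Assumes equal class widths.
    """
    width = classes[0][1] - classes[0][0]
    first = (classes[0][0] - width, classes[0][0])
    last = (classes[-1][1], classes[-1][1] + width)
    all_classes = [first] + list(classes) + [last]
    all_freqs = [0] + list(frequencies) + [0]
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(all_classes, all_freqs)]

points = polygon_points(classes, frequencies)
for midpoint, freq in points:
    print(midpoint, freq)
```

The resulting points reproduce the midpoint and frequency columns of the table above, including the two added classes 580 – 600 and 720 – 740.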
The curve:
