
Data Organization and Presentation Techniques

The document discusses descriptive statistics, focusing on the organization, summarization, and presentation of data. It explains raw data, ordered arrays, frequency distributions, and various methods for presenting data graphically, including bar charts, pie charts, and histograms. Additionally, it outlines the steps for constructing frequency distribution tables and the importance of graphical representations in data analysis.

Uploaded by

chalamekonnen

UNIT 2.

DATA ORGANIZATION AND PRESENTATION
DESCRIPTIVE SUMMARY STATISTICS

• Descriptive statistics: Techniques used to organize and summarize a set of data in a more comprehensible and meaningful way.
– Organization of data
– Summarization of data
– Presentation of data

• Numbers that have not been summarized and organized are called raw data.

Raw data
Definition
• Data that have been collected or recorded but have not yet been arranged or processed are called raw data.

Example 1: Ages of 50 students (in years)

21 19 24 25 29 34 26 27 37 33
18 20 19 22 19 19 25 22 25 23
25 19 31 19 23 18 23 19 23 26
22 28 21 20 22 22 21 20 19 21
25 23 18 37 27 23 21 25 21 24

Ordered array
• Ordered array: a simple arrangement of individual observations in order of magnitude
- Example: Ages of 50 students (sorted values read down each column)

18 19 19 21 22 23 23 25 26 31
18 19 20 21 22 23 24 25 27 33
18 19 20 21 22 23 24 25 27 34
19 19 20 21 22 23 25 25 28 37
19 19 21 21 22 23 25 26 29 37

• Very difficult to construct with a large sample
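The ordered array above can be reproduced with a short sketch. It assumes the 50 ages of Example 1 are typed in as a Python list; the variable names are illustrative only.

```python
# Raw ages of the 50 students from Example 1, in the order recorded.
ages = [
    21, 19, 24, 25, 29, 34, 26, 27, 37, 33,
    18, 20, 19, 22, 19, 19, 25, 22, 25, 23,
    25, 19, 31, 19, 23, 18, 23, 19, 23, 26,
    22, 28, 21, 20, 22, 22, 21, 20, 19, 21,
    25, 23, 18, 37, 27, 23, 21, 25, 21, 24,
]

# An ordered array is the same observations sorted by magnitude.
ordered = sorted(ages)

# Print 5 rows so that reading down each column follows the sorted order,
# which mirrors the column-wise layout of the ordered array table.
rows = 5
for r in range(rows):
    print(" ".join(str(v) for v in ordered[r::rows]))
```

Sorting is trivial for software, so the difficulty with large samples is mainly in reading the result, not in producing it.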

Presentation of data

Data
– Qualitative data: tabular methods, graphical methods
– Quantitative data: tabular methods, graphical methods

Frequency Distribution

• Frequency distribution: a table that summarizes raw data into non-overlapping classes or categories along with their corresponding class frequencies.

• Class frequency: the number of observations that fall into the class.

• The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
Frequency Distribution for categorical variables
• Count the number of observations (frequency) in each category and present them as relative frequencies.

• Often presented in the form of tables, bar charts and pie charts.
• A relative frequency distribution shows the proportion of counts that fall into each class or category.

• For nominal and ordinal data, frequency distributions are often used as a summary.

• The percentage of times that each value occurs, or the relative frequency, is often listed.

• Tables make it easier to see how the data are distributed.
Example 1: Nominal data
Table 1: Type of hospitals owned by MOH in Ethiopia in 2006/07

Source: Health and health related indicator

Example 2: Ordinal data
Table 2: Level of satisfaction with nursing care by 475 psychiatric in-patients, 1991

Frequency Distribution for numerical variables

• A frequency distribution can also show the number of observations at different values or within certain ranges.

• There are two types of frequency distribution:
– Single value (ungrouped frequency)
– Interval type (classes) – grouped frequency

A) Ungrouped Frequency Distribution

• Ungrouped frequency distribution: lists each distinct data value with its respective frequency

• Can be used when the range of values in the data set is not large

• Classes are one unit in width

Example:
• Leisure time in hours per week for 40 college students:

23 24 18 14 20 36 24 26 23 21 16 15 19 20
22 14 13 10 19 27 29 22 38 28 34 32 23 19
21 31 16 28 19 18 12 27 15 21 25 16

Construct a frequency distribution table.

Leisure time (hours) Frequency
10 1
12 1
13 1
14 2
15 2
16 3
18 2
19 4
20 2
21 3
22 2
23 3
24 2
25 1
26 1
27 2
28 2
29 1
31 1
32 1
34 1
36 1
38 1
Total 40
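
The ungrouped table above can be checked with a minimal sketch that tallies each distinct value. It assumes the 40 leisure-time values are entered as a list; `Counter` from the Python standard library does the counting.

```python
from collections import Counter

# Leisure time in hours per week for the 40 college students.
leisure_hours = [
    23, 24, 18, 14, 20, 36, 24, 26, 23, 21, 16, 15, 19, 20,
    22, 14, 13, 10, 19, 27, 29, 22, 38, 28, 34, 32, 23, 19,
    21, 31, 16, 28, 19, 18, 12, 27, 15, 21, 25, 16,
]

# Each distinct value is its own class (one unit in width).
frequency = Counter(leisure_hours)

print("Leisure time (hours)  Frequency")
for value in sorted(frequency):
    print(f"{value:>20}  {frequency[value]:>9}")
print(f"{'Total':>20}  {sum(frequency.values()):>9}")
```

With 23 distinct values for 40 observations, the table is already long, which motivates grouping the data into wider classes.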
B) Grouped Frequency Distribution

• Can be used when the range of values in the data set is large.

• The data must be grouped into classes that are more than one unit in width.

Grouped Frequency Distribution

• Steps in Constructing Frequency Distribution Tables
Step 1: Determine the range of the data.
- R = Highest Value – Lowest Value

Step 2: Determine the number of classes (K) and the corresponding width (W). We may use:

K = 1 + 3.322 log10(n)
W = (L – S) / K

Where:
K = number of class intervals
n = number of observations
W = width of the class interval
L = the largest value
S = the smallest value
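
Step 2 can be sketched in code as follows. This assumes the rule meant here is K = 1 + 3.322 log10(n), as used in the leisure-time example in this unit, together with W = (L – S) / K; the function names are hypothetical, and the inputs (n = 40, L = 38, S = 10) come from that example.

```python
import math

def number_of_classes(n):
    """Suggested number of classes K = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

def class_width(largest, smallest, k):
    """Class width W = (L - S) / K."""
    return (largest - smallest) / k

k_exact = number_of_classes(40)   # unrounded value of K
k = round(k_exact)                # K is rounded to a whole number of classes
w = class_width(38, 10, k)        # unrounded width

print(f"K = {k_exact:.2f}, rounded to {k}")
print(f"W = {w:.2f}")
```

The rule only suggests a starting point; in practice the width is usually rounded to a convenient whole number so that every observation falls into exactly one class.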

Step 3: For each class, count the number of observations (class frequency)

Step 4: Determine the relative frequency for each class

Relative frequency = (Frequency of the class interval) / (Total number of observations)

Grouped Frequency Distribution

• The classes must be mutually exclusive
• The classes must be continuous
• The classes must be equal in width

Example:
• Leisure time (hours) per week for 40 college students:

23 24 18 14 20 36 24 26 23 21 16 15 19 20
22 14 13 10 19 27 29 22 38 28 34 32 23 19
21 31 16 28 19 18 12 27 15 21 25 16

Maximum value = 38, Minimum value = 10
K = 1 + 3.322 (log 40) = 6.32 ≈ 6
Width = (38 – 10) / 6 ≈ 4.67, rounded up to 5
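
Steps 3 and 4 for this example can be sketched as below. Two choices here are assumptions rather than something stated in the slides: the class width is taken as 5 (4.67 rounded up), and the first class starts at the minimum value, 10.

```python
# Leisure time in hours per week for the 40 college students.
leisure_hours = [
    23, 24, 18, 14, 20, 36, 24, 26, 23, 21, 16, 15, 19, 20,
    22, 14, 13, 10, 19, 27, 29, 22, 38, 28, 34, 32, 23, 19,
    21, 31, 16, 28, 19, 18, 12, 27, 15, 21, 25, 16,
]

width = 5                   # assumption: 4.67 rounded up to a whole number
start = min(leisure_hours)  # assumption: first class begins at the minimum
n = len(leisure_hours)

classes = []  # (lower limit, upper limit, frequency, relative frequency)
lower = start
while lower <= max(leisure_hours):
    upper = lower + width - 1
    # Step 3: count the observations falling inside the class limits.
    count = sum(lower <= x <= upper for x in leisure_hours)
    # Step 4: relative frequency = class frequency / total observations.
    classes.append((lower, upper, count, count / n))
    lower += width

for lower, upper, count, rel in classes:
    print(f"{lower:>2} - {upper:<2}  {count:>2}  {rel:.3f}")
```

Under these assumptions the data fall into six classes (10–14 through 35–39), which agrees with K ≈ 6.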

• Cumulative frequency: the running total obtained when the frequencies of two or more successive classes are added

• Cumulative relative frequency: the proportion of the total number of observations that have a value less than or equal to the upper limit of the interval

• Mid-point: the value of the interval which lies midway between the lower and the upper limits of a class
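
As a small illustration of the cumulative measures, the sketch below uses the class frequencies 6, 13, 4, 4, 3 from the home-run example that appears later in this unit; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies from the home-run example (five classes, 30 teams).
frequencies = [6, 13, 4, 4, 3]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the grand total.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for f, cf, crf in zip(frequencies, cumulative, cumulative_relative):
    print(f"{f:>3} {cf:>3} {crf:.3f}")
```

The last cumulative frequency always equals the total number of observations, so the last cumulative relative frequency is 1.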

• True limits: those limits that make an interval of a continuous variable continuous in both directions

• Used to smooth the transition between class intervals

• Subtract 0.5 from the lower limit and add 0.5 to the upper limit (for data recorded in whole units)
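
The conversion from class limits to true limits can be sketched as follows, using the home-run class limits from the example later in this unit. The helper `true_limits` and its `unit` parameter are hypothetical; `unit=1` assumes the data are recorded in whole numbers.

```python
# Class limits from the home-run example later in this unit.
class_limits = [(124, 145), (146, 167), (168, 189), (190, 211), (212, 233)]

def true_limits(lower, upper, unit=1):
    """Extend each limit by half a measurement unit in both directions."""
    half = unit / 2
    return (lower - half, upper + half)

boundaries = [true_limits(lo, hi) for lo, hi in class_limits]

for (lo, hi), (tlo, thi) in zip(class_limits, boundaries):
    print(f"{lo} - {hi}  ->  {tlo} - {thi}")
```

After the adjustment, the upper true limit of each class equals the lower true limit of the next one, so there are no gaps between classes.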

Graphical presentation

Importance of graphical presentation:
• Diagrams are more attractive than mere figures
• They give a quick overall impression of the data
• They are easier to remember than mere figures
• They facilitate comparison
• They are used to understand patterns and trends

Types of graphs
• Categorical data
– Bar chart
– Pie-chart
• Quantitative data
– Histogram
– Frequency polygon
– Ogive
– Stem-and-leaf plot
– Box plot
– Scatter diagram
Bar chart

Definition:
• A graph made of bars whose heights represent the frequencies of the respective categories is called a bar chart (bar graph).

• Used to display the frequencies contained in the frequency distribution of a categorical variable

• It is used with categorical data

• Each bar represents one category, and its height is the frequency or relative frequency
o y-axis: frequency, relative frequency or percentage
o x-axis: category
Rules
o Bars should be separated
o The gap between bars should be uniform
o All bars should be of the same width
o All bars should rest on the same line, called the base
o It is very important that the y-axis begins at 0
o Label both axes clearly
Simple bar chart
The simple bar chart is appropriate if only one variable is to be shown.

[Bar chart: percentage (y-axis, 0 to 60) by trimester of first ANC booking (first, second, third trimester); bar labels 53.9, 40.6 and 5.5]
Figure 1: First ANC booking time among pregnant women in Debre Berhan Town, Ethiopia, 2017
Clustered bar chart

[Clustered bar chart: percent (y-axis) by residence (urban, rural), with bars for "First day" and "Second and subsequent days"; bar labels 90.0, 74.3, 25.7 and 10.0]
Figure 2: Timing of health care seeking for U5 children by place of residence, in Jeldu District, Ethiopia
Pie-chart
• A pie chart is a circle that is divided into sections according to the percentage of frequencies in each category of the distribution.

• Used to show the relative frequencies of a single categorical variable.

• Each slice of the pie corresponds to the relative frequency of one category of the variable.

Steps to construct a pie-chart
• Construct a frequency table
• Change the frequencies into percentages (P)
• Change the percentages into degrees, where: degrees = (P / 100) × 360°
• Draw a circle and divide it accordingly

Example
[Pie chart: circulatory system 42%, neoplasms 30%, respiratory system 13%, digestive system 8%, others 4%, injury and poisoning 3%]

Figure 3: Distribution of cause of death for females in England and Wales, 1989
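
The percentage-to-degrees step can be sketched with the slice percentages read from Figure 3. The pairing of labels with percentages is reconstructed from the figure labels and should be treated as an assumption; the conversion used is degrees = (P / 100) × 360.

```python
# Slice percentages as read from Figure 3 (label pairing is an assumption).
percentages = {
    "Circulatory system": 42,
    "Neoplasms": 30,
    "Respiratory system": 13,
    "Digestive system": 8,
    "Others": 4,
    "Injury and poisoning": 3,
}

# Convert each percentage P to a central angle: P / 100 of the full 360 degrees.
angles = {name: pct * 360 / 100 for name, pct in percentages.items()}

for name, angle in angles.items():
    print(f"{name:<22} {percentages[name]:>3}%  {angle:6.1f} degrees")
```

Because the percentages add up to 100, the angles add up to 360 degrees, which is a quick check before drawing the circle.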
Histogram

• Histograms are frequency distributions with continuous class intervals that have been turned into graphs.

To construct a histogram, we draw:
A) The interval boundaries on a horizontal line, and
B) The frequencies on a vertical line

• In a histogram, the bars are drawn adjacent to each other, touching, to show the underlying continuity of the data.

• In a histogram, the area of each bar is proportional to the frequency of observations in the interval.

Example
Using the following frequency distribution of the home runs hit by Major League Baseball teams during the 2002 season, construct the histogram.

Total Home Runs   f
124 – 145         6
146 – 167         13
168 – 189         4
190 – 211         4
212 – 233         3
Class boundaries, frequency and cumulative frequency distributions

Total Home Runs   Class Boundaries   Frequency   Cumulative frequency
124 – 145         123.5 – 145.5      6           6
146 – 167         145.5 – 167.5      13          19
168 – 189         167.5 – 189.5      4           23
190 – 211         189.5 – 211.5      4           27
212 – 233         211.5 – 233.5      3           30
Total                                30
[Histogram: frequency (y-axis, 0 to 15) against the class boundaries 123.5, 145.5, 167.5, 189.5, 211.5, 233.5 (x-axis)]
Figure 4: Total home runs hit by all players of each of the 30 Major League Baseball teams during the 2002 season
Frequency polygon

• Frequency polygon: a graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines.

• The total area under the frequency polygon is equal to the area under the histogram.

[Frequency polygon: frequency (y-axis, 0 to 15) against the class midpoints 134.5, 156.5, 178.5, 200.5, 222.5 (x-axis)]
Figure 5: Total home runs hit by all players of each of the 30 Major League Baseball teams during the 2002 season
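
The x-coordinates of the frequency polygon are the class midpoints. A minimal sketch, assuming the class boundaries and frequencies of the home-run table are entered by hand:

```python
# Class boundaries and frequencies from the home-run table.
boundaries = [
    (123.5, 145.5), (145.5, 167.5), (167.5, 189.5),
    (189.5, 211.5), (211.5, 233.5),
]
frequencies = [6, 13, 4, 4, 3]

# Mid-point: the value midway between the lower and upper limits of a class.
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]

# Each polygon vertex sits above a midpoint at the height of the class frequency.
polygon_points = list(zip(midpoints, frequencies))

for x, y in polygon_points:
    print(f"({x}, {y})")
```

The computed midpoints match the values on the horizontal axis of Figure 5.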

Ogive

• Ogive: a curve drawn for the cumulative frequency distribution by joining, with straight lines, the points marked above the upper boundaries of the classes at heights equal to the cumulative frequencies of the respective classes.
It is obtained as follows:
On the vertical axis, we mark the cumulative frequency.
On the horizontal axis, we mark the upper boundaries of all classes.
However, the lower boundary of the first class will be the starting point.
Then, a smooth curve is drawn joining all these points.

Class boundaries, frequency and cumulative frequency distributions

Total Home Runs   Class Boundaries   Frequency   Cumulative frequency
124 – 145         123.5 – 145.5      6           6
146 – 167         145.5 – 167.5      13          19
168 – 189         167.5 – 189.5      4           23
190 – 211         189.5 – 211.5      4           27
212 – 233         211.5 – 233.5      3           30
Total                                30
[Ogive: cumulative frequency (y-axis, up to 30) against the class boundaries 123.5, 145.5, 167.5, 189.5, 211.5, 233.5 (x-axis)]
Figure 6: Total home runs hit by all players of each of the 30 Major League Baseball teams during the 2002 season
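
The points of the ogive can be listed with a short sketch. It assumes the boundaries and frequencies of the home-run table, and starts the curve at the lower boundary of the first class with a cumulative frequency of 0, as described above.

```python
from itertools import accumulate

# Class boundaries and frequencies from the home-run table.
boundaries = [
    (123.5, 145.5), (145.5, 167.5), (167.5, 189.5),
    (189.5, 211.5), (211.5, 233.5),
]
frequencies = [6, 13, 4, 4, 3]

# Starting point: lower boundary of the first class, cumulative frequency 0.
start_point = (boundaries[0][0], 0)

# Remaining points: upper boundary of each class against its cumulative frequency.
upper_boundaries = [hi for _, hi in boundaries]
ogive_points = [start_point] + list(zip(upper_boundaries, accumulate(frequencies)))

for x, y in ogive_points:
    print(f"({x}, {y})")
```

Since cumulative frequencies never decrease, the ogive rises from 0 at the first boundary to the total number of observations at the last.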
Stem-and-leaf plot
⬥ Another common tool for visually displaying continuous data is the "stem-and-leaf" plot
⬥ Allows for easier identification of individual values in the sample
⬥ Very similar to a histogram
⬥ Most effective with relatively small data sets
⬥ Helps to understand the nature of the data
• Can be constructed as follows:
(1) Separate each data point into a stem component and a leaf component. The stem component consists of the number formed by all but the rightmost digit of the number, and the leaf component consists of the rightmost digit. Thus the stem of the number 483 is 48, and the leaf is 3.

(2) Write the smallest stem in the data set in the
[Table: Data of birth weights from 100 consecutive deliveries]

[Figure: Stem-and-leaf plot (Stem | Leaves) for the birth weight data (N=100)]

Stem-and-leaf plot can be constructed as follows (continued):
(3) Write the second stem, which equals the first stem + 1, below the first stem
(4) Continue with step 3 until you reach the largest stem in the data set
(5) Draw a vertical bar to the right of the column of stems
(6) For each number in the data set, find the appropriate stem and write the leaf to the right of the vertical bar
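
The construction steps can be sketched in code. Because the birth weight data are not reproduced in this text, the sketch uses the ages of the 50 students from Example 1 instead; that substitution, and the variable names, are assumptions made for illustration.

```python
from collections import defaultdict

# Ages of the 50 students from Example 1 (stand-in for the birth weight data).
ages = [
    21, 19, 24, 25, 29, 34, 26, 27, 37, 33,
    18, 20, 19, 22, 19, 19, 25, 22, 25, 23,
    25, 19, 31, 19, 23, 18, 23, 19, 23, 26,
    22, 28, 21, 20, 22, 22, 21, 20, 19, 21,
    25, 23, 18, 37, 27, 23, 21, 25, 21, 24,
]

# Step 1: split each value into a stem (all but the rightmost digit)
# and a leaf (the rightmost digit).
leaves_by_stem = defaultdict(list)
for value in sorted(ages):
    stem, leaf = divmod(value, 10)
    leaves_by_stem[stem].append(leaf)

# Steps 2 to 6: list every stem from smallest to largest, draw a vertical
# bar, and write the leaves to the right of the bar.
smallest, largest = min(leaves_by_stem), max(leaves_by_stem)
for stem in range(smallest, largest + 1):
    leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
    print(f"{stem} | {leaves}")
```

Sorting the values first keeps the leaves in increasing order within each stem, which makes the plot easier to read.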

Thank you
