Basic Biostatistics
Faisal Mushtaq / Muhammad Umar Farooq
Institute of Public Health, Lahore
What is Statistics?
Statistics deals with the collection
of data, its presentation, analysis
and interpretation.
Types of Statistics
• Descriptive Statistics:
It deals with the collection,
presentation and description of the data.
• Inferential Statistics:
It deals with the analysis of the data and
the drawing of conclusions from it.
Biostatistics
It is the application of statistical knowledge
to biology, medicine, nursing and
health-related professions.
Data
A set of observations collected on a certain
characteristic of interest.
Primary Data: Original data that has been
collected specially for the purpose in mind and
has not undergone any statistical treatment.
Secondary Data: The data that has undergone
some sort of statistical treatment, either
descriptive or inferential.
Types of Data
• Qualitative
• Quantitative
Qualitative Data
It is a categorical or non-numerical
description of the data; no numerical
measurement is involved.
• Example: Hair Color, Gender, Socio-economic
status, Disease Presence (Yes/No)
Quantitative Data
It is the numerical measurement of
characteristics under consideration.
Example: Age, Weight, Height, Temperature
Discrete Data
• Data whose values can only be integers or
whole numbers, so that the values change by
jumps or breaks. It is also referred to as
“Count Data”.
• Example: Number of persons in a family,
number of rooms in house, number of
patients of a certain disease.
Continuous Data
• Data whose values can be any value,
fractional or integral, within a given interval,
that is, all possible values without gaps in
that interval. It is also referred to as
“Measurement Data”.
• Example: Age of a person, Temperature on a
specific day, weight of a commodity.
Types of Scales
• Nominal
• Ordinal
• Interval
• Ratio
Nominal Scale
• Classification or grouping of the observations
into mutually exclusive qualitative categories
or classes constitutes a Nominal Scale. It
describes some quality, and the categories
have no specific order.
• Example: Disease Status (Yes/No), Gender
(Male/Female), Eye Color (Black, Brown,
Hazel)
Ordinal Scale
• It includes the characteristics of a nominal scale
and in addition has the property of ordering or
ranking of measurements.
• Example: Performance of students (Excellent,
Good, Fair, Poor), Opinion about anything
(Strongly Agree, Agree, Disagree, Strongly
Disagree), Pain Score.
Interval Scale
• A measurement scale possessing a constant
interval size but not a true zero point is said
to be an Interval Scale.
• Example: Temperature (in Celsius or
Fahrenheit), pH Scale
Ratio Scale
• It has the characteristics of an interval scale,
and also has a true zero point.
• Example: Length, weight, height, distance
Variable
• Variable:
A variable is a characteristic or
condition that can change or take on different
values.
• Example:
Age, height, weight, temperature,
number of students in a class, number of
AIDS patients, etc.
Basic Terminologies
• Population:
The entire group of individuals or
objects or events is called the population.
• Sample:
It is a subset of the population;
results obtained from it are generalized to
the population.
Basic Terminologies contd.
• Parameter:
A numerical value summarizing the
data of an entire population.
• Statistic:
A numerical value summarizing the
sample data.
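The difference between a parameter and a statistic can be sketched in Python. This is only an illustration: the population of heights below is simulated (hypothetical values, not real data), and the names population, sample, parameter and statistic are chosen here for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population: simulated heights (cm) of 10,000 people.
population = [random.gauss(170, 10) for _ in range(10_000)]

# A sample is a subset of the population (here 100 people drawn at random).
sample = random.sample(population, 100)

parameter = statistics.mean(population)  # summarizes the entire population
statistic = statistics.mean(sample)      # summarizes the sample only

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

The sample mean is usually close to, but not exactly equal to, the population mean; it is the value we would generalize to the population.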
Displaying Data
• Tables
• Graphs/Charts
Collected Data
Example (Qualitative Data):
A sample of 100 university students was asked
to state how they preferred to spend their
Saturday evening. The answers were:
“cinema”, “theatre”, “theatre”, “restaurant”,
“studying”, “cinema”, “cinema”, “musical
concert”, “watching TV”, “theatre”, “cinema”,
“restaurant”, “restaurant”, …, “restaurant”
Tabulation of Data
Preferred leisure    Number of students
Cinema               31
Musical concert      23
Watching TV          19
Theatre              5
Restaurant           20
Studying             2
Total                100
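As a minimal sketch, the tabulation above can be reproduced in Python with collections.Counter. The counts are taken from the table; the list of raw answers is rebuilt from those counts only for illustration, because the full original list is not shown.

```python
from collections import Counter

# Counts taken from the table above. The raw answer list is rebuilt from
# them for illustration; the original order of the answers is not known.
counts = {
    "Cinema": 31,
    "Musical concert": 23,
    "Watching TV": 19,
    "Theatre": 5,
    "Restaurant": 20,
    "Studying": 2,
}
answers = [activity for activity, n in counts.items() for _ in range(n)]

frequency = Counter(answers)     # tally how often each category occurs
total = sum(frequency.values())  # number of students

# Print category, frequency and relative frequency, largest first.
for activity, n in frequency.most_common():
    print(f"{activity:<16}{n:>4}{n / total:>8.0%}")
print(f"{'Total':<16}{total:>4}")
```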
Simple Bar Charts
• It consists of horizontal or vertical bars of
equal widths and lengths proportional to their
frequencies.
• Bar charts are used for a discrete set of data or
variables.
Bar Charts
EXAMPLE: Average maximum and minimum monthly temperatures for
Karachi (in degrees Celsius) are given in the following table.
Bar chart of the average maximum monthly temperatures for Karachi (in
degrees Celsius)
Multiple Bar Charts
• It shows two or more characteristics of a
common variable in the form of grouped bars,
whose lengths are proportional to their
frequencies.
• Each characteristic is shaded or colored
differently to aid identification.
• It is a good device for comparing two or
more kinds of information on the same variable.
Pie Charts
Drawing PIE CHARTS
• Calculate the relative frequency or percentage of
observations in each class.
• Calculate the angle corresponding to each class by
multiplying its relative frequency by 360°.
• Draw the pie chart.
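The first two steps can be sketched in Python, using the leisure counts from the tabulated example (31, 23, 19, 5, 20 and 2 out of 100 students). The variable names are chosen here for illustration.

```python
# Counts from the tabulated leisure example (100 students in total).
counts = {
    "Cinema": 31,
    "Musical concert": 23,
    "Watching TV": 19,
    "Theatre": 5,
    "Restaurant": 20,
    "Studying": 2,
}
total = sum(counts.values())

# Step 1: relative frequency of each class.
relative = {activity: n / total for activity, n in counts.items()}

# Step 2: angle of each slice = relative frequency x 360 degrees.
angles = {activity: rf * 360 for activity, rf in relative.items()}

for activity in counts:
    print(f"{activity:<16}{relative[activity]:>6.0%}{angles[activity]:>8.1f} deg")
```

For example, Cinema has a relative frequency of 0.31, so its slice spans 0.31 × 360 = 111.6 degrees; all the angles add up to 360.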
[Pie chart of preferred leisure: Cinema 31%, Music Concert 23%,
Watching Television 19%, Theater 5%, Restaurant 20%, Studying 2%]
Histogram
[Histogram "Internet Usage": x-axis Time online (in minutes), with class
boundaries from 6.5 to 90.5 in steps of 12; y-axis Frequency]
You can see that more than half of the subscribers spent
between 19 and 54 minutes on the Internet during their most
recent session.
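A histogram is built by counting how many values fall into each class interval. The sketch below uses the class boundaries shown on the axis (6.5 to 90.5 in steps of 12), but the list of session times is hypothetical, made up for illustration, and is not the data behind the figure.

```python
from bisect import bisect_right

# Class boundaries as on the histogram axis: 6.5, 18.5, ..., 90.5.
boundaries = [6.5 + 12 * i for i in range(8)]

# Hypothetical session times in minutes (NOT the data from the slide).
times = [7, 12, 20, 25, 29, 33, 35, 40, 41, 44, 50, 56, 70, 88]

# Count the values falling between each pair of consecutive boundaries.
freq = [0] * (len(boundaries) - 1)
for t in times:
    i = bisect_right(boundaries, t) - 1  # index of the class containing t
    if 0 <= i < len(freq):
        freq[i] += 1

# Text histogram: one row per class, bars touch because classes are adjacent.
for lower, upper, n in zip(boundaries, boundaries[1:], freq):
    print(f"{lower:>5} to {upper:<5} {'#' * n}")
```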
Bar Chart Vs. Histogram
Bar Chart:
• Used for Qualitative Data
• There are spaces between the bars
• The length of each bar is proportional to the
frequency of each category.
Histogram:
• Used for Quantitative Data
• There are no spaces between consecutive bars
• The area of each bar is proportional to the
frequency of its class interval.
Data Presentation (Quantitative)
The measure of location or central tendency is
a central value that the data values group
around. It gives an average value.
The measure of dispersion shows how the data
is spread or scattered around the mean.
The measure of skewness shows how symmetrical
(or not) the distribution of data values is.
• Mean = Sum of all values / number of values.
– Often the preferred measure of central tendency, as it takes
all values into account.
– Easily affected by any extreme value/outlier.
– Mean can only be defined on the interval and ratio levels of
measurement
• Median is the midpoint of the data when it is arranged in order.
– Best when the data set has extreme values or is skewed.
– Median is defined on the ordinal, interval and ratio levels of
measurement
• Mode is the most frequently occurring value in the data.
– Best for a nominal data set, where both median and mean are
undefined.
– Mode is defined on the nominal, ordinal, interval and ratio levels of
measurement
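The three measures can be computed with Python's standard statistics module. The ages below are hypothetical values chosen for illustration; the second list adds one extreme value to show its effect on the mean and the median.

```python
import statistics

# Hypothetical ages in years (illustrative values only).
ages = [20, 22, 22, 23, 24, 25, 32]
ages_with_outlier = ages + [96]  # same data plus one extreme value

for label, data in [("Without outlier", ages), ("With outlier", ages_with_outlier)]:
    print(label)
    print("  mean  :", statistics.mean(data))
    print("  median:", statistics.median(data))
    print("  mode  :", statistics.mode(data))

# The mode is the only one of the three defined for nominal data.
leisure = ["cinema", "theatre", "cinema", "restaurant", "cinema"]
print("Mode of nominal data:", statistics.mode(leisure))
```

Adding the single extreme value moves the mean from 24 to 33, while the median only moves from 23 to 23.5 and the mode stays at 22.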
Measures of Dispersion
Measures of variation:
• Range
• Variance
• Standard Deviation
• Coefficient of Variation
Measures of variation give
information on the spread or
variability or dispersion of the
data values.
Same centre,
different variation
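The idea of "same centre, different variation" can be sketched with two hypothetical data sets that share a mean of 50. The sample formulas (divisor n - 1) are used here as an assumption, since the slides do not say whether the sample or population version is intended; the helper describe is a name chosen for this example.

```python
import statistics

def describe(values):
    """Return the four measures of dispersion for a list of numbers."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "sd": sd,
        "cv_percent": sd / mean * 100,  # coefficient of variation, in %
    }

# Two hypothetical data sets with the same mean (50) but different spread.
narrow = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

for name, data in [("narrow", narrow), ("wide", wide)]:
    d = describe(data)
    print(f"{name:<7} range={d['range']:>3}  variance={d['variance']:>6.1f}  "
          f"sd={d['sd']:>6.2f}  cv={d['cv_percent']:>5.1f}%")
```

Both sets have the same mean, but the sample variance is 2.5 for the first and 250 for the second, so the second is far more spread out.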
Distribution Shape
Any Questions?
THANKS
FOR YOUR
ATTENTION…