University of Gondar
College of Medicine and Health Science
Department of Epidemiology and Biostatistics
Basic Biostatistics
Wullo S. (MPH)
9/30/21 1
Chapter One
1.1 Introduction to Biostatistics
Objectives of the chapter
After completing this chapter, the student will be able to:
– Define statistics and biostatistics
– Identify the branches of biostatistics
– Enumerate the importance and limitations of biostatistics
– Define and identify the different types of data and
understand why we need to classify variables
Definition and classification of Biostatistics
Statistics is the science of
collecting,
organizing,
presenting,
analysing, and drawing conclusions (inferences) from
data for the purpose of making decisions.
Biostatistics: The application of statistical methods
to the fields of biological and health sciences.
Classification of Biostatistics
Descriptive biostatistics
A statistical method that is concerned with the collection,
organization, summarization, and analysis of data from a
sample of a population.
Inferential biostatistics
A statistical method that is concerned with drawing
conclusions (inferences) about a particular population by
selecting and measuring a random sample from the population.
Biostatistics
Descriptive statistics: collecting, organizing, summarizing, and presenting data
Inferential statistics: making inferences, hypothesis testing, determining relationships, making predictions
Descriptive Biostatistics
• Some statistical summaries which are especially common in
descriptive analyses are:
Measures of central tendency
Measures of dispersion
Measures of association
Cross-tabulation, contingency table
Histogram
Quantile, Q-Q plot
Scatter plot
Box plot
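Several of the summaries listed above can be computed with Python's standard library. The sketch below is only an illustration: the blood pressure readings and the sex and smoking categories are hypothetical values made up for the example, and the variable names are assumptions, not part of the course material.

```python
import statistics
from collections import Counter

# Hypothetical diastolic blood pressure readings (mmHg) for ten patients.
dbp = [70, 72, 75, 78, 80, 80, 82, 85, 88, 90]
# Hypothetical categorical data for the same ten patients.
sex = ["F", "M", "F", "F", "M", "M", "F", "M", "F", "M"]
smoker = ["no", "yes", "no", "no", "yes", "no", "no", "yes", "no", "no"]

# Measures of central tendency.
mean_dbp = statistics.mean(dbp)
median_dbp = statistics.median(dbp)
mode_dbp = statistics.mode(dbp)

# Measures of dispersion.
range_dbp = max(dbp) - min(dbp)
sd_dbp = statistics.stdev(dbp)  # sample standard deviation

# Quartiles: three cut points dividing the data into four parts.
q1, q2, q3 = statistics.quantiles(dbp, n=4)

# Cross-tabulation: count each (sex, smoker) combination.
crosstab = Counter(zip(sex, smoker))

print(mean_dbp, median_dbp, mode_dbp, range_dbp)
print(round(sd_dbp, 2), q1, q2, q3)
for (s, k), count in sorted(crosstab.items()):
    print(s, k, count)
```

The graphical summaries (histogram, Q-Q plot, scatter plot, box plot) need a plotting library and are not shown here.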
Inferential Biostatistics
1.2 Stages in statistical investigation
There are five stages or steps in any statistical investigation.
1. Collection of data
The process of obtaining measurements or counts.
2. Organization of data
Includes editing, classifying, and tabulating the data
collected.
3. Presentation of data
Gives an overall view of what the data actually look like.
Facilitates further statistical analysis.
Can be done in the form of tables and graphs or diagrams.
4. Analysis of data
Digs out useful information for decision making.
It involves extracting relevant information from the data
(like mean, median, mode, range, variance…).
5. Interpretation of data
Concerned with drawing conclusions from the data
collected and analyzed, and giving meaning to the analysis
results.
This is a difficult task that requires a high degree of skill and
experience.
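The five stages can be walked through on a small data set. The following is a minimal sketch, assuming a hypothetical list of clinic visit counts; the data, the editing rule, and the variable names are illustrative only.

```python
import statistics
from collections import Counter

# 1. Collection: hypothetical counts of clinic visits for 12 patients.
visits = [2, 0, 3, 1, 2, 2, 4, 1, 0, 2, 3, 1]

# 2. Organization: edit (drop impossible negative values), then tabulate.
clean = [v for v in visits if v >= 0]
freq = Counter(clean)

# 3. Presentation: a simple frequency table.
print("Visits  Patients")
for value in sorted(freq):
    print(f"{value:>6}  {freq[value]:>8}")

# 4. Analysis: summary measures named in the text.
summary = {
    "mean": statistics.mean(clean),
    "median": statistics.median(clean),
    "mode": statistics.mode(clean),
    "range": max(clean) - min(clean),
    "variance": statistics.variance(clean),  # sample variance
}

# 5. Interpretation is left to the analyst; the code only reports.
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The code covers stages 1 to 4 mechanically. Stage 5 still depends on subject-matter knowledge, which is why the text calls it the difficult step.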
1.3 Definition of Some Basic terms
Population: The complete set of possible measurements for which
inferences are to be made.
Census: A complete enumeration of the population. In most real
problems this cannot be achieved, hence we take a sample.
Sample: A sample from a population is the set of measurements that are
actually collected in the course of an investigation.
Parameter: A characteristic or measure obtained from a population.
Statistic: A statistic (as distinct from the field of Statistics) refers to a
numerical quantity computed from sample data (e.g. the mean, the
median, the maximum).
Parameter and statistic
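The distinction can be illustrated with a simulated population. This is a sketch under stated assumptions: the population of 1,000 heights is generated artificially, and the mean of 165 cm, the spread of 8 cm, and the sample size of 50 are all hypothetical choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 1,000 adults.
population = [random.gauss(165, 8) for _ in range(1000)]

# Parameter: a measure computed from the whole population.
population_mean = statistics.mean(population)

# Sampling: select a simple random sample of size n without replacement.
n = 50
sample = random.sample(population, n)

# Statistic: the same measure computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

In practice the parameter is usually unknown because a census is not feasible, so the statistic is used to estimate it.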
Sampling: The process or method of sample selection from the
population.
Sample size: The number of elements or observations to be
included in the sample.
Variable: A characteristic or attribute that can assume different
values in different persons, places, or things.
Some examples of variables include:
diastolic blood pressure,
heart rate,
height,
weight
Data: A collection of facts, values, observations, or
measurements that the variables can assume.
Uses of statistics:
The main function of statistics is to enlarge our knowledge of
complex phenomena. The following are some uses of statistics:
Estimating unknown population characteristics.
Formulating and testing hypotheses.
Studying the relationship between two or more variables.
Forecasting future events.
Measuring the magnitude of variations in data.
Providing a technique for comparison.
Limitations of statistics
As a science statistics has its own limitations. The following are
some of the limitations:
It deals only with quantitative information.
It deals only with aggregates of facts and not with individual data
items.
Statistical results are only approximately, not mathematically,
correct.
Statistics can be easily misused and therefore should be used
by experts.
1.5 Types of Variables and Measurement Scales
A variable is a characteristic or attribute that can assume
different values in different persons, places, or things.
Examples:
age,
diastolic blood pressure,
heart rate,
the height of adult males,
the weights of preschool children,
gender of Biostatistics students,
marital status of instructors at University of Gondar,
ethnic group of patients
A. Depending on the characteristic of the measurement, a variable can be:
Qualitative (Categorical) variable
A variable or characteristic which cannot be measured in
quantitative form but can only be identified by name or category,
for instance place of birth, ethnic group, type of drug, stages of
breast cancer (I, II, III, or IV), degree of pain (minimal, moderate,
severe or unbearable).
The categories should be clear cut, not overlapping, and cover all the
possibilities. For example, sex (male or female), vital status (alive or
dead), disease stage (depends on disease), ever smoked (yes or no).
Quantitative (Numerical) variable
A variable that can be measured and expressed numerically.
Example: survival time, systolic blood pressure, number of
children in a family, height, age, body mass index.
Quantitative variables can be of two types:
Discrete Variables
Have a set of possible values that is either finite or
countably infinite.
The values of a discrete variable are usually whole
numbers.
Numerical discrete data occur when the observations are
integers that correspond with a count of some sort.
Some common examples are:
Number of pregnancies,
The number of bacteria colonies on a plate,
The number of cells within a prescribed area upon microscopic
examination,
The number of heart beats within a specified time interval,
A mother’s number of births (parity) and
pregnancies,
The number of episodes of illness a patient experiences during
some time period, etc.
Continuous Variables
A continuous variable has a set of possible values including
all values in an interval of the real line.
No gaps between possible values.
Each observation theoretically falls somewhere along a
continuum.
Example: body mass index, height, blood pressure, serum
cholesterol level, weight, age etc.
Observations are not restricted to certain numerical
values; they are often measurements (e.g., height, weight, age).
Continuous data are used to report a measurement of the
individual that can take on any value within an acceptable
range.
Nominal Scale
Level of measurement which classifies data into mutually exclusive, all
inclusive categories in which no order or ranking can be imposed on
the data.
Assign subjects to groups or categories
No order or distance relationship
No arithmetic origin
Only count numbers in categories
Only present percentages of categories
Chi-square is the most often used test of statistical significance
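Counting, percentages, and the chi-square statistic can all be computed directly from category counts. The sketch below assumes a hypothetical 2x2 table of sex by smoking status; the counts are made up, and the statistic is the plain Pearson chi-square without a continuity correction.

```python
# Hypothetical 2x2 table of counts: rows = sex, columns = ever smoked.
observed = [
    [30, 20],  # male:   yes, no
    [10, 40],  # female: yes, no
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Percentages within each row, the usual summary for nominal data.
row_percent = [[100 * cell / total for cell in row]
               for row, total in zip(observed, row_totals)]

# Pearson chi-square: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(row_percent)
print(round(chi_square, 2))
```

For a 2x2 table the statistic has 1 degree of freedom, so it is compared with the critical value 3.84 at the 5% significance level.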
Nominal Scale
Other examples:
Sex
Marital status
Geographic location
Ethnic group
Brand choice
Social status
Days of the week (months)
Seasons
Types of restaurants
Religion
Job type: executive, technical, clerical
Categories can be coded numerically (e.g., one category coded as "0" and the other as "1").
Ordinal Scale
Level of measurement which classifies data into categories that can be
ranked. However, precise differences between the ranks do not exist.
Classifies data according to some order or rank
With ordinal data, it is fair to say that one
response is greater or less than another.
E.g. if people were asked to rate the hotness of 3 chili
peppers, a scale of "hot", "hotter" and "hottest"
could be used. Values of "1" for "hot", "2" for
"hotter" and "3" for "hottest" could be assigned.
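The chili example can be sketched in code to show what the codes 1, 2 and 3 do and do not allow. The seven ratings below are hypothetical, and the variable names are illustrative.

```python
import statistics

# Codes from the chili example: they carry order, not distance.
codes = {"hot": 1, "hotter": 2, "hottest": 3}

# Hypothetical ratings given by seven tasters.
ratings = ["hot", "hottest", "hotter", "hotter", "hot", "hottest", "hotter"]

# Relational operations are meaningful: sort and compare by rank.
ranked = sorted(ratings, key=codes.get)
is_hotter = codes["hottest"] > codes["hot"]

# The median is a suitable summary because it depends only on order.
# median_low returns an actual data value, so it maps back to a label.
median_code = statistics.median_low([codes[r] for r in ratings])
labels = {code: label for label, code in codes.items()}
median_label = labels[median_code]

print(ranked)
print(is_hotter, median_label)
```

A mean of the codes could be computed mechanically, but it would treat the gap between "hot" and "hotter" as equal to the gap between "hotter" and "hottest", which an ordinal scale does not guarantee.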
Ordinal Scales
• Arithmetic operations are not applicable but relational
operations are applicable.
• Ordering is the sole property of ordinal scale.
Examples:
Letter grades (A, B, C, D, F).
Rating scales (Excellent, Very good, Good, Fair, Poor).
Military rank.
Interval Scales
• Level of measurement which classifies data that can be ranked
and differences are meaningful. However, there is no meaningful
zero, so ratios are meaningless.
• All arithmetic operations except division are applicable.
• Relational operations are also possible.
Examples:
IQ
Temperature in °F.
Interval Scale
Numerically equal distances on the scale represent equal values in
the characteristic being measured. An interval scale contains all the
information of an ordinal scale, but it also allows you to compare the
differences between objects.
An interval scale assumes that the measurements are made in equal units,
i.e. gaps between whole numbers on the scale are equal.
e.g. Fahrenheit and Celsius temperature scales
An interval scale does not have a true zero.
e.g. A temperature of "zero" does not mean that there
is no temperature; it is just an arbitrary zero point.
Permissible statistics: count/frequencies, mode, median,
mean, standard deviation
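The effect of an arbitrary zero can be checked by expressing the same temperatures in Celsius and Fahrenheit. The sketch below uses three hypothetical temperatures; the helper function name is an assumption made for the example.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# Hypothetical temperatures in Celsius.
a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Differences: equal gaps in Celsius remain equal gaps in Fahrenheit.
equal_gaps_c = (b_c - a_c) == (c_c - b_c)
equal_gaps_f = (b_f - a_f) == (c_f - b_f)

# Ratios: the "twice as hot" claim depends on the arbitrary zero point.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

print(equal_gaps_c, equal_gaps_f)
print(ratio_c, round(ratio_f, 2))
```

The ratio of the same two temperatures changes with the unit, so statements such as "20 degrees is twice as hot as 10 degrees" have no meaning on an interval scale, while comparisons of differences do.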
Ratio Scales
• Level of measurement which classifies data that can be ranked,
differences are meaningful, and there is a true zero. True ratios
exist between the different units of measure.
• All arithmetic and relational operations are applicable.
Examples: Weight
Height
Number of students
Age
Primary Scales of Measurement
Scale      What is recorded                         Values for three runners
Nominal    Numbers assigned to runners              4             81             9
Ordinal    Rank order of winners                    Third place   Second place   First place
Interval   Performance rating on a 0 to 10 scale    8.2           9.1            9.6
Ratio      Time to finish, in seconds               15.2          14.1           13.4
STATISTICS
SCALE      DESCRIPTIVE                      INFERENTIAL
Nominal    Percentages, Mode                Chi-square, Binomial test
Ordinal    Percentile, Median               Rank-order correlation, Friedman ANOVA
Interval   Range, Mean, SD                  Correlations, t-tests, ANOVA, Regression, Factor Analysis
Ratio      Geometric Mean, Harmonic Mean    Coefficient of Variation (CV)
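The ratio-scale statistics in the table can be computed with Python's `statistics` module (version 3.8 or later for `geometric_mean`). This is a minimal sketch on hypothetical body weights; the data and variable names are made up for illustration.

```python
import statistics

# Hypothetical body weights (kg); a ratio-scale variable with a true zero.
weights = [50.0, 60.0, 72.0, 80.0, 90.0]

mean_w = statistics.mean(weights)
sd_w = statistics.stdev(weights)  # sample standard deviation

# Statistics that rely on meaningful ratios.
geometric = statistics.geometric_mean(weights)
harmonic = statistics.harmonic_mean(weights)
cv_percent = 100 * sd_w / mean_w  # coefficient of variation, in percent

print(round(geometric, 2), round(harmonic, 2), round(cv_percent, 1))
```

These measures divide or multiply the values themselves, which is why they need a true zero and are not appropriate for interval data such as temperature in °F.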
Exercise
Categorize the following variables into nominal, ordinal, interval or
ratio
Gender
Grade (A, B, C, D and F)
Rating scale (poor, good, excellent)
Eye colour
Political affiliation
Religious affiliation
Ranking of tennis players
Major field
Nationality
Height
Weight
Time
Age
IQ
Temperature
Salary