Statistics and Experimental Design Overview

The document outlines key concepts in experimental design, statistics, and data collection, emphasizing the importance of precision and accuracy in experiments. It details types of statistics, sampling methods, and variables, including their classifications and implications for data analysis. Additionally, it discusses statistical terminology, mean, variance, standard deviation, and the significance of sample size in minimizing margin of error.

Uploaded by

Hassan Danwanka

LECTURE OUTLINE

Experiment

Test or Series of tests in which:

variables) are changed

response on the output

Examples:

DATA COLLECTION/RECORDING

Thus, knowledge of experimental design (design of experiments, DOE) is in high demand.

Experimental Precision and Accuracy:

Precision: the level of reproducibility of a given observation or analysis.

Accuracy: how close the obtained experimental data are to the expected or theoretical value.
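The two definitions above can be illustrated with a short sketch. It assumes that precision is summarised by the standard deviation of repeated measurements and accuracy by the difference between their mean and a reference value; the replicate readings and the reference value are hypothetical and are not taken from the lecture.

```python
from statistics import mean, stdev

# Hypothetical reference (expected or theoretical) value and replicate readings.
reference_value = 10.0
replicates = [10.4, 10.5, 10.4, 10.6, 10.5]

# Accuracy: how far the average reading sits from the reference value.
bias = mean(replicates) - reference_value

# Precision: how tightly the replicate readings agree with one another.
spread = stdev(replicates)

print(f"bias (accuracy)     = {bias:.3f}")
print(f"spread (precision)  = {spread:.3f}")
```

In this made-up case the readings agree closely with each other (good precision) but all sit above the reference value (poor accuracy).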

STATISTICS

Field of study that deals with data:

Data can be obtained through observation/experimentation

Biostatistics is the statistical analysis of biological data.

TYPES OF STATISTICS

1. Descriptive Statistics:

• Describes the central tendency of the data, i.e. the position or value of the center point, e.g. mode, mean and median.

• Describes the arrangement or dispersion of the data, i.e. how close together or far apart the values are.

• Describes the shape of the data distribution, e.g. skewness (tail of the probability curve), kurtosis (peak of the probability curve), symmetrical, logarithmic, bell-shaped, sigmoid, etc.

2. Inferential (Estimation) Statistics

• Estimates the population parameters using the sample's statistics

• Tests for significant differences between two or more populations

• Defines or tests relationships between variables


SOME STATISTICAL TERMINOLOGY

Population:

A population is the collection of objects or subjects under study, for example products, people, animals, microbes, proteins, explants, etc.

The population size, normally denoted by N, can vary from as small as tens to as large as billions or trillions.

Example of a study population: all car owners in Nigeria.

Sample:

A sample is a subset of the population under study.

Sample size is represented by (n)

For example, 16 randomly selected Sprague Dawley rats, or 5 species out of 50 isolates.

Mode: the data value or attribute with the highest observed occurrence.

Median: the middle value or attribute of the ordered distribution.


Variables

Variables are the objects (e.g. plant species, microbial isolates, etc.) or traits (characteristics such as yield, stability, transformation efficiency, etc.) which are being measured or observed.

For example: body weight, age, bird species, temperature, lipid content, molecular weight, enzyme activity, number of amplicons per PCR cycle.

Types of Variables

Variables with only two options, such as true or false, or yes or no, are called dichotomous variables.

A variable can also be dependent or independent:

• The dependent (response) variable, mostly plotted on the Y-axis, is the main trait under observation.

• The independent variables are the input factors or treatments administered, normally placed on the X-axis.

For example, in a study of the effect of chicken feed supplements on the animals' weight, the dependent variable is the animal weight while the independent variable is the supplemented feed type.

Analysis with one dependent variable is called univariate, while analysis with multiple variables is described as multivariate.

Variables can also be classified as either:

• Qualitative (categorical): mostly consist of text, and are classified further into:

  • Nominal: label variables, e.g. species isolates, cultivar type, gender, etc. We can only count these variables, nothing more.

  • Ordinal: we can count and rank them based on their levels or stages, e.g. education, disease conditions, reproduction, life cycle and growth profile, etc.

• Quantitative: numerical values that can be used to perform calculations, e.g. temperature, pressure, age, weight, etc. These are further classified into:

Interval: numerical variables with specified intervals; we can count them, rank them and perform numerical calculations such as differences. E.g. temperature: we can compute the difference between a normal body temperature of 37°C and that of a feverish body (42°C). Other examples are microbial culture age (between a 1-day-old culture and a 3-day-old one) and differences in experimental time. It should be noted that interval variables have no true zero point. Zero in them is just a reference, e.g. a zero-day microbial culture means the initial day, and 0°C does not mean no temperature, but rather a reference point at which water freezes.

Ratio: make up most of the quantitative variables; they can be counted and ranked, and we can take differences and ratios, because they do have a true zero point. E.g. weight, age, etc.

Discrete: normally obtained by counting, e.g. number of colonies per plate. They are always whole numbers, i.e. you can only have 10, 50, 40, 300, NOT 10.5, 9.1, 3.7…

Continuous: can take any value within a range (1, 1.4, 2, 4.5…); they are normally obtained by measurement, e.g. weight and age.

MEAN

The population mean ($\mu$) or sample mean ($\bar{X}$) is the expected value of the statistical data. In other words, it is the average of the data under analysis, i.e.:

$\mu = \frac{\sum X}{N}$ (Eq. 1)    $\bar{X} = \frac{\sum X}{n}$ (Eq. 2)

where N or n is the respective population or sample size, and $\sum X = X_1 + X_2 + X_3 + \dots + X_i$.

For example, the following data describe ISI publications from a certain university over a period of 5 years: 2009 (520), 2010 (313), 2011 (648), 2012 (587) and 2013 (130).

$\text{mean} = \frac{520 + 313 + 648 + 587 + 130}{5} = \mathbf{439.6}$
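As a quick check of the worked mean, here is a minimal sketch using the five values that appear in the sum above. The variable names are illustrative only; `statistics.mean` is from the Python standard library.

```python
from statistics import mean

# The five yearly publication counts used in the worked sum (2009 to 2013).
publications = [520, 313, 648, 587, 130]

# Eq. 1 by hand: the sum of the observations divided by their number.
manual_mean = sum(publications) / len(publications)

# The same quantity from the standard library.
library_mean = mean(publications)

print(manual_mean, library_mean)
```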

Variance (population variance σ², sample variance S²)

• Is a measure of how far a set of numbers is spread out from one another and from the mean. Variance is never negative.

• A variance of zero means all responses have equal values.

• A small variance means the response values are close to the mean and to each other.

• Alternatively, a large variance denotes that the responses are widely or unevenly spread out from each other and from the mean.

Empirically, population variance is calculated as follows:

1. Find the difference of each response value from the population mean.

2. Square the differences and sum them.

3. Divide the sum by the population size (N).

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

$\text{variance} = \frac{80.4^2 + (-126.6)^2 + 208.4^2 + 147.4^2 + (-309.6)^2}{5} = \mathbf{36700.24}$

Note: if the variance is calculated from sample data taken out of a total population, then the denominator will be n − 1.
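The three steps, and the note about the n − 1 denominator, can be sketched as follows. The data are the five values whose deviations appear in the worked variance; the variable names are illustrative.

```python
from statistics import pvariance, variance

publications = [520, 313, 648, 587, 130]
n = len(publications)
mu = sum(publications) / n

# Step 1: difference of each value from the mean.
deviations = [x - mu for x in publications]

# Step 2: square the differences and sum them.
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 3: divide by N for a population, or by n - 1 for a sample.
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

# The standard library offers both forms directly.
print(population_variance, pvariance(publications))
print(sample_variance, variance(publications))
```

Dividing by n − 1 gives a larger value than dividing by N, and the gap is most noticeable for small samples like this one.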

Standard Deviation (population SD = σ, sample SD = S)

It is the square root of the variance: $\sigma = \sqrt{\sigma^2}$.

It describes how the data are spread out from one another and from the mean.

$SD = \sqrt{36700.24} = 191.57$
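A minimal sketch of the same calculation, again with the five values from the worked example. `pstdev` and `stdev` are the standard library's population and sample forms.

```python
import math
from statistics import pstdev, stdev

publications = [520, 313, 648, 587, 130]

# Population SD: the square root of the population variance worked out above.
population_sd = math.sqrt(36700.24)

# Library equivalents: pstdev divides by N, stdev divides by n - 1.
library_population_sd = pstdev(publications)
library_sample_sd = stdev(publications)

print(round(population_sd, 2))
print(round(library_population_sd, 2), round(library_sample_sd, 2))
```

Unlike the variance, the standard deviation is expressed in the same units as the data, which makes it easier to interpret.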

In conclusion

r research to

others in a comprehensible way.


variables.

Standard deviation, explain the concept of statistics.

SAMPLING IN STATISTICS

Samples are parts of a population. For example, you might have a list of information on 100 people (your

“sample”) out of 10,000 people (the “population”).

• The sample size should be ideal, i.e. not too large or too small.

• Once you've decided on a sample size, you must use a sound technique to collect the sample from the population.

• Probability sampling uses randomization to select sample members, for example the chance of picking a red apple out of 100 apples in a basket.

• Non-probability sampling uses non-random techniques (i.e. the judgment of the researcher). You can't calculate the odds of any particular item, person or thing being included in your sample.

COMMON SAMPLING TYPES

Bernoulli samples: an independent Bernoulli trial (an experiment with two outcomes) is run on each population element, and elements are selected based on the trial outcomes. The sample sizes in Bernoulli samples follow a binomial distribution.
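A minimal sketch of Bernoulli sampling, assuming a hypothetical population of 1000 labelled units and a hypothetical inclusion probability of 0.1. Each unit gets its own independent trial, so the sample size varies from run to run.

```python
import random

rng = random.Random(7)            # fixed seed, chosen arbitrarily for repeatability
population = list(range(1000))    # hypothetical population of labelled units
p = 0.1                           # hypothetical inclusion probability per unit

# One independent Bernoulli trial per element: keep the element on "success".
bernoulli_sample = [unit for unit in population if rng.random() < p]

# The sample size is random (binomial); its expected value is N * p.
print(len(bernoulli_sample), "units selected; expected about", len(population) * p)
```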

Cluster samples: divide the population into groups (clusters). Then a random sample is chosen from the clusters. It's used when researchers don't know the individuals in a population but do know the population subsets or groups.

Systematic samples: you select sample elements from a list of participants at a regular interval.

Simple Random Sampling (SRS): Select items completely randomly, so that each element has the same

probability of being chosen as any other element.
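A minimal sketch of simple random sampling with the standard library, assuming a hypothetical population of 100 labelled units and a sample size of 16. `Random.sample` draws without replacement, so no unit is picked twice.

```python
import random

rng = random.Random(42)             # fixed seed so the draw is repeatable
population = list(range(1, 101))    # hypothetical population: units labelled 1..100

# Every unit has the same chance of ending up in the sample.
srs = rng.sample(population, k=16)

print(sorted(srs))
```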

Stratified sampling is like cluster sampling: you divide the main population into homogeneous subpopulations (strata). You then apply a simple random or a systematic method to choose a sample from each subpopulation independently.

Stratified randomization: a sub-type of stratified sampling used in clinical trials. First, divide patients into strata, then randomize with permuted block randomization.
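The basic stratified procedure (not the permuted block variant) can be sketched as below. The population, the two stratum labels and the per-stratum sample size are all hypothetical, and simple random sampling is used inside each stratum.

```python
import random
from collections import defaultdict

rng = random.Random(0)  # fixed seed for repeatability

# Hypothetical population: 100 units, each tagged with a stratum label.
population = [(unit_id, "young" if unit_id < 60 else "old") for unit_id in range(100)]

# Group the units by stratum.
strata = defaultdict(list)
for unit_id, label in population:
    strata[label].append(unit_id)

# Draw a simple random sample from each stratum independently.
per_stratum = 5
stratified_sample = {
    label: rng.sample(members, per_stratum) for label, members in strata.items()
}

print(stratified_sample)
```

Drawing the same number from each stratum is only one option; sampling in proportion to stratum size is another common choice.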

Bootstrap sample: select a smaller sample from a larger sample with bootstrapping (a type of resampling where you draw large numbers of smaller samples of the same size, with replacement, from a single original sample).

Maximum variation samples: used when you want to include extremes (like rich/poor or young/old).

Respondent-driven sampling: a chain-referral sampling method where participants recommend other people they know.
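The bootstrap idea described above can be sketched as follows. The original sample (hypothetical colony counts) and the number of resamples are assumptions made for illustration; each resample has the same size as the original and is drawn with replacement.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)                              # fixed seed for repeatability
original_sample = [12, 15, 9, 20, 14, 11, 17, 13]   # hypothetical colony counts
n_resamples = 1000

boot_means = []
for _ in range(n_resamples):
    # Draw with replacement, keeping the resample the same size as the original.
    resample = rng.choices(original_sample, k=len(original_sample))
    boot_means.append(mean(resample))

# The spread of the resampled means gives a rough idea of how much the
# sample mean would vary from one sample to another.
print(round(mean(boot_means), 2), round(stdev(boot_means), 2))
```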

SAMPLE ERROR

margin of error.

out of a 1000, and you got 19.357%. If the actual

percentage is 19.300%, the difference (19.357 − 19.300) of 0.057 percentage points, or about 0.3% of the actual value, is the margin of error.

$\text{margin of error} = \frac{1}{\sqrt{n}}$, where n is the sample size.
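A small sketch of how this rule of thumb behaves, reading the formula in the text as 1/√n (an assumption, since the radical sign appears to have been lost in extraction). For a proportion at roughly 95% confidence the worst-case margin is about 0.98/√n, which is where the approximation comes from. The function name is illustrative.

```python
import math

def approx_margin_of_error(n: int) -> float:
    """Rule-of-thumb margin of error, 1 / sqrt(n), for a sample of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)

# Quadrupling the sample size only halves the margin of error.
for n in (100, 400, 1000):
    print(n, round(approx_margin_of_error(n), 4))
```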

of error, except in cluster sampling, where it may increase due to similarities among cluster members.

Non-sampling error is another reason why there may be a difference between the sample and the population. It is due to poor data collection methods (like faulty instruments or inaccurate data recording), selection bias, nonresponse bias (where individuals don't want to or can't respond to a survey), or other mistakes in collecting the data.

The key is to avoid making the errors in the first place with a well-planned design of the survey or experiment.

Computing Sample Size

sampling so that marginal errors are minimized.

From the example presented in CI, we have seen that the margin of error is affected by the sample or replicate size: the larger the sample, the smaller the margin of error.

or can be

calculated.

be obtained either by:

a clinical study, you may be able to use a table published in Machin et al.'s Sample Size Tables for Clinical Studies, Third Edition.

u know (or don’t know)

about your population. If you know some parameters about your population (like a known standard deviation),

you can use the techniques below. If you don’t know much about your population, use Slovin’s formula.
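Slovin's formula is named above but not written out. As commonly stated it is n = N / (1 + N·e²), where N is the population size and e is the tolerated margin of error. The sketch below assumes that form; the function name and the example figures are illustrative and not taken from the lecture.

```python
import math

def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Sample size from Slovin's formula, n = N / (1 + N * e**2), rounded up."""
    if population_size <= 0:
        raise ValueError("population size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin of error must be between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

# Hypothetical example: a population of 10,000 with a 5% tolerated margin of error.
print(slovin_sample_size(10_000, 0.05))
```

Rounding up rather than down keeps the resulting margin of error at or below the tolerated value.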
