STATISTICS
Complete Detailed Short Notes
Part-I & Part-II | Total Marks: 200
Definitions • Formulas • Worked Examples • Q&A • Viva Guide
PART - I (Marks: 100)
Chapter 1: Introduction to Statistics
1.1 Definition & Scope
Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting
numerical data to assist in making more effective decisions.
— Croxton & Cowden: "Statistics may be defined as the science of collection, presentation,
analysis and interpretation of numerical data."
Scope of Statistics: Statistics is applied in Economics, Business Administration, Agriculture, Medicine,
Psychology, Social Sciences, Engineering, and Government planning.
1.2 Classification of Statistics
• Descriptive Statistics: Deals with collecting, summarizing, and presenting data (e.g., mean,
median, graphs).
• Inferential Statistics: Uses sample data to draw conclusions about a population (e.g.,
hypothesis testing, confidence intervals).
1.3 Types of Variables
Variable: Any characteristic or attribute that can take different values across individuals
or observations.
Type | Description | Examples
Qualitative (Categorical) | Non-numeric; describes categories or groups. | Gender, Blood Type, Color
Quantitative (Numerical) | Numeric; can be measured or counted. | Height, Weight, Temperature
Discrete | Countable, finite or countably infinite values. | Number of students, Cars sold
Continuous | Any value within an interval; infinite possibilities. | Height (cm), Time (sec)
Nominal | Categories with no natural order. | Religion, Nationality
Ordinal | Categories with a natural order but unequal intervals. | Education level, Ranks
Interval | Ordered; equal intervals; no true zero. | Temperature (°C), IQ scores
Ratio | Ordered; equal intervals; true zero exists. | Weight, Income, Distance
1.4 Primary vs Secondary Data
• Primary Data: Data collected firsthand by the researcher for a specific purpose (e.g., surveys,
experiments).
• Secondary Data: Data collected by someone else and used by the researcher (e.g., government
publications, census).
Key Examination Q&A
Q1. What is Statistics? Give its importance.
Ans: Statistics is the science of collecting, presenting, analyzing, and interpreting
numerical data. It is important in decision-making, research, government planning,
business, and medicine.
Q2. Distinguish between Descriptive and Inferential Statistics.
Ans: Descriptive Statistics summarizes and describes data using measures like mean
and graphs. Inferential Statistics uses sample data to make predictions or test
hypotheses about a population.
Q3. What is a Variable? Classify with examples.
Ans: A variable is a characteristic that varies across individuals. Types: Qualitative
(gender), Quantitative (age), Discrete (number of children), Continuous (weight),
Nominal, Ordinal, Interval, Ratio.
Q4. What is the difference between a population and a sample?
Ans: A population is the complete set of all elements under study. A sample is a subset
of the population selected for observation.
Q5. What are the four levels of measurement?
Ans: Nominal (categories only), Ordinal (order matters), Interval (equal spacing, no true
zero), Ratio (equal spacing, true zero exists).
Chapter 2: Presentation of Data
2.1 Tabular Presentation
• Simple Frequency Table: Lists values/classes and their frequencies.
• Relative Frequency Table: Shows proportion or percentage of each class.
• Cumulative Frequency Table: Running total of frequencies up to each class.
• Cross-tabulation (Contingency Table): Displays frequencies of two categorical variables
simultaneously.
2.2 Diagrammatic & Graphical Presentation
Chart / Diagram | Best Used For | Key Feature
Bar Chart (Simple/Multiple/Stacked) | Comparing categories; discrete data | Bars separated; width constant
Pie Chart | Showing parts of a whole; percentages | Central angle = (f/n) × 360°
Histogram | Continuous grouped data; frequency | Bars touching; area ∝ frequency
Frequency Polygon | Comparing distributions; continuous data | Connects midpoints of histogram bars
Ogive (Cumulative Frequency Curve) | Finding Median, Quartiles, Percentiles | S-shaped cumulative curve
Stem-and-Leaf Plot | Small datasets; preserves raw data | Stems = leading digits, leaves = trailing
Box Plot (Box-and-Whisker) | Showing spread, skewness, outliers | Q1, Median, Q3, Min, Max
Scatter Diagram | Relationship between two quantitative variables | Each point = one observation pair
2.3 Construction of Pie Chart
Example: Expenditure: Food 40%, Housing 25%, Education 20%, Others 15%.
Central angles: Food = 0.40×360 = 144°, Housing = 90°, Education = 72°, Others = 54°.
Key Examination Q&A
Q1. What is a Histogram? How does it differ from a Bar Chart?
Ans: A Histogram represents the frequency distribution of continuous data with touching
bars (area ∝ frequency). A Bar Chart has separated bars for discrete/categorical data.
Q2. What is an Ogive? What is it used for?
Ans: An Ogive is a cumulative frequency curve. It is used to find the Median, Quartiles,
Percentiles, and the proportion of values below/above a threshold.
Q3. How do you construct a Pie Chart?
Ans: Calculate each category's percentage. Convert to degrees: angle =
(frequency/total) × 360°. Draw sectors accordingly.
Q4. What is a Frequency Polygon?
Ans: A line graph formed by connecting the midpoints of the tops of histogram bars. It is
useful for comparing two or more frequency distributions on the same graph.
Chapter 3: Grouping Data & Frequency Distribution
3.1 Frequency Distribution
Frequency Distribution: A tabular arrangement that groups raw data into classes
(intervals) and records the number of observations (frequency) falling in each class.
3.2 Key Terminology
• Class Interval / Width (h): The range covered by one class: h = Upper Limit − Lower Limit.
• Class Frequency (f): The count of data values falling in a class.
• Class Midpoint (m): m = (Upper Limit + Lower Limit) / 2. Used in calculations for grouped data.
• Relative Frequency: rf = f / n. Proportion of observations in each class.
• Cumulative Frequency (F): Running sum of frequencies up to and including a class.
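• Class Boundaries: True lower and upper limits that remove the gaps between consecutive classes. When class limits are whole numbers one unit apart (e.g., 30–39, 40–49): Lower boundary = Lower Limit − 0.5; Upper boundary = Upper Limit + 0.5.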
3.3 Steps to Construct a Frequency Distribution
1. Find Range: R = Maximum value − Minimum value.
2. Determine number of classes (k): Use Sturges' Formula: k = 1 + 3.322 × log₁₀(n). Typical range: 5–15 classes.
3. Find class width: h = R / k (round up to a convenient number).
4. Set lower class limit of the first class (usually just below the minimum value).
5. List all class intervals ensuring they are mutually exclusive and exhaustive.
6. Tally the data into the classes and record frequencies.
7. Compute relative frequency and cumulative frequency columns.
Example: Marks of 20 students:
45,52,67,71,38,55,60,72,48,65,80,55,42,75,58,63,70,50,44,68.
Range = 80 − 38 = 42. k = 1 + 3.322 × log₁₀(20) ≈ 5.32, rounded up to 6. h = 42/6 = 7, rounded up to a convenient width of 10.
Classes (upper limit excluded): 30–40 (f=1), 40–50 (f=4), 50–60 (f=5), 60–70 (f=5), 70–80 (f=4), 80–90 (f=1).
Total = 20.
Sturges' Formula: k = 1 + 3.322 × log₁₀(n)
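The construction steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `sturges_classes` and `frequency_table` are made up for these notes, the starting limit 30 and width 10 are the choices made in the example, and each class is assumed to include its lower limit and exclude its upper limit.

```python
import math

marks = [45, 52, 67, 71, 38, 55, 60, 72, 48, 65,
         80, 55, 42, 75, 58, 63, 70, 50, 44, 68]

def sturges_classes(n):
    """Sturges' rule k = 1 + 3.322 * log10(n), rounded up to a whole class."""
    return math.ceil(1 + 3.322 * math.log10(n))

def frequency_table(data, start, width, k):
    """Tally data into k classes [lower, lower + width) beginning at start.

    Assumes every value lies between start and start + k * width.
    Returns rows of (lower, upper, f, relative frequency, cumulative F).
    """
    counts = [0] * k
    for x in data:
        counts[(x - start) // width] += 1
    rows, running = [], 0
    for i, f in enumerate(counts):
        running += f
        lower = start + i * width
        rows.append((lower, lower + width, f, f / len(data), running))
    return rows

k = sturges_classes(len(marks))
table = frequency_table(marks, start=30, width=10, k=k)
for lower, upper, f, rf, F in table:
    print(f"{lower}-{upper}: f={f}, rf={rf:.2f}, F={F}")
```

The frequency column should match the worked example, and the last cumulative frequency equals n = 20.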
Key Examination Q&A
Q1. What is a Frequency Distribution?
Ans: A frequency distribution is a table showing how data values are distributed across
classes/intervals, recording the frequency (count) of observations in each class.
Q2. What is the difference between Relative Frequency and Cumulative
Frequency?
Ans: Relative Frequency = f/n (proportion in a class). Cumulative Frequency = running
total of all frequencies up to a given class.
Q3. State Sturges' Rule for determining number of classes.
Ans: k = 1 + 3.322 × log₁₀(n), where n = total number of observations. This gives a
reasonable starting number of classes (typically 5–15).
Q4. What are class boundaries and why are they used?
Ans: Class boundaries eliminate gaps between classes for continuous data. For whole-number
limits one unit apart: Lower boundary = Lower Limit − 0.5; Upper boundary = Upper Limit + 0.5.
Q5. What is a class midpoint and where is it used?
Ans: Midpoint m = (Upper + Lower Limit)/2. It is used as the representative value of each
class when computing the mean, variance, and other measures for grouped data.
Chapter 4: Measures of Central Tendency
Definition: A Measure of Central Tendency is a single value that represents the center
or typical value of a dataset. It summarizes the data with one representative figure.
Also called: Average or Central Value.
4.1 Arithmetic Mean (AM)
The Arithmetic Mean is the sum of all observations divided by the total number of observations. It uses
every data value and is the most widely used average.
Ungrouped Data: X̄ = ΣX / n
Grouped Data: X̄ = Σ(f × m) / Σf where m = class midpoint
Short-cut (Assumed Mean): X̄ = A + (Σfd / Σf) × h, where d = (m − A)/h
Example: Scores: 5, 10, 15, 20, 25 → X̄ = (5+10+15+20+25)/5 = 75/5 = 15.
• Properties: Sum of deviations from mean = 0: Σ(X − X̄) = 0.
AM is affected by every value, including extreme values (outliers).
AM may not be an actual observation in the dataset.
4.2 Median
The Median is the middle value of an ordered dataset. It divides the distribution into two equal halves
and is resistant to extreme values.
Ungrouped (n odd): M = value at position (n+1)/2 after sorting
Ungrouped (n even): M = average of values at positions n/2 and n/2 + 1
Grouped Data: M = L + [(n/2 − F) / f] × h
Where: L = lower boundary of median class, F = cumulative frequency before median
class, f = frequency of median class, h = class width, n = total frequency.
Example: Sorted values: 3, 7, 8, 11, 15 → n = 5 (odd) → Median = 3rd value = 8.
If values: 3, 7, 8, 11 → n = 4 (even) → M = (7+8)/2 = 7.5.
4.3 Mode
The Mode is the value that occurs most frequently in a dataset. A dataset may be unimodal, bimodal, or
multimodal.
Grouped Data (Czuber's Formula): Z = L + [(f1 − f0) / (2f1 − f0 − f2)] × h
Where: L = lower boundary of modal class, f1 = frequency of modal class, f0 = frequency
of preceding class, f2 = frequency of succeeding class, h = class width.
Example: 4, 7, 7, 9, 12 → Mode = 7 (appears twice). Dataset 2, 3, 3, 5, 5, 8 is bimodal:
Mode = 3 and 5.
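The grouped-data formulas for the mean, median, and mode can be checked with a short Python sketch. The class table below is hypothetical (it does not come from these notes), the function names are illustrative, and the classes are assumed to have equal width with a single modal class.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(10, 20, 5), (20, 30, 8), (30, 40, 12), (40, 50, 9), (50, 60, 6)]

def grouped_mean(classes):
    """Mean = sum(f * m) / sum(f), with m the class midpoint."""
    total = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / total

def grouped_median(classes):
    """Median = L + [(n/2 - F) / f] * h for the first class reaching n/2."""
    n = sum(f for _, _, f in classes)
    cumulative = 0  # F: cumulative frequency before the current class
    for lo, hi, f in classes:
        if cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f

def grouped_mode(classes):
    """Mode = L + [(f1 - f0) / (2*f1 - f0 - f2)] * h for the modal class."""
    i = max(range(len(classes)), key=lambda j: classes[j][2])
    lo, hi, f1 = classes[i]
    f0 = classes[i - 1][2] if i > 0 else 0
    f2 = classes[i + 1][2] if i < len(classes) - 1 else 0
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

print(round(grouped_mean(classes), 2))
print(round(grouped_median(classes), 2))
print(round(grouped_mode(classes), 2))
```

For this table the median class and modal class are both 30–40, so L = 30 and h = 10 in both formulas.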
4.4 Geometric Mean (GM)
The Geometric Mean is the n-th root of the product of n values. It is appropriate for data involving ratios,
growth rates, and percentages.
GM = (X1 × X2 × ... × Xn)^(1/n)
Using Logarithms: log(GM) = [Σ log(X)] / n → GM = antilog(Σlog(X)/n)
Grouped Data: log(GM) = Σ[f × log(m)] / Σf
Example: 2, 4, 8 → GM = (2 × 4 × 8)^(1/3) = 64^(1/3) = 4.
4.5 Harmonic Mean (HM)
The Harmonic Mean is the reciprocal of the arithmetic mean of reciprocals. Best used for averaging
rates and speeds.
HM = n / Σ(1/X)
Grouped Data: HM = Σf / Σ(f/m)
Example: A car travels the first half of the distance at 60 km/h and the second half at
40 km/h. Average speed = HM = 2/(1/60 + 1/40) = 2 × 120/5 = 48 km/h.
4.6 Relationship: AM, GM, HM
For positive numbers: AM ≥ GM ≥ HM
Also: GM² = AM × HM (for two values)
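A quick numerical check of both relationships, using Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The values 2, 4, 8 and the speeds 60, 40 are reused from the examples above; this is a sketch for checking the arithmetic, not a proof.

```python
import statistics

values = [2, 4, 8]
am = statistics.mean(values)
gm = statistics.geometric_mean(values)
hm = statistics.harmonic_mean(values)
print(f"AM = {am:.3f}, GM = {gm:.3f}, HM = {hm:.3f}")

# Two-value identity GM^2 = AM * HM, checked on the speeds 60 and 40
pair = [60, 40]
am2 = statistics.mean(pair)
gm2 = statistics.geometric_mean(pair)
hm2 = statistics.harmonic_mean(pair)
print(f"GM^2 = {gm2 ** 2:.3f}, AM x HM = {am2 * hm2:.3f}")
```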
4.7 Comparative Summary
Average | Formula | Advantages | Disadvantages | Best Used For
Arithmetic Mean | ΣX/n | Simple; uses all values; algebraic | Distorted by outliers | General-purpose averaging
Median | Middle value | Not affected by outliers | Ignores many values | Skewed data; income distribution
Mode | Most frequent value | Easy to understand; real value | May not be unique | Most common value; fashion, business
Geometric Mean | n-th root of product | Suitable for growth rates | Undefined for zero/negative | Growth rates, ratios, index numbers
Harmonic Mean | n / Σ(1/X) | Best for rates | Undefined for zero values | Speeds, rates, efficiency ratios
Key Examination Q&A
Q1. Define Arithmetic Mean and state its properties.
Ans: AM = ΣX/n. Properties: (1) Σ(X−X̄) = 0; (2) Σ(X−X̄)² is minimum; (3) affected by all
values; (4) may not be an actual observation; (5) unique for a dataset.
Q2. How is the Median determined for grouped data?
Ans: Use: M = L + [(n/2 − F)/f] × h, where L = lower class boundary of median class, F =
cumulative frequency before median class, f = median class frequency, h = class width.
Q3. Explain the relationship AM ≥ GM ≥ HM.
Ans: For any set of positive numbers, the Arithmetic Mean is always ≥ Geometric Mean ≥
Harmonic Mean. Equality holds only when all values are equal.
Q4. When would you prefer Median over Mean?
Ans: When data is skewed (e.g., income distribution) or contains extreme outliers, the
Median is preferred as it is not affected by extreme values.
Q5. What is the empirical relationship between Mean, Median, and Mode?
Ans: For a moderately skewed distribution: Mode ≈ 3 × Median − 2 × Mean (Karl
Pearson's empirical formula).
Q6. What are the uses of Geometric Mean?
Ans: GM is used for: (1) calculating average growth rates, (2) index numbers, (3)
averaging ratios and percentages, (4) compound interest problems.
Chapter 5: Measures of Dispersion
Definition: Dispersion (or Variation) refers to the extent to which data values are spread
out around the central value. A low dispersion indicates values are close to the mean; a
high dispersion indicates they are widely scattered.
5.1 Range
R = X_max − X_min
Simplest measure. Sensitive to extreme values. Used in quality control (control charts).
Example: Values: 5, 10, 15, 20, 25 → R = 25 − 5 = 20.
5.2 Mean Deviation (MD)
Average of absolute deviations from a central value (usually Mean or Median).
Ungrouped: MD = Σ|X − Ā| / n
Grouped: MD = Σ f|m − Ā| / Σf
Example: Values: 2, 4, 6. X̄ = 4. MD = (|2−4|+|4−4|+|6−4|)/3 = (2+0+2)/3 = 1.33.
5.3 Variance and Standard Deviation
Variance is the average of squared deviations from the mean. Standard Deviation (SD) is the positive
square root of the variance — it has the same unit as the data.
Population Variance: σ² = Σ(X − μ)² / N
Sample Variance: s² = Σ(X − X̄)² / (n − 1) (Bessel's correction)
Population SD: σ = √[Σ(X − μ)² / N]
Sample SD: s = √[Σ(X − X̄)² / (n − 1)]
Grouped Data SD: σ = √[Σ f(m − X̄)² / Σf]
Computational Formula: σ² = [ΣX² / n] − (X̄)² or s² = [ΣX² − n(X̄)²] / (n−1)
Note: The divisor (n−1) for sample variance corrects for the bias introduced when
estimating a population parameter from a sample — this is called Bessel's Correction.
Example: Values: 2, 4, 6. X̄ = 4.
Population: σ² = [(2−4)²+(4−4)²+(6−4)²]/3 = 8/3 ≈ 2.67 → σ ≈ 1.63.
Sample: s² = 8/(3−1) = 4 → s = 2.
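The same example can be reproduced with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n − 1. The variable names are illustrative; the last lines check the computational formula against the definition.

```python
import math
import statistics

values = [2, 4, 6]

pop_var = statistics.pvariance(values)     # divisor N
sample_var = statistics.variance(values)   # divisor n - 1 (Bessel's correction)
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

# Computational formula: sigma^2 = (sum of X^2) / n - mean^2
n = len(values)
mean = sum(values) / n
shortcut_var = sum(x * x for x in values) / n - mean ** 2

print(f"population: variance = {pop_var:.2f}, SD = {pop_sd:.2f}")
print(f"sample:     variance = {sample_var:.2f}, SD = {sample_sd:.2f}")
```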
5.4 Coefficient of Variation (CV)
A relative measure of dispersion used to compare variability between datasets with different units or
scales.
CV = (σ / X̄) × 100%
Example: Group A: X̄ = 50, σ = 10 → CV = 20%.
Group B: X̄ = 200, σ = 30 → CV = 15%. Group B is relatively less variable.
5.5 Chebyshev's Rule
For any distribution (regardless of shape), at least (1 − 1/k²) of all values lie within k
standard deviations of the mean, for k > 1.
P(|X − μ| < kσ) ≥ 1 − 1/k²
k | Chebyshev's Guarantee | Empirical Rule (Normal)
k = 1 | No guarantee | ≈ 68%
k = 1.5 | At least 55.6% | ≈ 86.6%
k = 2 | At least 75% | ≈ 95%
k = 3 | At least 88.9% | ≈ 99.7%
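The two columns of the table can be regenerated with a small sketch. The Chebyshev column is 1 − 1/k²; the Normal column uses the standard identity P(|Z| < k) = erf(k/√2), which is not stated in these notes and is brought in here only for comparison. The function names are illustrative.

```python
import math

def chebyshev_bound(k):
    """Minimum proportion within k standard deviations (meaningful for k > 1)."""
    return 1 - 1 / k ** 2

def normal_within(k):
    """Proportion of a Normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1.5, 2, 3):
    print(f"k = {k}: Chebyshev >= {chebyshev_bound(k):.1%}, "
          f"Normal = {normal_within(k):.1%}")
```

Chebyshev's bound is always the weaker statement because it makes no assumption about the shape of the distribution.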
5.6 Quartile Deviation (Semi-Interquartile Range)
QD = (Q3 − Q1) / 2 | Coefficient of QD = (Q3 − Q1) / (Q3 + Q1)
Key Examination Q&A
Q1. Define Standard Deviation and explain its importance.
Ans: SD = √[Σ(X−X̄)²/n]. It measures the average spread of data from the mean. It has
the same unit as the data, is used in most statistical tests, and is the basis for the normal
distribution and CV.
Q2. Why is (n−1) used in sample variance instead of n?
Ans: Because sample variance using n underestimates the population variance. Using
(n−1) — Bessel's correction — provides an unbiased estimate of the population variance.
Q3. What is the Coefficient of Variation and when is it useful?
Ans: CV = (σ/X̄) × 100%. It is a unit-free relative measure. It is used to compare
variability between two datasets with different means or different units (e.g., comparing
exam scores vs. salaries).
Q4. State Chebyshev's theorem. Apply it for k = 2.
Ans: For any distribution, at least (1−1/k²) of values lie within k SDs of the mean. For
k=2: at least 1−1/4 = 75% of data lies within μ ± 2σ.
Q5. What is the relationship between Variance and Standard Deviation?
Ans: SD = √Variance. Variance is in squared units; SD is in the original units of
measurement, making it easier to interpret.
Chapter 6: Skewness and Kurtosis
6.1 Skewness
Skewness: A measure of the asymmetry of a frequency distribution about its mean. A
symmetric distribution has skewness = 0.
Type | Relationship | Tail Direction | Example
Symmetric (No Skew) | Mean = Median = Mode | Both equal | Normal distribution
Positively Skewed (Right) | Mode < Median < Mean | Long right tail | Income distribution
Negatively Skewed (Left) | Mean < Median < Mode | Long left tail | Exam scores (easy test)
6.2 Measures of Skewness
Karl Pearson's (1st): Sk = (Mean − Mode) / SD
Karl Pearson's (2nd): Sk = 3(Mean − Median) / SD (Range: −3 to +3)
Bowley's (Quartile): Sk = (Q3 + Q1 − 2Q2) / (Q3 − Q1) (Range: −1 to +1)
Moment-based: β1 = μ3² / μ2³ | γ1 = μ3 / μ2^(3/2) = ±√β1, taking the sign of μ3 (positive for right skew)
If Sk > 0: right-skewed (positive). If Sk < 0: left-skewed (negative). If Sk = 0: symmetric.
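The three kinds of measure can be compared on one dataset. The sample below is hypothetical and deliberately right-skewed; quartiles come from `statistics.quantiles` with its default "exclusive" method, which is only one of several quartile conventions, and the population SD is used in Pearson's formula as an assumption.

```python
import statistics

data = [2, 3, 4, 5, 6, 8, 20]  # hypothetical right-skewed sample

mean = statistics.mean(data)
median = statistics.median(data)
sd = statistics.pstdev(data)  # population SD (assumed convention)

# Karl Pearson's second coefficient
pearson_2nd = 3 * (mean - median) / sd

# Bowley's quartile coefficient
q1, q2, q3 = statistics.quantiles(data, n=4)
bowley = (q3 + q1 - 2 * q2) / (q3 - q1)

# Moment coefficient gamma1 = mu3 / mu2^(3/2)
n = len(data)
mu2 = sum((x - mean) ** 2 for x in data) / n
mu3 = sum((x - mean) ** 3 for x in data) / n
gamma1 = mu3 / mu2 ** 1.5

print(f"Pearson = {pearson_2nd:.3f}, Bowley = {bowley:.3f}, gamma1 = {gamma1:.3f}")
```

All three come out positive for this sample, in line with the long right tail, although their magnitudes are not directly comparable.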
6.3 Kurtosis
Kurtosis: A measure of the peakedness (or flatness) of a frequency distribution relative
to the Normal distribution.
β2 = μ4 / μ2² | Excess Kurtosis (γ2) = β2 − 3
Type | β2 Value | Excess Kurtosis | Description | Tail Nature
Mesokurtic | β2 = 3 | γ2 = 0 | Normal distribution; standard peak | Normal tails
Leptokurtic | β2 > 3 | γ2 > 0 | Sharper, higher peak than normal | Heavy/fat tails
Platykurtic | β2 < 3 | γ2 < 0 | Flatter, lower peak than normal | Thin/light tails
Key Examination Q&A
Q1. Define Skewness and distinguish between positive and negative skewness.
Ans: Skewness measures asymmetry. Positive skew: tail on right, Mean > Median >
Mode. Negative skew: tail on left, Mean < Median < Mode.
Q2. What are the different measures of skewness?
Ans: (1) Karl Pearson's: Sk = (Mean−Mode)/SD or 3(Mean−Median)/SD. (2) Bowley's:
Sk = (Q3+Q1−2Q2)/(Q3−Q1). (3) Moment-based: γ1 = ±√β1, with the sign of μ3.
Q3. Define Kurtosis. What are its three types?
Ans: Kurtosis measures the peakedness. Mesokurtic (β2=3, normal); Leptokurtic (β2>3,
sharper peak, fatter tails); Platykurtic (β2<3, flatter peak, thinner tails).
Q4. What is the difference between Skewness and Kurtosis?
Ans: Skewness measures asymmetry (left-right balance); Kurtosis measures
peakedness (flat vs. peaked). Both describe the shape of a distribution.
Chapter 7: Regression and Correlation
7.1 Correlation
Correlation: A statistical measure that describes the strength and direction of a linear
relationship between two quantitative variables. Correlation does not imply causation.
• Positive Correlation: Both variables move in the same direction (r > 0).
• Negative Correlation: Variables move in opposite directions (r < 0).
• No Correlation: No linear relationship (r ≈ 0).
7.2 Karl Pearson's Coefficient of Correlation
r = Σ(X−X̄)(Y−Ȳ) / √[Σ(X−X̄)² × Σ(Y−Ȳ)²]
Computational Formula: r = [nΣXY − ΣXΣY] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]}
Value of r | Interpretation
r = +1 | Perfect positive linear relationship
0.7 ≤ r < 1 | Strong positive relationship
0.4 ≤ r < 0.7 | Moderate positive relationship
0 < r < 0.4 | Weak positive relationship
r = 0 | No linear relationship
−0.4 < r < 0 | Weak negative relationship
−0.7 < r ≤ −0.4 | Moderate negative relationship
−1 < r ≤ −0.7 | Strong negative relationship
r = −1 | Perfect negative linear relationship
7.3 Spearman's Rank Correlation Coefficient
Used when data is in ordinal form or when the relationship is monotonic but not necessarily linear.
rs = 1 − [6ΣD² / n(n² − 1)], where D = rank difference for each pair
Example: Two judges rank 5 contestants. D values: 0, 1, −1, 1, −1.
ΣD² = 0+1+1+1+1 = 4. rs = 1 − (6×4)/(5×24) = 1 − 0.20 = 0.80 (Strong positive
agreement).
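The calculation can be sketched as follows. The two rank lists are hypothetical: they were chosen only so that their differences are the D values 0, 1, −1, 1, −1 used in the example, and the formula assumes there are no tied ranks.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rs = 1 - 6 * sum(D^2) / (n * (n^2 - 1)); assumes no ties."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks giving D = 0, 1, -1, 1, -1
judge_a = [1, 3, 2, 5, 4]
judge_b = [1, 2, 3, 4, 5]

rs = spearman_rho(judge_a, judge_b)
print(f"rs = {rs:.2f}")
```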
7.4 Simple Linear Regression
Regression: A method that models the relationship between a dependent variable (Y)
and one or more independent variables (X) to predict future values.
Regression Line of Y on X: Ŷ = a + bX
Regression Coefficient (slope) b: b = Σ(X−X̄)(Y−Ȳ) / Σ(X−X̄)² = [nΣXY − ΣXΣY] / [nΣX² − (ΣX)²]
Intercept a: a = Ȳ − b X̄
Regression Line of X on Y: X̂ = c + dY, where d = r × (σx/σy) and c = X̄ − d Ȳ
Least Squares Principle: The regression line minimizes the sum of squared residuals:
Σ(Y − Ŷ)² = minimum.
The two regression lines always intersect at the point (X̄, Ȳ).
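A minimal least-squares sketch for both regression lines, on hypothetical data (the X and Y values are not from these notes). The slope of X on Y is computed as Σ(X−X̄)(Y−Ȳ) / Σ(Y−Ȳ)², which is algebraically the same as r × (σx/σy); the function name and the returned dictionary are illustrative choices.

```python
import math

def regression_lines(xs, ys):
    """Fit Y on X (a + bX) and X on Y (c + dY) by least squares."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx              # slope of Y on X
    a = y_bar - b * x_bar      # intercept of Y on X
    d = sxy / syy              # slope of X on Y
    c = x_bar - d * y_bar      # intercept of X on Y
    r = sxy / math.sqrt(sxx * syy)
    return {"a": a, "b": b, "c": c, "d": d, "r": r,
            "x_bar": x_bar, "y_bar": y_bar}

xs = [1, 2, 3, 4, 5]   # hypothetical X values
ys = [2, 4, 5, 4, 5]   # hypothetical Y values
fit = regression_lines(xs, ys)
print(f"Y on X: Y = {fit['a']:.2f} + {fit['b']:.2f} X")
print(f"X on Y: X = {fit['c']:.2f} + {fit['d']:.2f} Y")
print(f"r = {fit['r']:.4f}")
```

Substituting X̄ into the first line returns Ȳ, and substituting Ȳ into the second returns X̄, which is the intersection property stated above. The product b × d equals r².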
7.5 Coefficient of Determination (R²)
R² = r² (ranges from 0 to 1)
R² represents the proportion of total variation in Y explained by X. R² = 0.81 means 81% of variation in
Y is explained by X.
7.6 Correlation Ratio, Partial & Multiple Correlation
• Correlation Ratio (η): Measures non-linear association between variables.
• Partial Correlation: Correlation between two variables while holding other variables constant.
r12.3 = partial correlation between X1 and X2, controlling for X3.
• Multiple Correlation (R): Correlation between one dependent variable and two or more
independent variables.
• Multiple Regression: Y = b0 + b1X1 + b2X2 + ... + bkXk.
Key Examination Q&A
Q1. Distinguish between Correlation and Regression.
Ans: Correlation measures the strength and direction of a linear relationship (−1 ≤ r ≤ 1).
Regression establishes a mathematical equation to predict the value of one variable from
another.
Q2. What is the Coefficient of Determination? Interpret R² = 0.64.
Ans: R² = r². It is the proportion of variation in Y explained by X. R² = 0.64 means 64% of
the variation in Y is explained by the regression on X.
Q3. State the properties of Regression Coefficients.
Ans: (1) Both byx and bxy have the same sign. (2) r = ±√(byx × bxy), taking the common
sign of the two coefficients. (3) AM of |byx| and |bxy| ≥ |r|. (4) byx = r × (σy/σx).
Q4. What is Spearman's Rank Correlation? When is it preferred?
Ans: rs = 1 − 6ΣD²/n(n²−1). Preferred when data is ordinal, contains outliers, or the
relationship is non-linear but monotonic.
Q5. How do you interpret r = −0.85?
Ans: Strong negative linear relationship: as X increases, Y decreases substantially.
About 72.25% (R²) of Y's variation is explained by X.
Q6. What is Partial Correlation?
Ans: It is the correlation between two variables after removing the linear effect of a third
variable. Used to find the 'pure' relationship between two variables.
Chapter 8: Demography
Demography: The scientific study of the size, structure, composition, distribution, and
changes of human populations, including births, deaths, migration, and aging.
8.1 Mortality Rates
Crude Death Rate (CDR): CDR = (Number of Deaths / Mid-year Population) × 1000
Age-Specific Death Rate (ASDR): ASDR = (Deaths in age group / Mid-year pop. in age group) × 1000
Standardized Death Rate: SDR = Σ(Standard pop. proportion × ASDR)
8.2 Fertility Rates
Crude Birth Rate (CBR): CBR = (Live Births / Mid-year Population) × 1000
General Fertility Rate (GFR): GFR = (Live Births / Women aged 15–49) × 1000
Age-Specific Fertility Rate (ASFR): ASFR = (Births to women aged x–x+n / Women aged x–x+n) × 1000
Total Fertility Rate (TFR): TFR = 5 × Σ(ASFR) (sum over all 5-year age groups 15–49; divide by 1000 to express per woman when ASFR is per 1000 women)
Gross Reproduction Rate (GRR): GRR = 5 × Σ(Female ASFR) (female births only; same per-1000 convention as TFR)
Interpretation: TFR = 2.1 is the replacement-level fertility — the level at which a
population exactly replaces itself from one generation to the next.
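A small sketch of the TFR calculation. The age-specific rates below are hypothetical and are expressed per 1000 women, so the sum is divided by 1000 to give births per woman; the function name and the `width` parameter are illustrative.

```python
def total_fertility_rate(asfr_per_1000, width=5):
    """TFR per woman from ASFRs given per 1000 women in equal-width age groups."""
    return width * sum(asfr_per_1000) / 1000

# Hypothetical ASFRs for ages 15-19, 20-24, ..., 45-49 (per 1000 women)
asfr = [40, 120, 130, 80, 40, 10, 0]

tfr = total_fertility_rate(asfr)
print(f"TFR = {tfr:.2f} births per woman")
```

These made-up rates were chosen so that the result lands on the replacement level of 2.1 discussed above.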
8.3 Population Growth
Rate of Natural Increase (RNI): RNI = CBR − CDR
Geometric Growth: Pt = P0 × (1 + r)^t
Exponential Growth: Pt = P0 × e^(rt)
• Migration: Net Migration = In-migration − Out-migration. Positive net migration = population gain
from migration.
• Nuptiality: Study of marriage patterns. Crude Marriage Rate = (Marriages / Population) × 1000.
• Bangladesh: Population grew from ~75 million (1971) to over 170 million (2024). One of the
densest countries in the world.
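The two growth models can be inverted to find the average annual growth rate implied by two population figures. The sketch below uses the approximate Bangladesh figures quoted above (about 75 million in 1971 and 170 million in 2024); since those are rounded, the resulting rates are only rough, and the function names are illustrative.

```python
import math

def geometric_rate(p0, pt, years):
    """Annual rate r solving pt = p0 * (1 + r) ** years."""
    return (pt / p0) ** (1 / years) - 1

def exponential_rate(p0, pt, years):
    """Continuous rate r solving pt = p0 * exp(r * years)."""
    return math.log(pt / p0) / years

p0, pt, years = 75.0, 170.0, 2024 - 1971  # populations in millions (approximate)
r_geo = geometric_rate(p0, pt, years)
r_exp = exponential_rate(p0, pt, years)
print(f"geometric r = {r_geo:.2%}, exponential r = {r_exp:.2%}")
```

Both models give roughly 1.5% per year for these figures; the geometric rate is slightly larger because it compounds once a year rather than continuously.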
8.4 Life Table
• lx: Number of survivors at age x (starting from l0 = 100,000).
• dx: Deaths between age x and x+1: dx = lx − lx+1.
• qx: Probability of dying between x and x+1: qx = dx/lx.
• ex: Life expectancy at age x: average number of years remaining.
Key Examination Q&A
Q1. Define CDR and CBR. Why is GFR preferred over CBR?
Ans: CDR = (Deaths/Population)×1000. CBR = (Live Births/Population)×1000. GFR is
more refined than CBR as it uses only the female reproductive population (15–49),
making it a better fertility indicator.
Q2. What is TFR and what does TFR = 2.1 signify?
Ans: TFR = 5 × Σ(ASFR). TFR = 2.1 is replacement-level fertility — the population
replaces itself without increasing or decreasing in the long run.
Q3. Distinguish between GRR and NRR.
Ans: GRR considers only female births; NRR adjusts GRR for female mortality before
reaching reproductive age. NRR = 1 means population is replacing itself exactly.
Q4. What is the Demographic Transition Theory?
Ans: A model describing the shift from high birth & death rates (pre-industrial) → falling
death rates → falling birth rates → low birth & death rates (post-industrial). Bangladesh is
in Stage 3.
Chapter 9: Index Number
Index Number: A statistical measure designed to show changes in a variable or group
of related variables with respect to time, geographic location, or other characteristics.
The base period value = 100.
9.1 Types of Index Numbers
• Price Index: Measures change in price level (e.g., Consumer Price Index — CPI).
• Quantity Index: Measures change in quantity produced or consumed.
• Value Index: Measures change in total value (price × quantity).
9.2 Methods of Construction
Simple Price Relative: P01 = (P1 / P0) × 100
Simple Aggregate Price Index: P01 = (ΣP1 / ΣP0) × 100
Laspeyres Price Index (base-year weighted): P01 = [Σ(P1 Q0) / Σ(P0 Q0)] × 100
Paasche Price Index (current-year weighted): P01 = [Σ(P1 Q1) / Σ(P0 Q1)] × 100
Fisher's Ideal Index: P01 = √(Laspeyres × Paasche)
Weighted Average of Price Relatives: P01 = Σ(W × P) / ΣW, where P = (P1/P0) × 100 is the price relative and W = base-year value (P0 Q0)
Marshall-Edgeworth Index: P01 = {Σ[P1(Q0+Q1)] / Σ[P0(Q0+Q1)]} × 100
9.3 Tests for Index Numbers
Test | Formula | Laspeyres | Paasche | Fisher
Time Reversal Test | P01 × P10 = 1 | Fails | Fails | Passes
Factor Reversal Test | P01 × Q01 = V01 | Fails | Fails | Passes
Circular Test | P01 × P12 × P20 = 1 | Fails | Fails | Fails
Why Fisher's is 'Ideal': Fisher's Index passes both the Time Reversal
and Factor Reversal Tests, making it theoretically the best index number
formula.
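The weighted formulas of Section 9.2 and the Time Reversal Test can be checked numerically. The prices and quantities below are hypothetical, the functions return ratios (multiply by 100 for the usual index form), and the argument order `(p_from, q_from, p_to, q_to)` is an illustrative convention so that the two periods can be swapped.

```python
import math

def laspeyres(p_from, q_from, p_to, q_to):
    """Price ratio weighted by quantities of the 'from' (base) period."""
    num = sum(p * q for p, q in zip(p_to, q_from))
    den = sum(p * q for p, q in zip(p_from, q_from))
    return num / den

def paasche(p_from, q_from, p_to, q_to):
    """Price ratio weighted by quantities of the 'to' (current) period."""
    num = sum(p * q for p, q in zip(p_to, q_to))
    den = sum(p * q for p, q in zip(p_from, q_to))
    return num / den

def fisher(p_from, q_from, p_to, q_to):
    """Geometric mean of the Laspeyres and Paasche ratios."""
    return math.sqrt(laspeyres(p_from, q_from, p_to, q_to)
                     * paasche(p_from, q_from, p_to, q_to))

# Hypothetical data for three goods
p0, q0 = [10, 8, 5], [30, 15, 20]   # base period
p1, q1 = [15, 10, 5], [20, 15, 30]  # current period

for name, index in (("Laspeyres", laspeyres), ("Paasche", paasche), ("Fisher", fisher)):
    forward = index(p0, q0, p1, q1)
    backward = index(p1, q1, p0, q0)
    print(f"{name}: P01 = {forward * 100:.2f}, P01 x P10 = {forward * backward:.4f}")
```

Only the Fisher row gives a product of 1, up to floating-point rounding, which is what passing the Time Reversal Test means.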
9.4 Consumer Price Index (CPI)
• CPI measures the average change in prices paid by consumers for goods and services over time.
• Real Wage = (Nominal Wage / CPI) × 100.
• Purchasing Power of Money = (1/CPI) × 100.
9.5 Shifting & Splicing Base Year
Shifting Base: New Index = (Old Index / Index at New Base Year) × 100
Key Examination Q&A
Q1. Define Index Number and state its uses.
Ans: An index number is a statistical tool to measure relative changes in variables over
time. Uses: measuring inflation (CPI), comparing living standards, measuring economic
growth, deflating nominal values.
Q2. What are Laspeyres and Paasche Index Numbers? Compare them.
Ans: Laspeyres: uses base-year quantities (tends to overestimate price change); Paasche: uses
current-year quantities (tends to underestimate price change). Fisher's Ideal = geometric mean
of both, which largely offsets these opposite biases.
Q3. Why is Fisher's Index called 'Ideal'?
Ans: Because it satisfies both the Time Reversal Test (P01×P10 = 1) and the Factor
Reversal Test (P01×Q01 = V01), making it the most accurate composite index.
Q4. What is the Time Reversal Test?
Ans: P01 × P10 = 1. If prices go from period 0 to 1, and then back from 1 to 0, the index
should return to 1. Fisher's index satisfies this; Laspeyres and Paasche do not.
Chapter 10: Time Series Analysis
Time Series: A set of observations recorded at successive points in time (equally or
unequally spaced). Used for understanding past behavior and forecasting future values.
10.1 Components of a Time Series
Component | Symbol | Description | Duration | Example
Secular Trend | T | Long-term upward or downward movement | Years/decades | Population growth, GDP
Seasonal Variation | S | Regular periodic fluctuations within a year | < 1 year | Eid sales, monsoon crops
Cyclical Variation | C | Medium-term oscillations around trend | 2–10 years | Business cycle, recession
Irregular (Random) | I | Unpredictable, erratic fluctuations | Instantaneous | Natural disasters, strikes
10.2 Models of Time Series
Additive Model: Y = T + S + C + I
Multiplicative Model: Y = T × S × C × I
The Multiplicative Model is more commonly used in practice when the seasonal
amplitude increases proportionally with the trend level.
10.3 Measurement of Secular Trend
(a) Moving Average Method
3-Year Moving Average: MA₃ = (Yt-1 + Yt + Yt+1) / 3
Simple, but loses data at both ends. For even-period MA, centering is required.
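A minimal sketch of a centered moving average for an odd window. The series is illustrative, and the function name and `window` parameter are assumptions made for this example; even windows, which need the extra centering step, are not handled.

```python
def moving_average(series, window=3):
    """Centered moving average for an odd window; end values are dropped."""
    if window % 2 == 0:
        raise ValueError("this sketch handles odd windows only")
    half = window // 2
    return [sum(series[i - half:i + half + 1]) / window
            for i in range(half, len(series) - half)]

sales = [10, 12, 13, 15, 18]  # illustrative yearly values
ma3 = moving_average(sales, window=3)
print([round(v, 2) for v in ma3])
```

Five observations give only three smoothed values, which shows the loss of data at both ends.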
(b) Least Squares Method
Trend Line: Ŷt = a + bt
Normal Equations: Σy = na + bΣt and Σty = aΣt + bΣt²
If t is coded so that Σt = 0: b = Σ(ty) / Σt² | a = Σy / n
Example: Sales data (t coded: −2,−1,0,1,2), y: 10,12,13,15,18.
a = Σy/n = 68/5 = 13.6. b = Σty/Σt² = (−20 − 12 + 0 + 15 + 36)/10 = 19/10 = 1.9. Trend line: Ŷt = 13.6 + 1.9t.
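Computing the sums directly is a useful check on the signs of the t × y products. The sketch below uses the coded t values and sales figures from the example; the forecast for t = 3 is an added illustration and assumes the linear trend continues for one more period.

```python
t = [-2, -1, 0, 1, 2]        # coded time, so that sum(t) = 0
y = [10, 12, 13, 15, 18]     # sales

sum_ty = sum(ti * yi for ti, yi in zip(t, y))
sum_t2 = sum(ti ** 2 for ti in t)

a = sum(y) / len(y)          # intercept: a = sum(y) / n
b = sum_ty / sum_t2          # slope: b = sum(t*y) / sum(t^2)

def trend(ti):
    """Trend value a + b*t for coded time ti."""
    return a + b * ti

print(f"sum(ty) = {sum_ty}, sum(t^2) = {sum_t2}")
print(f"trend line: Y = {a} + {b} t")
print(f"forecast for t = 3: {trend(3):.1f}")
```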
(c) Measurement of Seasonal Variation
• Method of Simple Averages: Average the values for each season; divide each seasonal average by the grand mean and multiply by 100 to get the Seasonal Index.
• Ratio-to-Moving-Average Method: Most common: divide each Y value by the corresponding
trend value (from moving average) to isolate seasonal component.
Key Examination Q&A
Q1. Define Time Series and name its four components.
Ans: A time series is a set of data points recorded over time. Components: (1) Secular
Trend (T), (2) Seasonal Variation (S), (3) Cyclical Variation (C), (4) Irregular Variation (I).
Q2. What is the difference between Additive and Multiplicative models?
Ans: Additive: Y = T+S+C+I (constant seasonal amplitude). Multiplicative: Y = T×S×C×I
(seasonal amplitude changes with trend). Multiplicative is more realistic for economic
data.
Q3. Why is the Moving Average method used?
Ans: To smooth out short-term fluctuations and reveal the underlying long-term trend by
averaging consecutive values over a fixed window.
Q4. What are the advantages of the Least Squares method for trend?
Ans: (1) Objective and unique; (2) minimizes sum of squared errors; (3) allows
forecasting; (4) uses all data points; (5) provides a mathematical equation for the trend
line.
Q5. What is a Seasonal Index?
Ans: A numerical value showing the degree to which a particular season (month/quarter)
is above or below the average (= 100). An index of 120 means 20% above average for
that season.
Chapter 11: Sampling
Population (N): The complete collection of all elements/individuals about which
information is desired.
Sample (n): A subset of the population selected for observation and analysis.
Sampling: The process of selecting a sample from the population in order to make
inferences about the population.
11.1 Census vs. Sampling
Basis | Census | Sampling
Coverage | Every element of population | Selected elements only
Cost | Very expensive | Less expensive
Time | Very time-consuming | Faster
Accuracy | No sampling error | Sampling error present
Suitability | Small populations | Large or infinite populations
Destructive testing | Not feasible | Feasible
11.2 Probability Sampling Methods
• Simple Random Sampling (SRS): Every unit has an equal probability of selection. Methods:
Lottery, Random Number Tables, Computer-generated random numbers.
• Stratified Random Sampling: Population divided into internally homogeneous strata; random sample
drawn from each stratum proportionally or equally. Usually more precise than SRS of the same size when the strata differ from one another.
• Systematic Sampling: Select every k-th unit after a random start, where the sampling interval k = N/n (population size / sample size).
• Cluster Sampling: Population divided into clusters (e.g., geographic areas); entire clusters
selected randomly. Cost-effective for geographically spread populations.
• Multistage Sampling: Successive stages of sampling (e.g., first select districts, then villages,
then households).
11.3 Non-Probability Sampling Methods
• Convenience Sampling: Units selected based on easy accessibility. Prone to bias.
• Purposive (Judgment) Sampling: Researcher uses judgment to select units believed to be
most representative.
• Quota Sampling: Predetermined number of units selected from each subgroup; no
randomization.
• Snowball Sampling: Existing subjects recruit new subjects. Useful for hidden populations.
11.4 Sampling Error vs. Non-Sampling Error
Aspect | Sampling Error | Non-Sampling Error
Cause | Difference between sample and population | Data collection/processing mistakes
Effect of larger n | Decreases | Does not necessarily decrease
Measurability | Can be measured | Difficult to measure
Types | Biased, unbiased errors | Coverage, response, processing errors
11.5 Sample Size Determination
For estimating mean (σ known): n = (Zα/2 × σ / E)²
For estimating proportion: n = Zα/2² × p(1−p) / E²
With finite population correction: n_adjusted = n / [1 + (n−1)/N]
E = maximum allowable error (margin of error). For 95% confidence: Zα/2 = 1.96. For
99%: Zα/2 = 2.576.
Example: Estimate population mean with 95% confidence, σ = 15, error E = 3.
n = (1.96 × 15 / 3)² = (9.8)² = 96.04 → round up to n = 97.
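The three sample-size formulas of this section can be checked with a short Python sketch. The function names are illustrative, and the inputs for the proportion case (p = 0.5, E = 0.05) and the population size N = 1000 are assumed values chosen only to show the arithmetic.

```python
import math

def sample_size_mean(z, sigma, error):
    """n = (z * sigma / E)^2, rounded up to a whole unit."""
    return math.ceil((z * sigma / error) ** 2)

def sample_size_proportion(z, p, error):
    """n = z^2 * p(1 - p) / E^2, rounded up to a whole unit."""
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

def apply_fpc(n, population_size):
    """Finite population correction: n / [1 + (n - 1) / N], rounded up."""
    return math.ceil(n / (1 + (n - 1) / population_size))

n_mean = sample_size_mean(1.96, 15, 3)            # worked example above: 97
n_prop = sample_size_proportion(1.96, 0.5, 0.05)  # assumed p and E
n_fpc = apply_fpc(n_mean, 1000)                   # assumed N = 1000
```

The result is always rounded up, because rounding down would give a margin of error slightly larger than E.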
Key Examination Q&A
Q1. What is the difference between Probability and Non-Probability Sampling?
Ans: In probability sampling, each unit has a known, non-zero chance of selection,
allowing valid statistical inferences. In non-probability sampling, selection is subjective —
suitable for exploratory research but not for statistical inference.
Q2. Compare Stratified and Cluster Sampling.
Ans: Stratified: population split into homogeneous groups; sample from every group
(better precision). Cluster: population split into heterogeneous groups; all members of
selected groups observed (cheaper, less precise).
Q3. What is Systematic Sampling? State its advantages and disadvantages.
Ans: Select every k-th unit (k = N/n). Advantages: Simple, spread throughout population.
Disadvantage: If population has periodicity matching k, bias occurs.
Q4. Define Sampling Error. How can it be reduced?
Ans: Sampling error = discrepancy between sample statistic and population parameter.
Reduce by: (1) increasing sample size; (2) using appropriate sampling method (e.g.,
stratified); (3) using probability sampling.
Q5. How is sample size determined for estimating a mean?
Ans: n = (Zα/2 × σ / E)². Requires: confidence level (Z), population SD (σ), and
acceptable margin of error (E).
PART - II (Marks: 100)
Chapter 1: Concept of Probability
Probability: A numerical measure of the likelihood that a specific event will occur,
ranging from 0 (impossible) to 1 (certain).
1.1 Basic Terminology
• Random Experiment: An experiment whose outcome cannot be predicted with certainty (e.g.,
tossing a coin, rolling a die).
• Sample Space (S): The set of all possible outcomes. Example: S = {H, T} for a coin toss.
• Event (A): Any subset of the sample space. Simple event = one outcome; Compound event =
multiple outcomes.
• Mutually Exclusive Events: Events that cannot occur simultaneously: P(A ∩ B) = 0.
• Collectively Exhaustive Events: Events that cover the entire sample space: P(A₁ ∪ A₂ ∪ ... ∪
Aₙ) = 1.
• Independent Events: Occurrence of one does not affect the probability of the other: P(A ∩ B) =
P(A)·P(B).
• Equally Likely Events: Each outcome has the same probability of occurring.
1.2 Approaches to Defining Probability
Approach                 Definition                                 Formula                      Limitation
Classical                Based on equally likely outcomes           P(A) = n(A) / n(S)           Requires equally likely outcomes
Relative Frequency       Based on long-run observed frequencies     P(A) = lim(f/n) as n → ∞     Requires large number of trials
Subjective               Based on personal belief/expert judgment   No fixed formula             Not objective; varies by person
Axiomatic (Kolmogorov)   Based on three axioms                      0≤P(A)≤1; P(S)=1;            Abstract; most rigorous
                                                                    P(A∪B)=P(A)+P(B) for
                                                                    mutually exclusive A, B
1.3 Axioms of Probability (Kolmogorov)
• Axiom 1: For any event A: 0 ≤ P(A) ≤ 1.
• Axiom 2: P(S) = 1 (probability of sample space = 1).
• Axiom 3 (Additivity): If A and B are mutually exclusive: P(A ∪ B) = P(A) + P(B).
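For a finite sample space with equally likely outcomes, the axioms and the classical definition can be verified by direct enumeration. The sketch below uses a fair die; the helper name prob and the chosen events are assumptions made for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                  # sample space of a fair die

def prob(event):
    """Classical probability: n(A) / n(S) for equally likely outcomes."""
    return Fraction(len(event & S), len(S))

even = {2, 4, 6}
prime = {2, 3, 5}

# Axiom 3 applies directly to mutually exclusive events, e.g. {1} and {2}
p_one_or_two = prob({1} | {2})

# Events that overlap need the general addition rule
p_union = prob(even | prime)
p_rule = prob(even) + prob(prime) - prob(even & prime)
```

Both routes give P(even or prime) = 5/6, and P(S) = 1 as Axiom 2 requires.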
1.4 Graphical Displays for Events
• Venn Diagram: Visual representation of events and their relationships (union, intersection,
complement).
• Tree Diagram: Displays sequential events and their probabilities along branches.
• Probability Table: Lists outcomes and their probabilities; useful for compound events.
Key Examination Q&A
Q1. Define Probability. State its basic properties.
Ans: Probability is the numerical measure of likelihood of an event. Properties: (1) 0 ≤
P(A) ≤ 1; (2) P(S) = 1; (3) P(∅) = 0; (4) P(A') = 1 − P(A); (5) P(A∪B) =
P(A)+P(B)−P(A∩B).
Q2. A fair die is rolled. Find P(even) and P(prime).
Ans: P(even) = P({2,4,6}) = 3/6 = 0.5. P(prime) = P({2,3,5}) = 3/6 = 0.5. P(even or prime)
= P({2,3,4,5,6}) = 5/6.
Q3. Distinguish between Mutually Exclusive and Independent Events.
Ans: Mutually Exclusive: P(A∩B)=0 (cannot both occur). Independent:
P(A∩B)=P(A)×P(B) (occurrence of one does not affect the other). ME events are
generally NOT independent (unless one has zero probability).
Q4. What is the Classical Definition of Probability? State its limitations.
Ans: P(A) = n(A)/n(S) where outcomes are equally likely. Limitations: (1) restricted to
equally likely outcomes; (2) inapplicable to infinite sample spaces; (3) circular definition of
'equally likely'.
Chapter 2: Rules of Probability
2.1 Fundamental Rules
Complementation Rule: P(A') = 1 − P(A)
Special Addition Rule (ME events): P(A ∪ B) = P(A) + P(B) [if A and B
are mutually exclusive]
General Addition Rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
For three events: P(A∪B∪C) = P(A)+P(B)+P(C) − P(A∩B) − P(A∩C) − P(B∩C)
+ P(A∩B∩C)
Multiplication Rule (Independent): P(A ∩ B) = P(A) × P(B)
General Multiplication Rule: P(A ∩ B) = P(B) × P(A|B) = P(A) × P(B|A)
Conditional Probability: P(A|B) = P(A ∩ B) / P(B) [P(B) ≠ 0]
2.2 Bivariate Data & Contingency Table
A Contingency Table (cross-tabulation) displays joint frequencies of two categorical variables, allowing
calculation of joint, marginal, and conditional probabilities.
Example: 200 students surveyed on Pass/Fail and Male/Female.
Male Female Total
Pass 80 60 140
Fail 30 30 60
Total 110 90 200
P(Pass) = 140/200 = 0.70. P(Male ∩ Pass) = 80/200 = 0.40. P(Pass|Male) = 80/110 ≈
0.727.
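The joint, marginal, and conditional probabilities in this example can be reproduced from the cell counts. The dictionary layout and variable names below are just one possible way to hold the table.

```python
# Cell counts from the 200-student table, keyed by (result, sex)
counts = {
    ("Pass", "Male"): 80, ("Pass", "Female"): 60,
    ("Fail", "Male"): 30, ("Fail", "Female"): 30,
}
total = sum(counts.values())

# Marginal probability: row total / grand total
pass_total = sum(v for (result, _), v in counts.items() if result == "Pass")
p_pass = pass_total / total

# Joint probability: cell / grand total
p_male_and_pass = counts[("Pass", "Male")] / total

# Conditional probability: cell / column total
male_total = sum(v for (_, sex), v in counts.items() if sex == "Male")
p_pass_given_male = counts[("Pass", "Male")] / male_total
```

Note that P(Pass|Male) divides by the Male column total (110), not by the grand total.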
2.3 Bayes' Theorem
Bayes' Theorem: Provides a method for revising/updating probabilities using new
evidence. It combines prior probability with likelihood to give posterior probability.
P(Aᵢ | B) = P(Aᵢ) × P(B | Aᵢ) / Σⱼ [P(Aⱼ) × P(B | Aⱼ)]
Terminology: P(Aᵢ) = Prior probability. P(B|Aᵢ) = Likelihood. P(Aᵢ|B) = Posterior
probability.
Worked Example: Factory has Machines A and B. A produces 60% of output (3%
defective); B produces 40% (5% defective).
A defective item is found. What is the probability it came from Machine A?
P(A) = 0.6, P(B) = 0.4. P(D|A) = 0.03, P(D|B) = 0.05.
P(D) = 0.6×0.03 + 0.4×0.05 = 0.018 + 0.020 = 0.038.
P(A|D) = (0.6 × 0.03) / 0.038 = 0.018/0.038 ≈ 0.474 (47.4%).
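The same calculation can be written as a small function that works for any number of sources. This is a sketch; the function name and the dictionary-based interface are assumptions, while the numbers are those of the machine example.

```python
def bayes_posterior(priors, likelihoods):
    """Return posterior P(A_i | B) for each i, plus the total probability P(B)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}   # P(A_i) * P(B | A_i)
    evidence = sum(joint.values())                            # P(B), the denominator
    posterior = {k: v / evidence for k, v in joint.items()}
    return posterior, evidence

priors = {"A": 0.6, "B": 0.4}             # share of output per machine
defect_rates = {"A": 0.03, "B": 0.05}     # P(D | machine)

posterior, p_defective = bayes_posterior(priors, defect_rates)
```

Although Machine A makes most of the output, the posterior for A (about 0.474) is below its prior of 0.6 because A has the lower defect rate.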
Key Examination Q&A
Q1. State and explain the General Addition Rule of Probability.
Ans: P(A∪B) = P(A)+P(B)−P(A∩B). We subtract P(A∩B) to avoid double-counting
elements in the intersection. If A and B are mutually exclusive, P(A∩B) = 0, giving the
special addition rule.
Q2. Define Conditional Probability. Illustrate with an example.
Ans: P(A|B) = P(A∩B)/P(B). Example: P(Ace | Red card) = P(Red Ace)/P(Red) =
(2/52)/(26/52) = 2/26 = 1/13.
Q3. Explain Bayes' Theorem and its applications.
Ans: Bayes' theorem updates prior probabilities with new evidence: P(Aᵢ|B) =
P(Aᵢ)P(B|Aᵢ)/ΣP(Aⱼ)P(B|Aⱼ). Applications: medical diagnosis, spam filtering, quality control,
machine learning.
Q4. What is a Contingency Table? How is it used in probability?
Ans: A Contingency Table shows joint frequencies of two variables. From it we can
compute joint probability (cell/total), marginal probability (row or column total/grand total),
and conditional probability.
Chapter 3: Random Variables & Probability Distributions
Random Variable (X): A function that assigns a real number to each outcome in a
sample space. Discrete RV takes countable values; Continuous RV takes any value in
an interval.
3.1 Discrete Probability Distribution
• PMF (Probability Mass Function): P(X = x) = probability that X takes value x. Requirements:
P(X=x) ≥ 0 and ΣP(X=x) = 1.
• CDF (Cumulative Distribution Function): F(x) = P(X ≤ x) = Σ P(X=k) for k ≤ x.
• Expected Value: E(X) = μ = ΣxP(x).
• Variance: Var(X) = Σ(x−μ)²P(x) = E(X²) − [E(X)]².
3.2 Binomial Distribution
Conditions: (1) Fixed n trials, (2) each trial is Success/Failure, (3) constant probability p
of success, (4) trials are independent. X ~ B(n, p).
PMF: P(X = x) = C(n, x) × pˣ × (1−p)ⁿ⁻ˣ x = 0, 1, 2, ..., n
Mean & Variance: μ = np | σ² = np(1−p) | σ = √[np(1−p)]
Example: Toss 5 fair coins. X = number of heads. X ~ B(5, 0.5).
P(X = 3) = C(5,3) × (0.5)³ × (0.5)² = 10 × 0.125 × 0.25 = 0.3125.
μ = 5×0.5 = 2.5. σ² = 5×0.5×0.5 = 1.25. σ = 1.118.
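The coin example can be checked numerically with Python's standard library. The function name binomial_pmf is an illustrative choice.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.5
p_three_heads = binomial_pmf(3, n, p)

mean = n * p
variance = n * p * (1 - p)
sd = math.sqrt(variance)

# A valid PMF sums to 1 over x = 0, 1, ..., n
total_probability = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```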
3.3 Poisson Distribution
Models the number of rare events in a fixed interval of time or space, when events occur
independently at a constant average rate λ. X ~ P(λ).
PMF: P(X = x) = e⁻λ × λˣ / x! x = 0, 1, 2, 3, ...
Mean & Variance: μ = σ² = λ (unique property: mean = variance)
• Relationship to Binomial: When n is large and p is small (np = λ is moderate), Binomial ≈
Poisson with λ = np. Rule of thumb: n ≥ 20 and p ≤ 0.05.
Example: On average, 3 calls arrive per minute at a call center. Find P(X = 2).
P(X=2) = e⁻³ × 3² / 2! = 0.0498 × 9 / 2 = 0.2240.
P(X ≤ 2) = P(0)+P(1)+P(2) = 0.0498+0.1494+0.2240 = 0.4232.
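A short sketch reproduces the call-center figures. The function names are assumptions for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x): sum of the PMF from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 3                                   # average calls per minute
p_exactly_two = poisson_pmf(2, lam)
p_at_most_two = poisson_cdf(2, lam)
```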
3.4 Hypergeometric Distribution
Used for sampling WITHOUT replacement from a finite population. N = population, K =
successes in population, n = sample size.
PMF: P(X = x) = C(K, x) × C(N−K, n−x) / C(N, n)
Mean & Variance: μ = nK/N | σ² = n(K/N)(1−K/N) × (N−n)/(N−1)
Finite Population Correction (FPC): The factor (N−n)/(N−1) corrects for sampling
without replacement. As N → ∞, this → 1 and Hypergeometric → Binomial.
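As a check on the formulas above, the sketch below evaluates the hypergeometric PMF and compares the mean and variance computed from it with the closed forms. The values N = 20, K = 5, n = 4 are hypothetical and chosen only for illustration.

```python
import math

def hypergeom_pmf(x, N, K, n):
    """P(X = x) = C(K, x) * C(N - K, n - x) / C(N, n)."""
    return math.comb(K, x) * math.comb(N - K, n - x) / math.comb(N, n)

N, K, n = 20, 5, 4                        # hypothetical population, successes, sample size
support = range(max(0, n - (N - K)), min(n, K) + 1)
pmf = {x: hypergeom_pmf(x, N, K, n) for x in support}

mean_from_pmf = sum(x * p for x, p in pmf.items())
var_from_pmf = sum((x - mean_from_pmf) ** 2 * p for x, p in pmf.items())

mean_formula = n * K / N
fpc = (N - n) / (N - 1)                   # finite population correction
var_formula = n * (K / N) * (1 - K / N) * fpc
```

Here the correction factor is 16/19, so the variance is smaller than the binomial value n(K/N)(1−K/N) = 0.75.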
3.5 Normal Distribution
The most important continuous distribution. Bell-shaped, symmetric about μ. X ~ N(μ,
σ²).
PDF: f(x) = [1/(σ√(2π))] × exp[−(x−μ)²/(2σ²)] −∞ < x < ∞
Standard Normal Transform: Z = (X − μ) / σ Z ~ N(0, 1)
• Properties: Mean = Median = Mode = μ.
Total area under curve = 1.
Symmetric about μ: P(X > μ) = P(X < μ) = 0.5.
Empirical Rule: μ±1σ ≈ 68%; μ±2σ ≈ 95%; μ±3σ ≈ 99.7%.
Determined completely by μ and σ.
Example: X ~ N(50, 100). Find P(40 < X < 60).
Z₁ = (40−50)/10 = −1. Z₂ = (60−50)/10 = +1.
P(−1 < Z < 1) = 2×Φ(1) − 1 = 2×0.8413 − 1 = 0.6826, i.e. 68.26%.
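When no Z-table is at hand, Φ can be computed from the error function in Python's math module. The sketch below repeats the example; the function names are illustrative. Because it does not use a four-decimal table, it gives 0.6827 rather than the table-based 0.6826.

```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    z1 = (a - mu) / sigma
    z2 = (b - mu) / sigma
    return phi(z2) - phi(z1)

# X ~ N(50, 100): the variance is 100, so sigma = 10
prob_40_to_60 = normal_prob_between(40, 60, 50, 10)
```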
Key Examination Q&A
Q1. State the conditions for a Binomial Distribution.
Ans: (1) Fixed number of trials n. (2) Each trial has only two outcomes: Success (p) or
Failure (q = 1−p). (3) p is constant for all trials. (4) Trials are independent.
Q2. What is the unique property of the Poisson distribution?
Ans: For Poisson, Mean = Variance = λ. A sample mean close to the sample variance is
consistent with a Poisson model, although it does not prove one.
Q3. When does the Poisson distribution approximate the Binomial?
Ans: When n is large (≥ 20) and p is small (≤ 0.05), Binomial(n,p) ≈ Poisson(λ = np).
Q4. State the properties of the Normal Distribution.
Ans: (1) Bell-shaped and symmetric. (2) Mean = Median = Mode. (3) Area = 1. (4)
Completely defined by μ and σ. (5) Empirical Rule: 68-95-99.7%.
Q5. How do you find P(a < X < b) for a Normal distribution?
Ans: Convert to Z-scores: Z = (X−μ)/σ. Then use Z-table (Standard Normal table):
P(a<X<b) = P(Z₁ < Z < Z₂) = Φ(Z₂) − Φ(Z₁).
Q6. Compare Hypergeometric and Binomial Distributions.
Ans: Hypergeometric: sampling without replacement, finite population, no constant p.
Binomial: sampling with replacement (or large N), constant p. When N is large,
Hypergeometric ≈ Binomial.
Chapter 4: Sampling Distribution
Sampling Distribution: The probability distribution of a sample statistic (e.g., X̄, p̂)
computed from all possible samples of size n drawn from the same population.
4.1 Sampling Distribution of the Sample Mean
• If population is Normal with mean μ and variance σ², then X̄ ~ N(μ, σ²/n) exactly.
• E(X̄) = μ (sample mean is an unbiased estimator of population mean).
• SE(X̄) = σ/√n (Standard Error — decreases as n increases).
Z-statistic (σ known): Z = (X̄ − μ) / (σ/√n) ~ N(0, 1)
t-statistic (σ unknown): t = (X̄ − μ) / (s/√n) ~ t with n−1 degrees of freedom
4.2 Central Limit Theorem (CLT)
Central Limit Theorem: For any population with finite mean μ and finite variance σ², the
sampling distribution of the sample mean X̄ approaches a Normal distribution as n → ∞. As a
rule of thumb, n ≥ 30 is usually adequate; strongly skewed populations may need a larger n.
This is the most important theorem in statistics — it justifies using Normal-based methods for
large samples even when the population is non-normal.
X̄ ~ N(μ, σ²/n) approximately, for large n
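A small simulation makes the theorem concrete. The sketch below draws repeated samples from an exponential population (strongly right-skewed, with μ = 1 and σ = 1) and looks at the sample means. The choice of population, n = 36, 5000 replications, and the seed are all assumptions made for the illustration.

```python
import random
import statistics

rng = random.Random(0)                    # fixed seed so the run is repeatable
n, replications = 36, 5000

# Each entry is the mean of one sample of size n from an Exponential(1) population
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu = 1
sd_of_means = statistics.stdev(sample_means)     # should be close to sigma / sqrt(n)
theoretical_se = 1 / n ** 0.5                    # 1/6, about 0.167
```

A histogram of sample_means would look roughly bell-shaped even though the population itself is far from Normal.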
4.3 Sampling Distribution of Sample Proportion
E(p̂) = p | SE(p̂) = √[p(1−p)/n]
Z-statistic for proportion: Z = (p̂ − p) / √[p(1−p)/n]
4.4 Confidence Intervals
95% CI for μ (σ known): X̄ ± 1.96 × (σ/√n)
95% CI for μ (σ unknown): X̄ ± t(α/2, n−1) × (s/√n)
CI for proportion p: p̂ ± Zα/2 × √[p̂(1−p̂)/n]
Sample Size (for mean): n = [Zα/2 × σ / E]²
Sample Size (for proportion): n = Zα/2² × p(1−p) / E²
Example: n=36, X̄=50, σ=12. Construct a 95% CI for μ.
SE = 12/√36 = 2. CI = 50 ± 1.96×2 = 50 ± 3.92 = (46.08, 53.92).
Interpretation: We are 95% confident that the true population mean lies between 46.08
and 53.92.
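The interval in this example follows directly from the formula X̄ ± Z × (σ/√n). A minimal sketch, with an illustrative function name and the 95% value Z = 1.96 as the default:

```python
import math

def z_confidence_interval(xbar, sigma, n, z=1.96):
    """CI for the mean when sigma is known: xbar +/- z * sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    margin = z * standard_error
    return xbar - margin, xbar + margin

lower, upper = z_confidence_interval(xbar=50, sigma=12, n=36)

# Using 2.576 instead gives the wider 99% interval
lower_99, upper_99 = z_confidence_interval(50, 12, 36, z=2.576)
```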
Key Examination Q&A
Q1. State the Central Limit Theorem. Why is it important?
Ans: CLT: For large n (rule of thumb: n ≥ 30), X̄ is approximately N(μ, σ²/n) regardless of
the population distribution, provided σ² is finite. Importance: enables use of the Normal
distribution for inference about means from any such population.
Q2. What is Standard Error? How does it differ from Standard Deviation?
Ans: SE = σ/√n — measures variability of the sample mean across repeated samples.
SD measures spread of individual observations. SE decreases with larger n; SD is a
population characteristic.
Q3. Interpret a 95% Confidence Interval.
Ans: A 95% CI means: if we repeated sampling many times, 95% of the constructed
intervals would contain the true population parameter. It does NOT mean there is a 95%
probability that μ is in a specific interval.
Q4. How does sample size affect the Confidence Interval?
Ans: As n increases, SE = σ/√n decreases, making the CI narrower (more precise
estimate). Doubling precision (halving E) requires quadrupling the sample size.
Q5. What is the sampling distribution of p̂?
Ans: p̂ ~ approximately N(p, p(1−p)/n) for large n. E(p̂) = p; SE(p̂) = √[p(1−p)/n].
Chapter 5: Basic Concepts of Hypothesis Testing
Hypothesis Testing: A formal statistical procedure for deciding whether the evidence
from sample data supports or contradicts a specific claim (hypothesis) about a
population parameter.
5.1 Key Terminology
• Null Hypothesis (H₀): The initial claim, assumed true unless the sample evidence contradicts it. It usually
states 'no difference', 'no effect', or 'equals a specific value'. e.g., H₀: μ = 50.
• Alternative Hypothesis (H₁ or Hₐ): The claim we want to support — states there IS a difference
or effect. e.g., H₁: μ ≠ 50 or μ > 50 or μ < 50.
• Significance Level (α): The maximum acceptable probability of a Type I Error. Common values:
α = 0.05 (5%) or α = 0.01 (1%).
• Type I Error (α): Rejecting a TRUE H₀ — False Positive. Probability = α (significance level).
• Type II Error (β): Failing to reject a FALSE H₀ — False Negative. Probability = β.
• Power of Test (1 − β): Probability of correctly rejecting a false H₀. Higher is better.
• p-value: The probability of observing a test statistic as extreme as, or more extreme than, the
one calculated, assuming H₀ is true. Reject H₀ if p-value < α.
• Critical Region: The range of test statistic values for which H₀ is rejected. Bounded by critical
value(s).
• Test Statistic: A numerical value computed from sample data used to decide whether to reject
H₀.
5.2 Error Table
Decision / Reality   H₀ is TRUE                         H₀ is FALSE
Reject H₀            Type I Error (α): False Positive   Correct Decision (Power = 1−β)
Do Not Reject H₀     Correct Decision (1−α)             Type II Error (β): False Negative
5.3 General Procedure (6 Steps)
1. State the null hypothesis H₀ and alternative hypothesis H₁.
2. Choose the significance level α (0.05 or 0.01).
3. Select the appropriate test statistic.
4. Determine the critical value and define the rejection region.
5. Calculate the test statistic from sample data.
6. Make decision: if test statistic falls in rejection region (or p-value < α), reject H₀; otherwise, fail to
reject H₀.
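The six steps can be traced in code for a two-tailed one-sample Z-test. The data are hypothetical (H₀: μ = 50, X̄ = 53, σ = 12, n = 36), and the function name is an assumption. NormalDist comes from Python's statistics module (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tailed Z-test of H0: mu = mu0 against H1: mu != mu0 (sigma known)."""
    standard_normal = NormalDist()                          # N(0, 1)
    z = (xbar - mu0) / (sigma / math.sqrt(n))               # step 5: test statistic
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)     # step 4: critical value
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    reject = abs(z) > z_critical                            # step 6: decision
    return z, z_critical, p_value, reject

# Hypothetical sample summary
z, z_critical, p_value, reject = one_sample_z_test(xbar=53, mu0=50, sigma=12, n=36)
```

Here Z = 1.5 is inside ±1.96 and the p-value is about 0.134, so H₀ is not rejected at α = 0.05. The critical-value rule and the p-value rule always lead to the same decision.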
5.4 Tests Based on Normal Distribution
One-Sample Z-test: Z = (X̄ − μ₀) / (σ/√n) [σ known; Normal population or n ≥ 30]
Z-test for Two Population Means: Z = (X̄₁ − X̄₂) / √(σ₁²/n₁ + σ₂²/n₂)
5.5 t-Distribution Tests
One-Sample t-test: t = (X̄ − μ₀) / (s/√n) df = n − 1
Pooled (Independent Samples) t-test: t = (X̄₁ − X̄₂) / [Sp × √(1/n₁ + 1/n₂)] df = n₁+n₂−2
Pooled Standard Deviation: Sp = √{[(n₁−1)s₁² + (n₂−1)s₂²] / (n₁+n₂−2)}
Paired t-test (Matched Pairs): t = D̄ / (sD/√n) df = n − 1
Paired t-test: Used when observations come in pairs (e.g., before/after measurements
on the same subject). D = difference for each pair; D̄ = mean difference; sD = SD of
differences.
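The pooled and paired statistics can be computed with the standard library alone. The data sets below are small hypothetical examples and the function names are illustrative. Note that the division by n₁+n₂−2 sits inside the square root when forming Sp.

```python
import math
import statistics

def pooled_t(x1, x2):
    """Pooled t statistic and df for two independent samples (equal variances assumed)."""
    n1, n2 = len(x1), len(x2)
    s1_sq, s2_sq = statistics.variance(x1), statistics.variance(x2)
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    t = (statistics.mean(x1) - statistics.mean(x2)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def paired_t(before, after):
    """Paired t statistic and df, based on the differences D = before - after."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical data
t_pooled, df_pooled = pooled_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0])
t_paired, df_paired = paired_t([10.0, 12.0, 14.0, 16.0], [8.0, 11.0, 11.0, 14.0])
```

For these data t_pooled is about −3.873 with 5 df and t_paired is about 4.899 with 3 df; each would be compared with the t critical value for its df.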
5.6 Chi-Square (χ²) Test
Test Statistic: χ² = Σ (O − E)² / E df = (r−1)(c−1) for a contingency table; df = k−1 for
goodness of fit with k categories (when no parameters are estimated)
• Goodness of Fit: Tests whether observed frequencies match expected frequencies from a
theoretical distribution.
• Test of Independence: Tests whether two categorical variables are independent of each other.
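As an illustration of the test of independence, the sketch below applies the χ² formula to the Pass/Fail by Male/Female table from Part II, Section 2.2. Expected counts use E = (row total × column total) / grand total; the function name is an assumption.

```python
def chi_square_independence(table):
    """Chi-square statistic and df for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: Pass, Fail; columns: Male, Female
chi2, df = chi_square_independence([[80, 60], [30, 30]])
critical_value = 3.841                   # chi-square table value for df = 1, alpha = 0.05
reject = chi2 > critical_value
```

The statistic is about 0.866 with 1 df, below 3.841, so these data give no evidence that result and sex are dependent.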
5.7 F-Distribution Test
F-test for Equality of Variances: F = s₁² / s₂² df₁ = n₁−1, df₂ = n₂−1
5.8 One-tailed vs. Two-tailed Tests
Test Type                   H₁ Form   Rejection Region   Use When
Left-tailed (one-tailed)    μ < μ₀    Z < −Zα            Testing for decrease
Right-tailed (one-tailed)   μ > μ₀    Z > +Zα            Testing for increase
Two-tailed                  μ ≠ μ₀    |Z| > Zα/2         Testing for any change
Key Examination Q&A
Q1. Define Type I and Type II errors. What is the trade-off between them?
Ans: Type I (α): rejecting a true H₀ (false positive). Type II (β): not rejecting a false H₀
(false negative). For a fixed sample size, reducing α increases β and reducing β increases α.
Increasing the sample size is the way to reduce both.
Q2. What is a p-value? How is it used in hypothesis testing?
Ans: The p-value is the probability of getting a result at least as extreme as the one observed,
given H₀ is true. If p-value < α, reject H₀. A small p-value provides strong evidence against H₀.
Q3. When is the t-test used instead of the Z-test?
Ans: The t-test is used when the population SD (σ) is unknown and is estimated by s, assuming
an approximately Normal population. The choice matters most when n < 30. As n → ∞, the
t-distribution approaches the standard normal.
Q4. What is the Paired t-test? Give an example.
Ans: Used when two measurements are taken from the same subject (matched pairs).
E.g., blood pressure before and after treatment. Compute D = X₁−X₂ for each pair; test
H₀: μD = 0 using t = D̄/(sD/√n).
Q5. State the conditions for applying a Chi-Square test of independence.
Ans: (1) Random sample; (2) observations are independent; (3) all expected frequencies
≥ 5 (merge cells if needed); (4) data is in frequency/count form.
Chapter 6: Analysis of Variance (ANOVA)
ANOVA: A statistical method for comparing the means of three or more populations
simultaneously using variance analysis. It tests H₀: μ₁ = μ₂ = ... = μk vs. H₁: at least one
mean is different.
Key concept: Total variation = Variation between groups + Variation within groups.
6.1 Assumptions of ANOVA
• Normality: The populations from which samples are drawn are normally distributed.
• Homoscedasticity: All populations have equal variances (σ₁² = σ₂² = ... = σk²).
• Independence: Observations within and between groups are independent.
• Random Sampling: Data is collected by random sampling.
6.2 One-Way ANOVA
SST = SSB + SSW: Total SS = Between-group SS + Within-group SS
SSB (Between Groups): SSB = Σnᵢ(X̄ᵢ − X̄)²
SSW (Within Groups): SSW = ΣΣ(Xᵢⱼ − X̄ᵢ)²
F Statistic: F = MSB / MSW = (SSB/(k−1)) / (SSW/(N−k))
6.3 ANOVA Table
Source of Variation          SS    df    MS                F-ratio
Between Groups (Treatment)   SSB   k−1   MSB = SSB/(k−1)   F = MSB/MSW
Within Groups (Error)        SSW   N−k   MSW = SSW/(N−k)   —
Total                        SST   N−1   —                 —
k = number of groups; N = total observations; N − k = error degrees of
freedom; k − 1 = treatment degrees of freedom.
Decision rule: Reject H₀ if F_calculated > F_critical(k−1, N−k) at
significance level α.
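The entries of the ANOVA table can be computed directly from the definitions of SSB and SSW. The three groups of data below are hypothetical, and the function name is an assumption made for the sketch.

```python
def one_way_anova(groups):
    """Return SSB, SSW, SST and the F-ratio for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    msb = ssb / (k - 1)                  # between-groups mean square
    msw = ssw / (n_total - k)            # within-groups mean square
    return ssb, ssw, sst, msb / msw

# Hypothetical data: k = 3 groups, N = 9 observations
ssb, ssw, sst, f_ratio = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For these data SSB = 26, SSW = 6, SST = 32 and F = 13 on (2, 6) degrees of freedom, which would be compared with F_critical(2, 6) from an F-table.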
6.4 Two-Way ANOVA
Analyzes the effect of two factors (A and B) simultaneously.
Without Interaction: SST = SS(A) + SS(B) + SSE
With Interaction: SST = SS(A) + SS(B) + SS(A×B) + SSE
• Main Effect A: Average effect of Factor A across all levels of B.
• Main Effect B: Average effect of Factor B across all levels of A.
• Interaction (A×B): Effect of Factor A depends on the level of Factor B (and vice versa).
6.5 Multiple Comparison Tests (Post-hoc)
After rejecting H₀ in ANOVA, post-hoc tests identify which specific group means differ:
• Tukey's HSD (Honestly Significant Difference): Controls familywise error rate; compares all
pairwise means.
• Scheffe's Test: Most conservative; works for all contrasts; appropriate when comparisons are
not pre-planned.
• Bonferroni Correction: Adjusts α for multiple comparisons: α* = α/m where m = number of
comparisons.
• LSD (Least Significant Difference): Most liberal; like multiple t-tests; higher chance of Type I
error.
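The reason for adjusting α can be seen numerically. The sketch below assumes m independent tests, which is a simplification (pairwise comparisons on the same data are not fully independent), and uses illustrative function names.

```python
def familywise_error(alpha, m):
    """Chance of at least one Type I error in m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / m

m = 3                                     # e.g. all pairwise comparisons among 3 groups
unadjusted = familywise_error(0.05, m)    # about 0.143 without any correction
alpha_star = bonferroni_alpha(0.05, m)    # about 0.0167 per comparison
adjusted = familywise_error(alpha_star, m)
```

With the Bonferroni level, the familywise error stays just under the intended 0.05.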
6.6 Significance of Correlation & Rank Correlation Coefficients
Test for r (correlation significance): t = r √(n−2) / √(1−r²) df = n−2
H₀: ρ = 0
Test for rs (rank correlation): t = rs √(n−2) / √(1−rs²) df = n−2
Key Examination Q&A
Q1. What is ANOVA? Why is it preferred over multiple t-tests?
Ans: ANOVA tests equality of multiple means simultaneously. Multiple t-tests inflate the Type
I error: each test has probability α of a false rejection, so across m independent tests the
chance of at least one is 1 − (1−α)^m. ANOVA controls the overall error rate at α.
Q2. State the assumptions of One-Way ANOVA.
Ans: (1) Populations are normally distributed; (2) Equal variances (homoscedasticity); (3)
Independent random samples; (4) Observations within groups are independent.
Q3. Interpret the F-ratio in ANOVA.
Ans: F = MSB/MSW. If F ≈ 1, between-group variation ≈ within-group variation — no
evidence against H₀. If F >> 1, between-group variation is large — evidence of real
differences between means.
Q4. What is the difference between One-Way and Two-Way ANOVA?
Ans: One-Way: one factor, tests if means differ across k groups. Two-Way: two factors,
tests both main effects and their interaction. Two-Way also tests whether the effect of
Factor A depends on Factor B.
Q5. What are post-hoc tests? When are they used?
Ans: Post-hoc tests (Tukey, Scheffe, Bonferroni, LSD) are used AFTER ANOVA rejects
H₀ to identify which specific pairs of group means are significantly different.
Chapter 7: Experimental Designs
Experimental Design: The plan for conducting an experiment that specifies how
subjects/units are assigned to treatments, with the goal of obtaining valid and efficient
data on cause-and-effect relationships.
7.1 Basic Principles of Experimental Design
(1) Randomization
Randomly assigning treatments to experimental units ensures that extraneous variables are balanced
across groups, eliminating systematic bias and enabling valid statistical inference.
Without randomization, observed differences between groups may be due to pre-existing
differences rather than the treatment itself.
(2) Replication
Applying each treatment to more than one experimental unit allows estimation of experimental error
(variability), increases the precision of treatment comparisons, and provides degrees of freedom for
significance testing.
• More replications: more reliable estimates of treatment means and more statistical power.
• Minimum: at least 2 replications per treatment; more for precise experiments.
(3) Local Control (Error Control)
Techniques used to reduce or eliminate the effect of known extraneous (nuisance) variables on the
response variable:
• Blocking: Group homogeneous experimental units into blocks; apply all treatments within each
block. Variation between blocks is removed from error.
• Covariates (ANCOVA): Include related continuous variables as covariates to adjust treatment
means.
• Balanced Designs: Equal replication for each treatment ensures equal precision.
7.2 Completely Randomized Design (CRD)
The simplest experimental design. All experimental units are homogeneous; treatments
are randomly assigned to units with no blocking. Appropriate for laboratory or
greenhouse experiments.
Statistical Model: Yᵢⱼ = μ + τᵢ + εᵢⱼ
Where: μ = overall mean, τᵢ = effect of the i-th treatment, εᵢⱼ = random error ~ N(0, σ²). i =
1,...,k treatments; j = 1,...,nᵢ replications.
Feature                      Details
Number of factors            1 (treatment factor only)
Blocking                     None
ANOVA model                  One-Way ANOVA
Degrees of freedom (Error)   N−k
Advantages                   Simple layout; flexible (unequal replications allowed)
Disadvantages                Requires homogeneous units; inefficient if units vary widely
7.3 Randomized Complete Block Design (RCBD)
Experimental units are grouped into blocks such that units within each block are similar
(homogeneous). All treatments appear in each block exactly once. Removes block-to-
block variation from error.
Statistical Model: Yᵢⱼ = μ + τᵢ + βⱼ + εᵢⱼ
Where: μ = overall mean, τᵢ = treatment effect, βⱼ = block effect, εᵢⱼ = random error. i =
1,...,k; j = 1,...,b (blocks).
ANOVA Table for RCBD:
Source SS df MS F
Treatments SS(T) k−1 MS(T) MS(T)/MSE
Blocks SS(B) b−1 MS(B) MS(B)/MSE
Error SSE (k−1)(b−1) MSE —
Total SST kb − 1 — —
• Advantages: Removes block variation from error → more sensitive F-test; flexible number of
treatments.
• Disadvantages: Missing data complicates analysis; blocks must be complete.
• When to use: When there is one known source of variability (e.g., field fertility gradient, animal
litter, operator skill).
Example: Comparing 4 fertilizers (treatments) on crops. Blocks = 3 fields with different
soil types.
Each fertilizer is applied to one plot in each field. This removes soil-type variation from
error.
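The RCBD sums of squares can be sketched for a layout like this one, with 4 treatments and 3 blocks. The yield figures below are hypothetical and the function name is an assumption; the formulas follow the ANOVA table for RCBD given above.

```python
def rcbd_anova(y):
    """y[i][j] = response for treatment i in block j (one observation per cell)."""
    k, b = len(y), len(y[0])
    grand_mean = sum(sum(row) for row in y) / (k * b)
    treatment_means = [sum(row) / b for row in y]
    block_means = [sum(col) / k for col in zip(*y)]

    ss_t = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_b = k * sum((m - grand_mean) ** 2 for m in block_means)
    ss_total = sum((v - grand_mean) ** 2 for row in y for v in row)
    ss_e = ss_total - ss_t - ss_b        # error SS by subtraction

    mse = ss_e / ((k - 1) * (b - 1))
    f_treatments = (ss_t / (k - 1)) / mse
    f_blocks = (ss_b / (b - 1)) / mse
    return ss_t, ss_b, ss_e, ss_total, f_treatments, f_blocks

# Hypothetical yields: rows = 4 fertilizers, columns = 3 fields (blocks)
yields = [
    [10, 12, 14],
    [12, 15, 15],
    [14, 15, 19],
    [16, 18, 20],
]
ss_t, ss_b, ss_e, ss_total, f_treatments, f_blocks = rcbd_anova(yields)
```

For these numbers SS(T) = 60, SS(B) = 32 and SSE = 4 out of SST = 96. Had the block variation been left in the error term, as in a CRD, the error SS would have been 36 rather than 4.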
7.4 Latin Square Design (LSD)
Controls two sources of variability (row and column effects) simultaneously. Requires k
treatments, k rows, and k columns. Each treatment appears exactly once in each row
and each column.
Statistical Model: Yᵢⱼₖ = μ + τᵢ + ρⱼ + γₖ + εᵢⱼₖ
Where: τᵢ = treatment effect, ρⱼ = row effect, γₖ = column effect, εᵢⱼₖ = error. Number of
treatments = rows = columns = k.
Example of a 4×4 Latin Square (k=4 treatments: A, B, C, D):
Columns → Col1 Col2 Col3 Col4
Row 1: A B C D
Row 2: B C D A
Row 3: C D A B
Row 4: D A B C
Each treatment appears exactly once per row and once per column.
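The square shown above is a cyclic Latin square: each row is the previous row shifted one place to the left. A short sketch can build such a square and check the defining property. The function names are illustrative, and in practice the rows, columns, and treatment labels are usually randomized before use.

```python
def cyclic_latin_square(treatments):
    """Row r is the treatment list shifted left by r positions."""
    k = len(treatments)
    return [[treatments[(r + c) % k] for c in range(k)] for r in range(k)]

def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    k = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == k and rows_ok and cols_ok

square = cyclic_latin_square(["A", "B", "C", "D"])
valid = is_latin_square(square)

k = len(square)
error_df = (k - 1) * (k - 2)             # 6 error degrees of freedom when k = 4
```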
ANOVA Table for Latin Square:
Source SS df MS F
Treatments SS(T) k−1 MS(T) MS(T)/MSE
Rows SS(R) k−1 MS(R) MS(R)/MSE
Columns SS(C) k−1 MS(C) MS(C)/MSE
Error SSE (k−1)(k−2) MSE —
Total SST k² − 1 — —
• Advantages: Controls two nuisance variables simultaneously; more efficient than RCBD when
two blocking factors exist.
• Disadvantages: Requires k = rows = columns = treatments (restrictive); error df = (k−1)(k−2)
may be small for small k; no interaction between row, column, and treatment can be estimated.
• Minimum k: k ≥ 4 is recommended for sufficient error degrees of freedom.
7.5 Comprehensive Comparison of Designs
Feature                      CRD                      RCBD                         Latin Square
Number of blocking factors   0                        1                            2
Statistical Model            Yᵢⱼ = μ+τᵢ+εᵢⱼ           Yᵢⱼ = μ+τᵢ+βⱼ+εᵢⱼ            Yᵢⱼₖ = μ+τᵢ+ρⱼ+γₖ+εᵢⱼₖ
ANOVA type                   One-Way                  Two-Way (no interaction)     Three-Way (no interaction)
Error df                     N−k                      (k−1)(b−1)                   (k−1)(k−2)
Efficiency                   Low (if units vary)      Higher than CRD              Highest (with 2 block factors)
Layout requirement           None                     Complete blocks              k×k square (k = treatments)
Suitable for                 Homogeneous units; lab   1 nuisance variable; field   2 nuisance variables; field/industry
Key Examination Q&A
Q1. What are the three basic principles of experimental design? Explain each.
Ans: (1) Randomization: assign treatments randomly to eliminate bias. (2) Replication:
apply each treatment multiple times to estimate experimental error. (3) Local Control: use
blocking or other techniques to reduce the effect of known extraneous variables.
Q2. Compare CRD, RCBD, and Latin Square Design in detail.
Ans: CRD: no blocking, one-way ANOVA, suitable for homogeneous units. RCBD: one
blocking factor, removes its variation, more efficient than CRD. Latin Square: two
blocking factors, most efficient but restrictive (requires k=rows=columns=treatments).
Q3. When would you use an RCBD instead of a CRD?
Ans: Use RCBD when experimental units are heterogeneous and can be grouped into
homogeneous blocks based on one known nuisance variable (e.g., soil type, litter, time
period). Blocking removes this variation from error.
Q4. State the statistical model for RCBD and identify each term.
Ans: Yᵢⱼ = μ + τᵢ + βⱼ + εᵢⱼ. μ = overall mean; τᵢ = i-th treatment effect; βⱼ = j-th block effect
(nuisance); εᵢⱼ ~ N(0, σ²) = random error.
Q5. What are the limitations of Latin Square Design?
Ans: (1) Requires equal number of treatments, rows, and columns. (2) Error df =
(k−1)(k−2) is very small for k=3 (only 2 df). (3) Cannot estimate interactions. (4) Only
practical for k = 4 to 8 treatments. (5) Missing data is complex to handle.
Q6. What is the advantage of RCBD over CRD in terms of efficiency?
Ans: RCBD removes block-to-block variation from the error term, making MSE smaller. A
smaller MSE → larger F-ratio → more sensitive test → higher power to detect treatment
differences. This efficiency gain depends on how much variability exists between blocks.
MASTER FORMULA REFERENCE SHEET
Topic Key Formulas
Arithmetic Mean X̄ = ΣX/n | Grouped: X̄ = Σfm/Σf | Short-cut: X̄ = A + (Σfd/Σf)h
Median (Grouped) M = L + [(n/2 − F) / f] × h
Mode (Grouped) Z = L + [(f1−f0) / (2f1−f0−f2)] × h
GM GM = (X1·X2·...·Xn)^(1/n) | log(GM) = Σlog(X)/n
HM HM = n / Σ(1/X)
AM ≥ GM ≥ HM Equality only when all values are equal
Standard Deviation σ = √[Σ(X−X̄)²/N] | s = √[Σ(X−X̄)²/(n−1)]
Variance (computational) σ² = ΣX²/N − (X̄)²
CV CV = (σ / X̄) × 100%
Pearson r r = Σ(X−X̄)(Y−Ȳ) / √[Σ(X−X̄)²·Σ(Y−Ȳ)²]
Rank Correlation rs = 1 − 6ΣD²/[n(n²−1)]
Regression b b = [nΣXY−ΣXΣY] / [nΣX²−(ΣX)²] | a = Ȳ − bX̄
R² R² = r² (proportion of variation explained)
Karl Pearson Skewness Sk = 3(Mean − Median)/SD | Range: −3 to +3
Fisher's Index P01 = √(Laspeyres × Paasche)
Normal Z Z = (X−μ)/σ | Z = (X̄−μ)/(σ/√n)
Binomial P(X=x) = C(n,x)pˣ(1−p)ⁿ⁻ˣ | μ=np | σ²=np(1−p)
Poisson P(X=x) = e⁻λλˣ/x! | μ=σ²=λ
Bayes' Theorem P(Aᵢ|B) = P(Aᵢ)P(B|Aᵢ) / ΣP(Aⱼ)P(B|Aⱼ)
Conditional Probability P(A|B) = P(A∩B) / P(B)
One-Sample Z-test Z = (X̄−μ₀)/(σ/√n) | σ known
One-Sample t-test t = (X̄−μ₀)/(s/√n) | df = n−1
Paired t-test t = D̄/(sD/√n) | df = n−1
Pooled t-test t = (X̄₁−X̄₂)/[Sp√(1/n₁+1/n₂)] | df = n₁+n₂−2
Chi-Square χ² = Σ(O−E)²/E | df = (r−1)(c−1)
One-Way ANOVA F F = MSB/MSW = (SSB/(k−1))/(SSW/(N−k))
95% CI for μ X̄ ± 1.96σ/√n (Z) | X̄ ± t·s/√n (t)
Sample Size (mean) n = (Zα/2·σ/E)²
Sample Size (proportion) n = Zα/2²·p(1−p)/E²
CBR / CDR = (Births or Deaths / Mid-year Population) × 1000
TFR TFR = 5 × Σ(ASFR) (for five-year age groups)
RCBD Model Yᵢⱼ = μ + τᵢ + βⱼ + εᵢⱼ
Latin Square Model Yᵢⱼₖ = μ + τᵢ + ρⱼ + γₖ + εᵢⱼₖ
Best of luck in your examinations!