Understanding Variability in Statistics

The document discusses variability in data, explaining how it is measured through range, variance, and standard deviation. It provides formulas for calculating these measures, along with examples and problems for practice. Additionally, it touches on degrees of freedom and measures of variability for qualitative and ranked data.

Uploaded by

MOnika

Describing Variability

Variability describes how the data points (scores) are scattered around a central point such as the mean.
It is the extent to which data points in a statistical distribution or data set differ from the average
value and from each other. Variability can be measured with the range, the interquartile range,
and the variance or standard deviation.

RANGE
The range in statistics for a given data set is the difference between the highest and lowest values.
For example, if the given data set is {2, 5, 8, 10, 3}, then the range will be 10 – 2 = 8.

Problem 1: Find the range of given observations: 32, 41, 28, 54, 35, 26, 23, 33, 38, 40.
Solution: Let us first arrange the given values in ascending order: 23, 26, 28, 32, 33, 35, 38, 40, 41, 54
Since 23 is the lowest value and 54 is the highest value, the range of the observations is:
Range (X) = Max (X) – Min (X) = 54 – 23 = 31
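
The range calculation above can be sketched in a few lines of Python. The function name data_range is an illustrative choice, not something defined in these notes.

```python
def data_range(values):
    """Return the range of a data set: highest value minus lowest value."""
    if not values:
        raise ValueError("range is undefined for an empty data set")
    return max(values) - min(values)


# Observations from Problem 1; sorting is not needed for max() and min().
observations = [32, 41, 28, 54, 35, 26, 23, 33, 38, 40]
print(data_range(observations))  # 31
```

Sorting the values, as in the worked solution, is only a convenience for reading off the extremes by eye.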

VARIANCE
Variance is a measure of how far a set of data points is spread out from the mean (average) value.
It is the average of the squared deviations from the mean, and it therefore equals the square of the
standard deviation.
Variance = (Standard deviation)², σ² = Σ(X − μ)² / N
In the formula above, μ represents the mean of the data points, X is the value of an individual data point,
and N is the total number of data points.
Problem 2: Find the variance: X = 5, 8, 6, 10, 12, 9, 11, 10, 12, 7
Solution
Mean = ΣX / N, N = 10
ΣX = 5 + 8 + 6 + 10 + 12 + 9 + 11 + 10 + 12 + 7 = 90
Mean, μ = 90 / 10 = 9
Deviations from the mean, X − μ = −4, −1, −3, 1, 3, 0, 2, 1, 3, −2
(X − μ)² = 16, 1, 9, 1, 9, 0, 4, 1, 9, 4
Σ(X − μ)² = 16 + 1 + 9 + 1 + 9 + 0 + 4 + 1 + 9 + 4 = 54

σ² = Σ(X − μ)² / N = 54 / 10 = 5.4
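
As a check on the arithmetic, the same steps (mean, deviations, squared deviations, average) can be written as a short Python sketch. The function name population_variance is hypothetical, and the data are treated as a whole population, as in the solution above.

```python
def population_variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("variance is undefined for an empty data set")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n


scores = [5, 8, 6, 10, 12, 9, 11, 10, 12, 7]
print(population_variance(scores))  # 5.4
```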

STANDARD DEVIATION
Standard deviation is simply the square root of the variance. It measures the typical distance
between a score and the mean.
Standard deviation = √Variance
The standard deviation describes how spread out the values in a data set are around the mean.
Variance and standard deviation each come in two forms, one for a population and one for a sample.
Sum of Squares (SS):
The sum of squared deviation scores, or more simply the sum of squares, is symbolized by SS.
Sum of Squares (SS) Formulas for Population:
Sum of squares for population, definition formula:
SS = Σ(X − μ)²
Sum of squares for population, computation formula:
SS = ΣX² − (ΣX)² / N
Sum of Squares (SS) Formulas for Sample:
Sum of squares for sample, definition formula:
SS = Σ(X − X̄)²
Sum of squares for sample, computation formula:
SS = ΣX² − (ΣX)² / n
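
The definition and computation formulas are algebraically equivalent, so they should give the same SS for any data set. The following sketch, with hypothetical function names, checks this on the ten scores from Problem 2 of the variance section.

```python
def ss_definition(values):
    """Sum of squares from the definition formula: sum of (X - mean)^2."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def ss_computation(values):
    """Sum of squares from the computation formula: sum(X^2) - (sum X)^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


scores = [5, 8, 6, 10, 12, 9, 11, 10, 12, 7]
print(ss_definition(scores))   # 54.0
print(ss_computation(scores))  # 54.0
```

The only difference between the population and sample versions of SS is the symbol used for the mean and the count; the arithmetic is identical.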
Standard deviation for Population σ:
σ = √(SS / N)
Standard deviation for Sample s:
s = √(SS / (n − 1))

DEGREES OF FREEDOM (df)
Degrees of freedom (df) refers to the number of values that are free to vary, given one or more
mathematical restrictions, in a sample being used to estimate a population characteristic.
Formula: Degrees of freedom, df = n − 1
In fact, we can use degrees of freedom to rewrite the formulas for the sample variance and standard
deviation:
s² = SS / (n − 1) = SS / df
s = √(SS / (n − 1)) = √(SS / df)
where s² and s represent the sample variance and standard deviation, SS is the sum of
squares as given by either the definition or the computation formula for a sample, and df is
the degrees of freedom and equals n − 1.
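
A minimal sketch of the sample variance and standard deviation written in terms of df is shown below. The function name sample_variance is illustrative, and the ten scores from Problem 2 of the variance section are treated here as a sample (an assumption made only for this example). The result is cross-checked against statistics.variance from the Python standard library, which also divides by n − 1.

```python
import math
import statistics


def sample_variance(values):
    """Sample variance: sum of squares divided by degrees of freedom (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    df = n - 1
    return ss / df


scores = [5, 8, 6, 10, 12, 9, 11, 10, 12, 7]
s2 = sample_variance(scores)
s = math.sqrt(s2)
print(s2)           # 6.0
print(round(s, 3))  # 2.449

# Cross-check against the standard library's sample variance.
print(math.isclose(s2, statistics.variance(scores)))  # True
```

With SS = 54, dividing by df = 9 gives 6.0, compared with 5.4 when the same data are treated as a population and divided by N = 10.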
INTERQUARTILE RANGE (IQR)
The interquartile range (IQR) is simply the range for the middle 50 percent of the scores.
More specifically, the IQR equals the distance between the third quartile (or 75th percentile) and the first
quartile (or 25th percentile), that is, after the highest quarter (or top 25 percent) and the lowest quarter (or
bottom 25 percent) have been trimmed from the original set of scores.
To find the interquartile range (IQR), first arrange the scores in order and split them into a lower half and
an upper half (leaving out the middle score when the number of scores is odd). The median (middle value) of
the lower half is quartile 1 (Q1) and the median of the upper half is quartile 3 (Q3). The IQR is the
difference between Q3 and Q1.
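
The median-of-halves procedure can be sketched as follows. The function name is hypothetical, and the sketch assumes the convention of leaving the middle score out of both halves when the number of scores is odd; other quartile conventions, including some library defaults, may give slightly different values.

```python
import statistics


def iqr_median_of_halves(values):
    """Return (Q1, Q3, IQR) using the median of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("need at least two values to form two halves")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # lowest half of the scores
    upper = ordered[-half:]  # highest half; the middle score is skipped if n is odd
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return q1, q3, q3 - q1


retirement_ages = [60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63]
residence_changes = [1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 4]
print(iqr_median_of_halves(retirement_ages))    # (60, 65, 5)
print(iqr_median_of_halves(residence_changes))  # (1, 4, 3)
```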


MEASURES OF VARIABILITY FOR QUALITATIVE AND RANKED DATA

Qualitative Data
Measures of variability are virtually nonexistent for qualitative or nominal data. It is probably
adequate to note merely whether scores are evenly divided among the various classes (maximum
variability), unevenly divided among the various classes (intermediate variability), or concentrated mostly
in one class (minimum variability).
Ordered Qualitative and Ranked Data
If qualitative data can be ordered because measurement is ordinal (or if the data are ranked), then
it’s appropriate to describe variability by identifying extreme scores (or ranks).

Problems

1. Progress Check *4.8 Determine the values of the range and the IQR for the following
sets of data.
(a) Retirement ages: 60, 63, 45, 63, 65, 70, 55, 63, 60, 65, 63
(b) Residence changes: 1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 4
Ans:
(a) range = 25; IQR = 65 – 60 = 5
(b) range = 11; IQR = 4 – 1 = 3

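2. Using the definition formula for the sum of squares, calculate the
sample standard deviation for the following four scores: 1, 3, 4, 4.
Ans: mean = 12/4 = 3; SS = (1 − 3)² + (3 − 3)² + (4 − 3)² + (4 − 3)² = 6; s = √(SS / (n − 1)) = √(6/3) = √2 ≈ 1.41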

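3. Using the computation formula for the sum of squares, calculate
the population standard deviation for the scores in (a) and the sample standard deviation for
the scores in (b).
(a) 1, 3, 7, 2, 0, 4, 7, 3 (b) 10, 8, 5, 0, 1, 1, 7, 9, 2

Ans:
(a) ΣX = 27, ΣX² = 137; SS = 137 − 27²/8 = 45.875; σ = √(45.875/8) ≈ 2.39
(b) ΣX = 43, ΣX² = 325; SS = 325 − 43²/9 ≈ 119.56; s = √(119.56/8) ≈ 3.87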

4. Days absent from school for a sample of 10 first-grade children are:

8, 5, 7, 1, 4, 0, 5, 7, 2, 9.
a) Before calculating the standard deviation, decide whether the definitional or computational formula
would be more efficient. Why?
b) Use the more efficient formula to calculate the sample standard deviation.

Ans: (a) The computation formula, since the mean (48/10 = 4.8) is not a whole number.
(b) ΣX = 48, ΣX² = 314; SS = 314 − 48²/10 = 83.6; s = √(83.6/9) ≈ 3.05

5. As a first step toward modifying his study habits, Phil keeps daily
records of his study time.
(a) During the first two weeks, Phil’s mean study time equals 20 hours per week. If he studied
22 hours during the first week, how many hours did he study during the second week?
(b) During the first four weeks, Phil’s mean study time equals 21 hours. If he studied 22, 18,
and 21 hours during the first, second, and third weeks, respectively, how many hours did
he study during the fourth week?
(c) If the information in (a) and (b) is to be used to estimate some unknown population characteristic, the
notion of degrees of freedom can be introduced. How many degrees of
freedom are associated with (a) and (b)?
(d) Describe the mathematical restriction that causes a loss of degrees of freedom in (a) and (b).

Ans: (a) 18 hours
(b) 23 hours
(c) df = 1 in (a) and df = 3 in (b)
(d) When all observations are expressed as deviations from their mean, the sum of all
deviations must equal zero.
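
The restriction in (d) can be illustrated with a small sketch: once the mean and all but one of the values are known, the remaining value is not free to vary. The function name missing_value is an illustrative assumption.

```python
def missing_value(mean, n, known_values):
    """Return the one value that is fixed once the mean and the other n - 1 values are known."""
    if len(known_values) != n - 1:
        raise ValueError("exactly n - 1 values must be known")
    # The total must equal n * mean, so the last value is whatever is left over.
    return n * mean - sum(known_values)


print(missing_value(20, 2, [22]))          # 18  (part a)
print(missing_value(21, 4, [22, 18, 21]))  # 23  (part b)
```

In each case only n − 1 values can be chosen freely, which matches df = 1 in (a) and df = 3 in (b).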

Common questions

Population formulas for standard deviation and variance assume the data represent the entire population, so the sum of squared deviations is divided by N (the total number of data points). Sample formulas instead divide by n-1, the degrees of freedom, acknowledging that the data are a subset used to estimate a population characteristic and that the sample mean imposes a constraint. Dividing by n-1 makes the sample variance an unbiased estimator of the population variance and reduces the bias of the sample standard deviation. The choice between the formulas matters for statistical inference: applying the population formula (dividing by N) to sample data tends to underestimate the population variability, while applying the sample formula (dividing by n-1) to a complete population overstates it.

To estimate a population parameter from a sample statistic, begin by calculating the sample mean or variance. When estimating variance, divide the sum of squares by n-1, not n, to compensate for the constraint imposed by using the sample mean in the calculation; n-1 is the number of degrees of freedom. This adjustment keeps the sample variance an unbiased estimate of the population variance. More generally, degrees of freedom equal n minus the number of independent constraints placed on the data, such as a mean computed from them. For example, once the mean study time over several weeks is known, the last week's value is fixed by the others, so one degree of freedom is lost.

Choosing between the definition and computation formulas affects the ease and accuracy of hand calculation. The definition formula works directly with deviations from the mean, which makes the concept of variability clear. However, it is less efficient for large data sets or when the mean is not a whole number, because every deviation must be computed and squared separately. The computation formula uses the sum of the squared scores and the square of the sum, so it avoids the deviations entirely and reduces the chance of arithmetic errors. It is therefore preferred when the mean is not a whole number, as in the days-absent problem, where the mean is 4.8.

To determine the population standard deviation using the computation formula for the sum of squares (SS), sum the squares of the observations and subtract the square of the sum of the observations divided by N; this gives SS. Divide SS by N to obtain the variance, then take the square root to get the standard deviation. This approach handles large data sets efficiently and reduces the arithmetic errors common with the definition formula, especially when the mean is not a whole number. The result describes the typical deviation of data points from the mean, reflecting the overall spread of the data.
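
The steps in the paragraph above can be sketched as a single function. The name population_sd_computation is hypothetical, and the data are the ten scores from Problem 2 of the variance section, treated as a population.

```python
import math


def population_sd_computation(values):
    """Population standard deviation via the computation formula for SS."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation is undefined for an empty data set")
    sum_x = sum(values)                   # sum of the observations
    sum_x2 = sum(x * x for x in values)   # sum of the squared observations
    ss = sum_x2 - sum_x ** 2 / n          # SS = sum(X^2) - (sum X)^2 / N
    variance = ss / n                     # population variance
    return math.sqrt(variance)


scores = [5, 8, 6, 10, 12, 9, 11, 10, 12, 7]
print(round(population_sd_computation(scores), 3))  # 2.324
```

This agrees with the variance of 5.4 found for the same scores, since √5.4 ≈ 2.324.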

Finding a data set's standard deviation involves computing the square root of its variance. For population data, calculate each observation's deviation from the mean, square these deviations, sum them, and divide by the total number of observations (N). For sample data, follow the same steps but divide by n-1 instead of N, accounting for degrees of freedom. This adjustment corrects for the constraint imposed by the sample mean, so that the sample variance does not systematically underestimate the population variance. The choice between the two procedures depends on whether the data make up an entire population or a sample drawn from one.

Variability provides insight into how data points are scattered around a central value. The range simply shows the difference between the maximum and minimum values, offering a basic understanding of spread. The interquartile range (IQR) focuses on the middle 50% of the data, reducing the effect of outliers by measuring the range between the third quartile (Q3) and the first quartile (Q1). Standard deviation provides a more comprehensive measure, as it reflects the typical distance of each data point from the mean and so uses every value in the distribution. Each measure highlights different aspects of distribution spread: range is sensitive to outliers, IQR provides a more robust measure by excluding extreme values, and standard deviation offers a detailed view of data dispersion around the mean.

The sum of squares (SS) is used differently for population and sample standard deviation calculations because of the adjustment for degrees of freedom in sample estimates. SS itself is the sum of squared differences from the mean in both cases. For a population, the variance is SS divided by the number of observations (N). For a sample, the variance is SS divided by n-1, where n is the number of observations, because using the sample mean as an estimate costs one degree of freedom. This distinction keeps the sample variance from systematically underestimating the population variability.

The interquartile range (IQR) is effective for evaluating data sets with outliers because it measures the spread of the middle 50% of a data set, thus minimizing the influence of extreme values. To calculate IQR, identify the first quartile (Q1) and third quartile (Q3) of a data set, then subtract Q1 from Q3. For example, in the data set with residence changes: 1, 3, 4, 1, 0, 2, 5, 8, 0, 2, 3, 4, 7, 11, 0, 2, 3, 4, the IQR is Q3 (4) minus Q1 (1), resulting in an IQR of 3. This measure of variability is unaffected by extreme values such as 11, offering a robust assessment of dispersion.

Degrees of freedom refer to the number of independent values that can vary in an analysis after certain constraints are applied. They are crucial when estimating population parameters from sample statistics because they correct the bias in variance estimates. For sample variance and standard deviation, degrees of freedom are calculated as n-1 to account for the fact that the sample mean, which is used in these calculations, is itself an estimate that imposes a constraint. This adjustment makes the sample variance an unbiased estimator of the population variance and prevents systematic underestimation of variability.

The range of the data set {1, 3, 7, 2, 0, 4, 7, 3} is calculated by identifying the maximum and minimum values and subtracting the latter from the former. Here, the range is 7 (max) - 0 (min) = 7. This statistic provides a simple view of data spread, indicating the total extent of variability. Unlike the interquartile range or standard deviation, the range says nothing about how the data points are distributed between the extremes or how far they deviate from the mean, and because it depends only on the two most extreme values, it is highly sensitive to outliers.
