Descriptive Statistics: Central Tendency & Dispersion

Descriptive statistics

Uploaded by

bongani mungadze

DESCRIPTIVE STATISTICS-MEASURES OF CENTRAL TENDENCY AND DISPERSION

INTRODUCTION

This unit focuses on descriptive statistics, which are used to describe the basic features of the data in a
study. They provide simple summaries about the sample and the measures. Together with the simple graphical
analysis studied in unit 2, they form the basis of virtually every quantitative analysis of data.

Descriptive statistics are used to present quantitative descriptions in a manageable form. In a research
study we may have many measures. Descriptive statistics help us to simplify large amounts of data in a
sensible way. Each descriptive statistic reduces a large amount of data to a simpler summary. In this unit
we are going to concentrate on the measures of central tendency (mean, mode and median) and the measures of
dispersion (range, standard deviation and coefficient of variation) as measures that provide a summary of
any given quantitative data.

MEASURES OF CENTRAL TENDENCY

The central tendency of a distribution is an estimate of the "centre" of a distribution of values. There are
three major types of estimates of central tendency: the mean, the mode and the median.

The Mean or average, denoted by $\bar{x}$, is probably the most commonly used method of describing central
tendency. It lends itself to subsequent analysis because it includes all values in the data set, but it may
not coincide with any observed value and in certain instances may be unrepresentative because of extreme
values. We compute the mean by adding up all the values and then dividing by the number of values.

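We have two formulas that are used to compute the mean, one for ungrouped data and one for grouped data.

In mathematical terms, the general formula is denoted by:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where n is the sample size and the $x_i$ correspond to the observed values.

For ungrouped data, the formula is:

$$\bar{x} = \frac{\sum x}{n}$$

For grouped data, the formula is:

$$\bar{x} = \frac{\sum fx}{\sum f}$$

where n = number of values, f = number of values in a class interval and x = midpoint of the class interval.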

Example

1. Consider the yields obtained by a farmer for his maize enterprise over the past 10 years:

Season       Yield of maize (tonnes)
2000-2001    16
2001-2002    13
2002-2003    25
2003-2004    24
2004-2005    18
2005-2006    18
2006-2007    12
2007-2008    15
2008-2009    19
2009-2010    26
Total        186

$$\bar{x} = \frac{\sum x}{n} = \frac{16+13+25+24+18+18+12+15+19+26}{10} = \frac{186}{10} = 18.6 \text{ tonnes}$$
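As a quick check of the arithmetic, the calculation can be sketched in Python. The variable names (`yields`, `total`, `mean_by_formula`) are illustrative choices rather than part of the original notes; the values are the ten seasonal yields from the table.

```python
from statistics import mean

# Maize yields in tonnes for the ten seasons 2000-2001 to 2009-2010
yields = [16, 13, 25, 24, 18, 18, 12, 15, 19, 26]

total = sum(yields)             # sum of x
n = len(yields)                 # number of values
mean_by_formula = total / n     # sum of x divided by n

print(total, n, mean_by_formula)   # 186 10 18.6
print(mean(yields))                # the standard library gives the same value
```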

The mode is the most frequently occurring value in the set of scores. To determine the mode, order the
yields shown in the table above and then count how often each one occurs:

12, 13, 15, 16, 18, 18, 19, 24, 25, 26

The most frequently occurring value is the mode. In our example, the value 18 occurs twice and is therefore
the mode. In some distributions there is more than one modal value. For instance, in a bimodal distribution
there are two values that occur most frequently. If the distribution is truly normal (i.e., bell-shaped), the
mean, median and mode are all equal to each other.
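The counting procedure for the mode can be sketched as follows. This is only an illustration: the names `counts`, `highest` and `modes` are assumptions, and the small bimodal list at the end is made-up data used to show the case of two modal values.

```python
from collections import Counter
from statistics import multimode

yields = [16, 13, 25, 24, 18, 18, 12, 15, 19, 26]

# Count how often each value occurs, then keep the value(s) with the highest count
counts = Counter(yields)
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)
print(modes)                    # [18]

# statistics.multimode (Python 3.8+) returns every modal value
print(multimode(yields))        # [18]

# Hypothetical bimodal data: two values tie for the highest count
bimodal = [1, 1, 2, 2, 3]
print(multimode(bimodal))       # [1, 2]
```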

The Median is the score found at the exact middle of the set of values. More precisely, the median is the
middle value in order of size if n is odd, or the mean of the two middle values if n is even. The median is
more representative than the mean when the data contain a few very large or very small values, although,
unlike the mean, it cannot be used for subsequent calculation. One way to compute the median is to list all
scores in numerical order and then locate the score in the centre of the sample. For example, if there are
499 scores in the list, score #250 would be the median; if there are 500, the median is the mean of scores
#250 and #251. If we order the 10 yields shown above, we would get:

12, 13, 15, 16, 18, 18, 19, 24, 25, 26

There are 10 scores, and scores #5 and #6 represent the halfway point. Since both of these scores are 18, the
median is 18. If the two middle scores had different values, you would take their mean to determine the
median.
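The odd/even rule for the median can be written out directly. The helper `median_by_rule` below is a hypothetical name introduced for illustration; the standard library function `statistics.median` applies the same rule.

```python
from statistics import median

def median_by_rule(values):
    """Middle value if n is odd, mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

yields = [16, 13, 25, 24, 18, 18, 12, 15, 19, 26]

print(median_by_rule(yields))   # 18.0 (scores #5 and #6 are both 18)
print(median(yields))           # 18.0
```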

Example

The dairy herd was weighed and the results were tabulated in the table below:

Live-weight (kg)   Frequency
150-154             8
155-159            16
160-164            43
165-169            29
170-174             4

Calculate the mean weight of the herd.

Working:

1. Find the midpoint of each weight category (x), where x = (lower limit + upper limit) / 2.

2. Multiply each midpoint by the frequency of its category (fx).

3. Sum all the products fx to obtain Σfx.

4. Divide by the total frequency (Σf).

This can be summarised in tabular form below:

Live-weight (kg)   Midpoint (x)   Frequency (f)   fx
150-154            152              8              1 216
155-159            157             16              2 512
160-164            162             43              6 966
165-169            167             29              4 843
170-174            172              4                688
Total                             100             16 225

Mean of grouped data:

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{16\,225}{100} = 162.25 \text{ kg}$$

The mean weight of the dairy herd was found to be 162.25 kg.
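The four working steps can be sketched in Python as follows. The tuple layout and the names `classes`, `sum_f` and `sum_fx` are illustrative assumptions; the class limits and frequencies come from the table, taking the fourth class as 165-169. Run on those figures, the sketch gives Σfx = 16 225 and a mean of 162.25 kg.

```python
# Each class is (lower limit, upper limit, frequency), live-weight in kg
classes = [
    (150, 154, 8),
    (155, 159, 16),
    (160, 164, 43),
    (165, 169, 29),
    (170, 174, 4),
]

# Step 1: midpoint of each class, x = (lower + upper) / 2
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

# Steps 2 and 3: multiply each midpoint by its frequency and sum the products
products = [f * x for f, x in zip(freqs, midpoints)]
sum_fx = sum(products)

# Step 4: divide by the total frequency
sum_f = sum(freqs)
grouped_mean = sum_fx / sum_f

print(products)                     # [1216.0, 2512.0, 6966.0, 4843.0, 688.0]
print(sum_f, sum_fx, grouped_mean)  # 100 16225.0 162.25
```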

MEASURES OF DISPERSION.

Dispersion refers to the spread, variation or scatter of the values around the central tendency. It is vital
for:

i) Assessing the reliability of the averages of the data.

ii) Serving as a basis for the control of variability, e.g. in quality control, which assesses variation in
the products.

There are two common measures of dispersion, the range and the standard deviation.

The range is the simplest measure of dispersion. It is calculated by taking the difference between the
maximum and minimum values in the data set. However, the range only provides information about the maximum
and minimum values and says nothing about the values in between.
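A minimal sketch of the range, using the maize yields from the earlier example. The extra value of 60 tonnes is a made-up outlier, added only to show how a single extreme value changes the range.

```python
yields = [16, 13, 25, 24, 18, 18, 12, 15, 19, 26]

# Range = maximum value minus minimum value
data_range = max(yields) - min(yields)
print(data_range)               # 14

# Hypothetical outlier: one extreme season more than triples the range
with_outlier = yields + [60]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)            # 48
```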

The Standard Deviation is a more accurate and detailed estimate of dispersion because it uses every value in
the data set, whereas a single outlier can greatly exaggerate the range. The Standard Deviation shows the
relation that a set of scores has to the mean of the sample.

Different formulas are used to compute the standard deviation.

For grouped data:

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$

Or alternatively it can be given as:

$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$$

where

x is the variable (the class midpoint for grouped data)

f is the frequency of responses
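As an illustration, the grouped-data formula √(Σfx²/Σf − (Σfx/Σf)²) can be applied to the dairy herd table and compared with the direct definition √(Σf(x − x̄)²/Σf), which is algebraically equivalent. The variable names are assumptions made for this sketch, and the fourth class is taken as 165-169 (midpoint 167).

```python
from math import sqrt, isclose

# Class midpoints (kg) and frequencies from the dairy herd example
midpoints = [152, 157, 162, 167, 172]
freqs = [8, 16, 43, 29, 4]

sum_f = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
grouped_mean = sum_fx / sum_f

# Definition: root of the frequency-weighted mean squared deviation
sd_definition = sqrt(
    sum(f * (x - grouped_mean) ** 2 for f, x in zip(freqs, midpoints)) / sum_f
)

# Computational form: mean of squares minus square of the mean
sd_shortcut = sqrt(sum_fx2 / sum_f - (sum_fx / sum_f) ** 2)

print(sum_fx2)                                      # 2634825
print(round(sd_definition, 4), round(sd_shortcut, 4))
print(isclose(sd_definition, sd_shortcut))          # True
```

Both forms divide by Σf, so they describe the spread of the tabulated herd itself; dividing by Σf − 1 instead would give the sample version used in the step-by-step calculation for ungrouped data.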

VARIANCE

The variance is defined as the sum of squared deviations from the mean divided by n - 1, where n is the
number of items in the sample. The general formula is given as:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

3.3.5 VARIANCE AND STANDARD DEVIATION:

Step by Step Simple calculation:

a. Calculate the mean, $\bar{x}$.

b. Write a table that subtracts the mean from each observed value.
c. Square each of the differences.
d. Add this column.
e. Divide by n - 1, where n is the number of items in the sample. This is the variance.
f. To get the standard deviation, take the square root of the variance.

Although this computation may seem convoluted, it's actually quite simple.

The table below summarises the steps above for an ungrouped data set of eight values whose mean is 20.875:

x      x - 20.875    (x - 20.875)²
15     -5.875         34.515625
20     -0.875          0.765625
21      0.125          0.015625
20     -0.875          0.765625
36     15.125        228.765625
15     -5.875         34.515625
25      4.125         17.015625
15     -5.875         34.515625
Total                Σ = 350.875

Now,

$$s^2 = \frac{350.875}{8 - 1} = 50.125$$

$$s = \sqrt{50.125} \approx 7.0799$$
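Steps a to f can be traced in a few lines of Python. The list `data` holds the eight values from the x column of the table; the other names are illustrative. The last two lines use the standard library functions `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
from math import sqrt
from statistics import variance, stdev

data = [15, 20, 21, 20, 36, 15, 25, 15]
n = len(data)

data_mean = sum(data) / n                       # a. mean
deviations = [x - data_mean for x in data]      # b. subtract the mean
squared = [d ** 2 for d in deviations]          # c. square each difference
sum_sq = sum(squared)                           # d. add the column
s2 = sum_sq / (n - 1)                           # e. divide by n - 1: variance
s = sqrt(s2)                                    # f. square root: standard deviation

print(data_mean, sum_sq, s2)    # 20.875 350.875 50.125
print(round(s, 4))              # 7.0799
print(variance(data), round(stdev(data), 4))
```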

The standard deviation allows some conclusions about specific scores in our distribution.

Assuming that the distribution of scores is normal or bell-shaped (or close to it!), the following
conclusions can be reached:

• approximately 68% of the scores in the sample fall within one standard deviation of the mean
• approximately 95% of the scores in the sample fall within two standard deviations of the mean
• approximately 99.7% of the scores in the sample fall within three standard deviations of the mean

For instance, since the mean in our example is 20.875 and the standard deviation is 7.0799, we can estimate
from the above that approximately 95% of the scores will fall in the range of 20.875 - (2 × 7.0799) to
20.875 + (2 × 7.0799), or between 6.7152 and 35.0348. This kind of information is a critical stepping stone
to comparing the performance of an individual on one variable with their performance on another, even when
the variables are measured on entirely different scales.
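The intervals implied by these rules can be computed directly from the mean and standard deviation. The helper `interval` is a hypothetical name used only for this sketch, and the percentages are the usual approximate figures for a normal distribution (about 68%, 95% and 99.7%, the last often rounded to 99%).

```python
def interval(mean, sd, k):
    """Return (lower, upper) bounds of mean plus or minus k standard deviations."""
    return mean - k * sd, mean + k * sd

mean_score = 20.875
sd_score = 7.0799

# Approximate share of scores inside each interval, assuming a normal distribution
for k, share in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    lower, upper = interval(mean_score, sd_score, k)
    print(f"about {share} of scores between {lower:.4f} and {upper:.4f}")

lower_2sd, upper_2sd = interval(mean_score, sd_score, 2)
```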

Standard deviation of grouped data:

$$s = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$$

The sample standard deviation will be denoted by s and the population standard deviation will be denoted
by the Greek letter σ (sigma).

The sample variance will be denoted by s² and the population variance will be denoted by σ².

The variance and standard deviation describe how spread out the data is. If the data all lies close to the
mean, then the standard deviation will be small, while if the data is spread out over a large range of
values, s will be large. Having outliers will increase the standard deviation.

One of the flaws of the standard deviation is that it depends on the units that are used. One way of
handling this difficulty is the coefficient of variation (CV), which is the standard deviation divided by
the mean, times 100%:

$$CV = \frac{s}{\bar{x}} \times 100\%$$

In the above example, with s = 7.0799 and a mean of 20.875, it is

$$CV = \frac{7.0799}{20.875} \times 100\% \approx 33.9\%$$
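A short sketch of the coefficient of variation, applied to the eight-value data set from the variance table (mean 20.875, s ≈ 7.0799), for which it comes to about 33.9%. The function name `coefficient_of_variation` is an assumption for this example. The second call rescales every value by 1000, as a change of units would, to show that the CV does not depend on the units while the standard deviation does.

```python
from math import isclose
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

data = [15, 20, 21, 20, 36, 15, 25, 15]
cv = coefficient_of_variation(data)
print(round(cv, 1))             # 33.9

# Same data in units 1000 times smaller: s changes, CV does not
rescaled = [x * 1000 for x in data]
cv_rescaled = coefficient_of_variation(rescaled)
print(round(stdev(rescaled), 1), round(cv_rescaled, 1))
print(isclose(cv, cv_rescaled)) # True
```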

CONCLUSION

Measures of central tendency (or location) are estimates of the centre of a distribution of values. These
are the mean, the mode and the median.

The mean is the most commonly used measure of location, and it lends itself to subsequent analysis since it
includes all values.

Dispersion measures the variation or scatter of values around the central tendency. The measures of
dispersion include the standard deviation, the variance, the coefficient of variation and the range.

The standard deviation is a more accurate and detailed estimate of dispersion than the range, and it shows
the relation that a set of values has to the mean.

Formulae

Measures of central tendency

Mean (ungrouped data): $\bar{x} = \frac{\sum x}{n}$
Mean (grouped data): $\bar{x} = \frac{\sum fx}{\sum f}$
Mode: the most frequently occurring value in the set of scores
Median (ungrouped data): the middle value in order of size if n is odd, or the mean of the two middle
values if n is even

Measures of dispersion

Range: highest value minus lowest value
Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard deviation: $s = \sqrt{s^2}$
Coefficient of variation: $CV = \frac{s}{\bar{x}} \times 100\%$

3.8 ACTIVITY
1. State the three measures of central tendency.
2. State the most important measure of location and give a reason or reasons.
3. State the formula for the mean of grouped data.
4. Define the term dispersion and give the formula for the standard deviation.
5. Evaluate the relationships that exist between the standard deviation, variance, mean and coefficient
of variation.
6. Find the mean, mode, median, standard deviation and relative dispersion of the following data,
which is the maize height distribution in a field:

Height     Frequency
153-157     4
158-162    11
163-167    20
168-172    24
173-177    17
178-182     4
