
Chapter 3

The document provides an introduction to the measures of central tendency in statistics, focusing on the mean, median, and mode. It details how to calculate these measures using both raw and grouped data, along with examples for clarity. Additionally, it discusses the properties and disadvantages of the arithmetic mean.

Uploaded by

Seid


College of Natural and Computational Science

Department of Statistics

Introduction to Statistics
(Department: IT)

November 2024
© Seid A. (MSc.), Tepi, Ethiopia
3. Summarization of Data
3.1 Measures of Central Tendency

• A central objective of statistical analysis is to find a single value that represents the entire mass of data.

• Such a value tells us where the center of the distribution of the data is located.

The most commonly used measures of central tendency are:

• The Mean (arithmetic mean, weighted mean, geometric mean and harmonic mean)
• The Median
• The Mode
• Quantiles (quartiles, deciles and percentiles)
3.1.1 Types of Measures of Central Tendency
The Mean
1. Arithmetic mean
• The arithmetic mean of a sample is the sum of all the observations divided by the number of observations in the sample, i.e.

$$\text{sample mean (arithmetic mean)} = \frac{\text{the sum of all values in the sample}}{\text{number of values in the sample}}$$

• Suppose that $x_1, x_2, \dots, x_n$ are the $n$ observed values in a sample of size $n$ taken from a population of size $N$.
• Then the arithmetic mean of the sample, denoted by $\bar{x}$, is given by

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{(for a sample)}$$

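• In general, the sample arithmetic mean is calculated by

$$\bar{x} = \begin{cases} \dfrac{\sum_{i=1}^{n} x_i}{n} & \text{for raw data} \\[2ex] \dfrac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} & \text{for an ungrouped frequency distribution, where } \sum f_i = n \\[2ex] \dfrac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i} & \text{for grouped data, where } m_i \text{ is the midpoint (class mark) of class } i \text{ and } \sum f_i = n \end{cases}$$
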
Example 1 (raw data)
The net weights of five perfume bottles selected at random from the production line are 85.4, 85.3, 84.9, 85.4 and 85. What is the arithmetic mean weight of the sample observations?

• Solution: Given $n = 5$, with $x_1 = 85.4$, $x_2 = 85.3$, $x_3 = 84.9$, $x_4 = 85.4$ and $x_5 = 85$,

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{85.4 + 85.3 + 84.9 + 85.4 + 85}{5} = \frac{426}{5} = 85.2$$

Example 2 (ungrouped frequency distribution)
Calculate the mean of the marks of 46 students given below.

Marks ($x_i$)       9   10   11   12   13   14   15   16   17   18
Frequency ($f_i$)   1   2    3    6    10   11   7    3    2    1

Solution: $\sum f_i = n = 46$ is the sum of the frequencies, i.e. the total number of observations, and $\sum_{i=1}^{k} f_i x_i = 623$. Therefore

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i} = \frac{623}{46} \approx 13.54$$

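Example 3 (grouped frequency distribution)
The net income of a sample of large importers of urea was organized into the following table. What is the arithmetic mean of net income?

Net income            2-4   5-7   8-10   11-13   14-16
Number of importers   1     4     10     3       2
Midpoint ($m_i$)      3     6     9      12      15

Solution: $\sum f_i = n = 20$ is the sum of the frequencies, i.e. the total number of observations. To calculate $\sum_{i=1}^{k} f_i m_i$, first find the class midpoints $m_i$ (third row of the table). Then $\sum_{i=1}^{k} f_i m_i = 1(3) + 4(6) + 10(9) + 3(12) + 2(15) = 183$, so

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i} = \frac{183}{20} = 9.15$$
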
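The three mean calculations above (raw data, ungrouped frequency distribution, grouped data) can be checked with a short Python sketch. The function names `mean_raw`, `mean_frequency` and `class_midpoints` are illustrative choices, not part of the original slides; the data are those of Examples 1 to 3.

```python
def mean_raw(values):
    """Arithmetic mean of raw data: sum of the values over their count."""
    return sum(values) / len(values)


def mean_frequency(points, freqs):
    """Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i).

    For grouped data, pass the class midpoints m_i as `points`.
    """
    return sum(f * x for x, f in zip(points, freqs)) / sum(freqs)


def class_midpoints(class_limits):
    """Midpoint (class mark) of each (lower limit, upper limit) pair."""
    return [(lower + upper) / 2 for lower, upper in class_limits]


# Example 1: net weights of five perfume bottles (raw data)
weights = [85.4, 85.3, 84.9, 85.4, 85]
print(round(mean_raw(weights), 2))  # 85.2

# Example 2: marks of 46 students (ungrouped frequency distribution)
marks = [9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
mark_freqs = [1, 2, 3, 6, 10, 11, 7, 3, 2, 1]
print(round(mean_frequency(marks, mark_freqs), 2))  # 13.54

# Example 3: net income of 20 importers (grouped data)
income_classes = [(2, 4), (5, 7), (8, 10), (11, 13), (14, 16)]
importer_freqs = [1, 4, 10, 3, 2]
midpoints = class_midpoints(income_classes)  # [3.0, 6.0, 9.0, 12.0, 15.0]
print(round(mean_frequency(midpoints, importer_freqs), 2))  # 9.15
```

The grouped case reuses `mean_frequency` because the only difference is that each class is represented by its midpoint.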
Combined mean

• If $\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k$ are the arithmetic means of $k$ groups of observations on the same variable, measured in the same unit, with sizes $n_1, n_2, \dots, n_k$ respectively, then the combined mean of all the observations taken together can be computed from the individual means by

$$\bar{x}_{com} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$

Example 1:
Compute the combined mean for the following two sets.

Set A: 1, 4, 12, 2, 8 and 6; Set B: 3, 6, 2, 7 and 4.

Solution:

$$n_1 = 6, \quad \bar{x}_1 = \frac{\sum_{i=1}^{6} x_i}{n_1} = \frac{33}{6} = 5.5; \qquad n_2 = 5, \quad \bar{x}_2 = \frac{\sum_{i=1}^{5} x_i}{n_2} = \frac{22}{5} = 4.4$$

$$\bar{x}_{com} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{6 \times 5.5 + 5 \times 4.4}{6 + 5} = \frac{55}{11} = 5$$

Exercise: The mean weight of 150 students in a certain class is 60 kg. The mean weight of the boys in the class is 70 kg and that of the girls is 55 kg. Find the number of boys and girls in the class. Answer: $n_1 = 50$ (boys) and $n_2 = 100$ (girls).
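As a rough check of the combined-mean formula and of the exercise answer, the following Python sketch may help. The helper names `combined_mean` and `split_two_groups` are assumptions made for illustration. The second helper rearranges the combined-mean formula for two groups, $n_1 = n(\bar{x}_{com} - \bar{x}_2)/(\bar{x}_1 - \bar{x}_2)$, which follows from $n_1 \bar{x}_1 + (n - n_1)\bar{x}_2 = n \bar{x}_{com}$.

```python
def combined_mean(sizes, means):
    """Combined mean of k groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def split_two_groups(total_n, overall_mean, mean_1, mean_2):
    """Group sizes (n1, n2) given the total size, overall mean and two group means.

    Solves n1 * mean_1 + (total_n - n1) * mean_2 = total_n * overall_mean for n1.
    Assumes mean_1 != mean_2.
    """
    n1 = total_n * (overall_mean - mean_2) / (mean_1 - mean_2)
    return n1, total_n - n1


# Example 1: Set A (n = 6, mean 5.5) and Set B (n = 5, mean 4.4)
print(round(combined_mean([6, 5], [5.5, 4.4]), 2))  # 5.0

# Exercise: 150 students, overall mean 60 kg, boys 70 kg, girls 55 kg
boys, girls = split_two_groups(150, 60, 70, 55)
print(boys, girls)  # 50.0 100.0
```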

Properties of arithmetic mean
1. The algebraic sum of the deviations of each value ($x_i$) from the mean ($\bar{x}$) is equal to zero. That is,

$$\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = n\bar{x} - n\bar{x} = 0$$

For example, the mean of 3, 8 and 4 is 5. Then $\sum_{i=1}^{3} (x_i - \bar{x}) = (3 - 5) + (8 - 5) + (4 - 5) = 0$.

5. The mean is sensitive to extreme values.

6. Uniqueness: the mean of any set of data is unique.

7. It is suitable for further statistical treatment, for example:

• Comparison of means.

• Tests on means.

Disadvantages of the arithmetic mean

1. The mean is meaningless in the case of nominal or qualitative data.

2. In the case of grouped data, if any class interval is open-ended, the arithmetic mean cannot be calculated, since the class mark of that interval cannot be found.
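The Median ($\tilde{x}$)

• The median is the middle value in the sorted list of observations. We denote it by $\tilde{x}$.

• Let $x_1, x_2, \dots, x_n$ be $n$ observations arranged in increasing order. Then the median is given by:

$$\tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & \text{if } n \text{ is odd} \\[2ex] \dfrac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} & \text{if } n \text{ is even} \end{cases}$$
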
Example 1:
Find the median for the following data:
23, 16, 31, 77, 21, 14, 32, 6, 155, 9, 36, 24, 5, 27, 19
Solution: First arrange the given data in increasing order, that is,
5, 6, 9, 14, 16, 19, 21, 23, 24, 27, 31, 32, 36, 77 and 155.
Since $n = 15$ is odd, $\tilde{x} = x_{\frac{n+1}{2}} = x_{\frac{15+1}{2}} = x_8 = 23$.

Example 2:
Find the median for the following data.

61, 62, 63, 64, 64, 60, 65, 61, 63, 64, 65, 66, 64, 63
Solution: First arrange the given data in increasing order, that is,
60, 61, 61, 62, 63, 63, 63, 64, 64, 64, 64, 65, 65 and 66.
Since $n = 14$ is even,

$$\tilde{x} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} = \frac{x_7 + x_8}{2} = \frac{63 + 64}{2} = \frac{127}{2} = 63.5$$

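The odd and even cases of the median rule can be sketched in Python as follows. The function name `median` is an illustrative choice; note that Python lists are indexed from 0, so the slide's $x_{(n+1)/2}$ corresponds to index `n // 2` when $n$ is odd.

```python
def median(values):
    """Median of raw data: middle value (odd n) or mean of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # x_((n+1)/2) in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2   # (x_(n/2) + x_(n/2 + 1)) / 2


# Example 1 (n = 15, odd)
data_1 = [23, 16, 31, 77, 21, 14, 32, 6, 155, 9, 36, 24, 5, 27, 19]
print(median(data_1))  # 23

# Example 2 (n = 14, even)
data_2 = [61, 62, 63, 64, 64, 60, 65, 61, 63, 64, 65, 66, 64, 63]
print(median(data_2))  # 63.5
```

The extreme value 155 in the first data set does not move the median, which illustrates why the median is described as insensitive to extreme values.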
The formula for computing the median for grouped data is given by

$$\text{median} = \tilde{x} = lcb_{\tilde{x}} + \frac{\left(\frac{n}{2} - lcf_p\right) \times w}{f_m}$$

where:

• $lcb_{\tilde{x}}$ is the lower class boundary (lcb) of the median class.
• $n$ is the total number of observations.
• $f_m$ is the frequency of the median class.
• $w$ is the width of the median class. Recall that $w = ucb - lcb$.
• $lcf_p$ is the less-than cumulative frequency (lcf) of the class immediately preceding the median class.

The class corresponding to the smallest lcf which is $\geq \frac{n}{2}$ is called the median class, so the median lies in this class.

Example
Find the median for the following data.

Daily production   80-89   90-99   100-109   110-119   120-129   130-139
Frequency          5       9       20        8         6         2

Solution: First construct the lcf table.

Daily production (CI)   80-89   90-99   100-109   110-119   120-129   130-139
Frequency ($f_i$)       5       9       20        8         6         2
lcf                     5       14      34        42        48        50

• To obtain the median class, calculate $\frac{n}{2} = \frac{50}{2} = 25$. The smallest lcf which is $\geq \frac{n}{2}$ is 34, so the class corresponding to this lcf, 100-109, is the median class.
• Therefore, $lcb_{\tilde{x}} = 99.5$, $w = 10$, $f_m = 20$ and $lcf_p = 14$, so

$$\text{median} = \tilde{x} = lcb_{\tilde{x}} + \frac{\left(\frac{n}{2} - lcf_p\right) \times w}{f_m} = 99.5 + \frac{(25 - 14) \times 10}{20} = 105$$

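The grouped-median procedure (build the lcf column, locate the median class, apply the formula) can be sketched in Python as below. The function name `grouped_median` and its signature are assumptions for illustration; it expects the lower class boundaries (79.5, 89.5, ...) rather than the class limits, and assumes a common class width.

```python
from itertools import accumulate


def grouped_median(lower_boundaries, width, freqs):
    """Median of grouped data: lcb + (n/2 - lcf_p) * w / f_m."""
    n = sum(freqs)
    lcf = list(accumulate(freqs))                 # less-than cumulative frequencies
    # Median class: the first class whose lcf is >= n/2
    m = next(i for i, c in enumerate(lcf) if c >= n / 2)
    lcf_p = lcf[m - 1] if m > 0 else 0            # lcf of the preceding class
    return lower_boundaries[m] + (n / 2 - lcf_p) * width / freqs[m]


# Daily production example: classes 80-89, 90-99, ..., 130-139
lower_boundaries = [79.5, 89.5, 99.5, 109.5, 119.5, 129.5]
freqs = [5, 9, 20, 8, 6, 2]
print(list(accumulate(freqs)))                      # [5, 14, 34, 42, 48, 50]
print(grouped_median(lower_boundaries, 10, freqs))  # 105.0
```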
Properties of the median

1. The median is unique.

2. It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.

3. It is not affected by extremely large or small values.

4. It can be computed for ratio-level, interval-level and ordinal-level data.

The Mode ($\hat{x}$)
• In everyday speech, something is "in the mode" if it is fashionable or popular.
• In statistics this "popularity" refers to the frequency of observations.
• Therefore, the mode is the most frequently observed value in a set of observations.

Example:
Set A: 10, 10, 9, 8, 5, 4, 5, 12, 10; mode = 10 (unimodal).

Set B: 10, 10, 9, 9, 8, 12, 15, 5; mode = 9 and 10 (bimodal).

Set C: 4, 6, 7, 15, 12, 9; no mode.

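The three cases above (unimodal, bimodal, no mode) can be reproduced with a small Python sketch. The function name `modes` is illustrative, and the rule "no mode when every distinct value occurs equally often" is an assumption that matches Set C; other conventions exist.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means 'no mode'.

    Assumption: if there are several distinct values and all occur
    equally often, the data set is treated as having no mode.
    """
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([10, 10, 9, 8, 5, 4, 5, 12, 10]))  # [10]      unimodal
print(modes([10, 10, 9, 9, 8, 12, 15, 5]))     # [9, 10]   bimodal
print(modes([4, 6, 7, 15, 12, 9]))             # []        no mode
```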
Mode for grouped data
• If the data are grouped so that we are given a frequency distribution with finite class intervals, we do not know the value of every item, but we can easily determine the class with the highest frequency.

• The modal class is the class with the highest frequency, so the mode of the distribution lies in this class.

To compute the mode for grouped data we use the formula:

$$\text{mode} = \hat{x} = lcb_{\hat{x}} + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w, \quad \text{where } \Delta_1 = f_m - f_p, \ \Delta_2 = f_m - f_s$$

where:
• $lcb_{\hat{x}}$ is the lower class boundary (lcb) of the modal class.
• $f_m$ is the frequency of the modal class.
• $f_p$ is the frequency of the class preceding the modal class.
• $f_s$ is the frequency of the class succeeding the modal class.
• $w$ is the width of the modal class.
Example:
The ages of newly hired, unskilled employees are grouped into the following distribution. Compute the modal age.

Ages      18-20   21-23   24-26   27-29   30-32
Numbers   4       8       11      20      7

Solution: First we determine the modal class. The modal class is 27-29, since it has the highest frequency. Thus, $lcb_{\hat{x}} = 26.5$, $w = 3$, $\Delta_1 = 20 - 11 = 9$ and $\Delta_2 = 20 - 7 = 13$, so

$$\text{mode} = \hat{x} = lcb_{\hat{x}} + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w = 26.5 + \frac{9}{9 + 13} \times 3 = 26.5 + \frac{27}{22} \approx 26.5 + 1.2 = 27.7$$

• The most common age among these newly hired employees is about 27.7 years (roughly 27 years and 8 to 9 months).

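The grouped-mode formula can be sketched in Python as below. The function name `grouped_mode` is an illustrative choice, and treating a missing neighbouring class (when the modal class is first or last) as having frequency 0 is an assumption, not something stated in the slides.

```python
def grouped_mode(lower_boundaries, width, freqs):
    """Mode of grouped data: lcb + d1 / (d1 + d2) * w."""
    # Modal class: the (first) class with the highest frequency
    m = max(range(len(freqs)), key=freqs.__getitem__)
    f_m = freqs[m]
    f_p = freqs[m - 1] if m > 0 else 0               # preceding class (assumed 0 if none)
    f_s = freqs[m + 1] if m + 1 < len(freqs) else 0  # succeeding class (assumed 0 if none)
    d1, d2 = f_m - f_p, f_m - f_s
    return lower_boundaries[m] + d1 / (d1 + d2) * width


# Ages example: classes 18-20, 21-23, 24-26, 27-29, 30-32
age_boundaries = [17.5, 20.5, 23.5, 26.5, 29.5]
age_freqs = [4, 8, 11, 20, 7]
print(round(grouped_mode(age_boundaries, 3, age_freqs), 1))  # 27.7
```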
Example 2: The following table shows the distribution of a group of families according to their expenditure per week. The median and the mode of the distribution are known to be 25.50 Birr and 24.50 Birr respectively. Two frequency values are, however, missing from the table. Find the missing frequencies.

Class interval   1-10   11-20   21-30   31-40   41-50
Frequency        14     $f_2$   27      $f_4$   15
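The slides give no solution for this example. One possible way to find the missing frequencies, offered here as a sketch rather than as the author's worked answer, is to note that both 25.50 and 24.50 lie in the class 21-30 (boundaries 20.5 to 30.5), so that class serves as both the median class and the modal class, and then to search for integer values of $f_2$ and $f_4$ that satisfy the two grouped-data formulas. The function name `missing_frequencies` and the search limit `max_freq` are assumptions for illustration.

```python
def missing_frequencies(target_median=25.5, target_mode=24.5, max_freq=100):
    """Search integer pairs (f2, f4) that reproduce the stated median and mode.

    Class 21-30 (lcb = 20.5, w = 10, f_m = 27) is taken to be both
    the median class and the modal class.
    """
    lcb, width, f_m = 20.5, 10, 27
    solutions = []
    for f2 in range(max_freq + 1):
        for f4 in range(max_freq + 1):
            n = 14 + f2 + f_m + f4 + 15
            lcf_p = 14 + f2                   # lcf of the class preceding 21-30
            d1, d2 = f_m - f2, f_m - f4       # Delta_1 and Delta_2
            if d1 < 0 or d2 < 0 or d1 + d2 == 0:
                continue                      # 21-30 would not be a usable modal class
            if not (lcf_p < n / 2 <= lcf_p + f_m):
                continue                      # 21-30 would not be the median class
            median = lcb + (n / 2 - lcf_p) * width / f_m
            mode = lcb + d1 / (d1 + d2) * width
            if abs(median - target_median) < 1e-9 and abs(mode - target_mode) < 1e-9:
                solutions.append((f2, f4))
    return solutions


print(missing_frequencies())  # [(25, 24)]
```

The same result follows algebraically: the median condition reduces to $f_4 = f_2 - 1$ and the mode condition to $3(27 - f_2) = 2(27 - f_4)$, which together give $f_2 = 25$ and $f_4 = 24$.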
Properties of the mode
• It is not affected by extreme values of a set of observations.
• It can be calculated for distributions with open-ended classes.
• It can be computed for all levels of data, i.e. nominal, ordinal, interval and ratio.
• The main drawback of the mode is that often it does not exist.
• Often its value is not unique.
4 Measures of Variation (Dispersion)
4.1. Introduction
• Measures of central tendency locate the center of the distribution.
• But they do not tell how individual observations are scattered on either side of the center.

• The spread of observations around the center is known as dispersion or variability.

• In other words, the degree to which numerical data tend to spread about an average value is called the dispersion or variation of the data.

• Small dispersion indicates high uniformity of the observations, while larger dispersion indicates less uniformity.
Objectives of measures of dispersion

• To measure the reliability of the average being used.

• To control variability itself.
• To compare two or more groups of numbers in terms of their variability.

Absolute and Relative Measures of Dispersion

• Measures of dispersion can be classified as absolute or relative.

a. Absolute measures of dispersion are expressed in terms of the original units of the series.

• Such measures are not suitable for comparing the variability of two distributions which are expressed in different units of measurement or have different average sizes.
b. Relative measures of dispersion

• These are ratios or percentages of a measure of absolute dispersion to an appropriate measure of central tendency.

• They are pure numbers, independent of the units of measurement.

• Relative measures are used for comparing the variability of two distributions (even if they are measured in different units).

Types of Measures of Dispersion
1. The Range (R) and Relative Range (RR)
The range is the difference between the largest and the smallest observation. That is,

$$R = \begin{cases} x_{max} - x_{min} & \text{for raw and ungrouped data} \\ UCL_{last} - LCL_{first} & \text{for grouped data} \end{cases}$$

where $UCL_{last}$ is the upper class limit of the last class and $LCL_{first}$ is the lower class limit of the first class.

• It is a quick but crude measure of variability.

• Because the range is greatly affected by extreme values, it may give a distorted picture of the data.

• The range is a measure of absolute dispersion and as such cannot be used for comparing the variability of two distributions expressed in different units.
• The solution is to use the relative range or any other relative measure of variation:

$$\text{Relative Range } (RR) = \frac{R}{x_{max} + x_{min}} = \frac{x_{max} - x_{min}}{x_{max} + x_{min}}, \quad \text{also known as the coefficient of range.}$$

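The slides give no numerical example for the range, so the following Python sketch applies both formulas to the raw data of the first median example (5, 6, ..., 155). The function names `value_range` and `relative_range` are illustrative.

```python
def value_range(values):
    """Range R = x_max - x_min for raw data."""
    return max(values) - min(values)


def relative_range(values):
    """Relative range (coefficient of range) = (x_max - x_min) / (x_max + x_min)."""
    return (max(values) - min(values)) / (max(values) + min(values))


data = [23, 16, 31, 77, 21, 14, 32, 6, 155, 9, 36, 24, 5, 27, 19]
print(value_range(data))     # 150
print(relative_range(data))  # 0.9375
```

Dropping the single extreme value 155 would cut the range from 150 to 72, which illustrates how strongly the range depends on extreme observations.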
2. Variance and Standard Deviation

Variance: the average of the squared deviations from the mean.
• Suppose that $x_1, x_2, \dots, x_N$ are the observations on a population of size $N$ with mean $\mu$. Then

$$\text{Population variance} = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - N\mu^2}{N}$$

• For a sample $x_1, x_2, \dots, x_n$ of size $n$ with mean $\bar{x}$,

$$\text{Sample variance} = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$$

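• In general, the sample variance is computed by:

$$s^2 = \begin{cases} \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \dfrac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} & \text{for raw data} \\[2ex] \dfrac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1} = \dfrac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1} & \text{for an ungrouped frequency distribution} \\[2ex] \dfrac{\sum_{i=1}^{k} f_i (m_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1} = \dfrac{\sum_{i=1}^{k} f_i m_i^2 - n\bar{x}^2}{n - 1} & \text{for grouped data} \end{cases}$$

Standard deviation: the positive square root of the variance.

It is a measure of how far, on average, an individual measurement is from the mean:

$$S = \sqrt{\text{variance}} = \sqrt{S^2}$$
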
Example:
Suppose the data given below indicate the time in minutes required to complete a certain laboratory test. Calculate the variance and standard deviation for the following data.

$x_i$   32   36   40   44   48   Total
$f_i$   2    5    8    4    1    20

Solution:

$x_i$         32     36     40      44     48     Total
$f_i$         2      5      8       4      1      20
$f_i x_i$     64     180    320     176    48     788
$f_i x_i^2$   2048   6480   12800   7744   2304   31376

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n} = \frac{788}{20} = 39.4$$

$$s^2 = \frac{\sum_{i=1}^{k} f_i x_i^2 - n\bar{x}^2}{n - 1} = \frac{31376 - 20 \times 39.4^2}{19} \approx 17.31, \qquad S = \sqrt{17.31} \approx 4.16$$

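The variance calculation for a frequency distribution can be checked with the Python sketch below. The function name `sample_variance_freq` is illustrative. It uses the definitional form $\sum f_i (x_i - \bar{x})^2 / (n - 1)$ instead of the shortcut form, since the two are algebraically equal and the definitional form is less prone to rounding error.

```python
import math


def sample_variance_freq(points, freqs):
    """Sample variance of a frequency distribution: sum(f_i * (x_i - mean)^2) / (n - 1)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(points, freqs)) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(points, freqs))
    return squared_dev / (n - 1)


# Laboratory test times (minutes) and their frequencies
times = [32, 36, 40, 44, 48]
time_freqs = [2, 5, 8, 4, 1]

variance = sample_variance_freq(times, time_freqs)
print(round(variance, 2))             # 17.31
print(round(math.sqrt(variance), 2))  # 4.16
```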
Properties of Variance

1. The variance is always non-negative ($s^2 \geq 0$).

2. If every element of the data is multiplied by a constant $c$, then the new variance is $s^2_{new} = c^2 \times s^2_{old}$.
3. When a constant is added to all elements of the data, the variance does not change.
4. The variance of a constant $c$ observed $n$ times is zero, i.e. $Var(c) = 0$.
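Properties 2 to 4 can be illustrated numerically with Python's standard `statistics.variance`, which computes the sample variance with the $n - 1$ divisor. The data set, the multiplier `c = 3` and the shift of 7 are arbitrary illustrative choices, not values from the slides.

```python
import statistics

data = [32, 36, 40, 44, 48]   # arbitrary illustrative values
c = 3

base = statistics.variance(data)                      # sample variance of the data
scaled = statistics.variance([c * x for x in data])   # every value multiplied by c
shifted = statistics.variance([x + 7 for x in data])  # a constant added to every value
constant = statistics.variance([5, 5, 5, 5])          # a constant observed 4 times

print(base, scaled, shifted, constant)
print(scaled == c ** 2 * base)  # True: property 2
print(shifted == base)          # True: property 3
print(constant == 0)            # True: property 4
```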
Uses of the Variance and Standard Deviation
• They can be used to determine the spread of the data. If the variance or standard deviation is large, then the data are more dispersed.
• They are used to measure the consistency of a variable.
• They are used quite often in inferential statistics.
3. Coefficient of Variation (C.V)

• Whenever two groups have the same units of measurement, the variance and standard deviation of each can be compared directly.

• A statistic that allows one to compare two groups when the units of measurement are different is called the coefficient of variation.

It is computed by:

$$C.V = \frac{S}{\bar{x}} \times 100\% \quad \text{(for a sample)}$$

Example
The following data refer to the hemoglobin levels of 5 male and 5 female students (values as used in the solution). In which group does the hemoglobin level have higher variability (less consistency)?

Males     13   13.8   14.6   15.6   17
Females   12   12.5   13.8   14.6   15.6

Solution:

$$\bar{x}_{male} = \frac{13 + 13.8 + 14.6 + 15.6 + 17}{5} = \frac{74}{5} = 14.8$$

$$\bar{x}_{female} = \frac{12 + 12.5 + 13.8 + 14.6 + 15.6}{5} = \frac{68.5}{5} = 13.7$$

$$s^2_{males} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = 2.44, \qquad S_{males} = \sqrt{2.44} \approx 1.56$$

$$s^2_{females} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1} = 2.19, \qquad S_{females} = \sqrt{2.19} \approx 1.48$$

$$C.V_{males} = \frac{S_{males}}{\bar{x}_{male}} \times 100\% = \frac{1.56}{14.8} \times 100\% \approx 10.54\%$$

$$C.V_{females} = \frac{S_{females}}{\bar{x}_{female}} \times 100\% = \frac{1.48}{13.7} \times 100\% \approx 10.80\%$$

Therefore, the variability in hemoglobin level is higher for females than for males, because $C.V_{females} > C.V_{males}$. In other words, there is less consistency among females than among males in hemoglobin level.
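The comparison can be reproduced with Python's standard `statistics` module, as in the sketch below. The function name `coefficient_of_variation` is illustrative. Because the code does not round $S$ before dividing, the male C.V comes out near 10.55% instead of the 10.54% obtained from the rounded value 1.56; the conclusion is unchanged.

```python
import statistics


def coefficient_of_variation(values):
    """Sample C.V in percent: S / mean * 100, with S computed using the n - 1 divisor."""
    return statistics.stdev(values) / statistics.mean(values) * 100


males = [13, 13.8, 14.6, 15.6, 17]
females = [12, 12.5, 13.8, 14.6, 15.6]

cv_males = coefficient_of_variation(males)      # about 10.55
cv_females = coefficient_of_variation(females)  # about 10.80
print(round(cv_males, 2), round(cv_females, 2))
print(cv_females > cv_males)  # True: females show the higher relative variability
```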

4. Standard Scores (Z-Scores)

• The Z-score gives the number of standard deviations a particular observation lies above or below the mean.
• It is used for describing the relative position of a single score in the entire set of data in terms of the mean and standard deviation.
• If $x$ is a measurement (an observation) from a distribution with mean $\bar{x}$ and standard deviation $S$, then its value in standard units is

$$Z = \frac{x - \bar{x}}{S} \quad \text{(for a sample)}$$

• A positive Z-score indicates that the observation is above the mean.
• A negative Z-score indicates that the observation is below the mean.
Example:
• Two sections were given an examination in a certain course. For section 1, the average mark was 72 with a standard deviation of 6, and for section 2, the average mark was 85 with a standard deviation of 7. If student A from section 1 scored 84 and student B from section 2 scored 90, who performed better relative to their group?

Solution:

$$Z_A = \frac{x - \bar{x}}{S} = \frac{84 - 72}{6} = 2, \qquad Z_B = \frac{x - \bar{x}}{S} = \frac{90 - 85}{7} \approx 0.71$$

• Both Z-scores are positive, indicating that both observations are above their section means.
• Comparing the two scores, since $Z_A > Z_B$, i.e. $2 > 0.71$, student A performed better relative to his group than student B.
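The comparison of the two students can be written as a short Python sketch. The function name `z_score` is an illustrative choice.

```python
def z_score(x, mean, sd):
    """Standard score: number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_a = z_score(84, 72, 6)  # student A, section 1
z_b = z_score(90, 85, 7)  # student B, section 2

print(z_a)            # 2.0
print(round(z_b, 2))  # 0.71
print(z_a > z_b)      # True: A did better relative to the group
```

Although B's raw mark (90) is higher than A's (84), A's mark is two standard deviations above the section mean, while B's is less than one.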