Introduction to Basic Statistics Concepts

Uploaded by

Thanh Luân Lê Nguyễn

Statistics

Manufacturing and Industrial Engineering

Lecture 1
Basic Concepts of Statistics

* Definition:
Statistics is the science of data. It involves
collecting, classifying, summarizing, organizing,
and interpreting numerical information.

* What is the Objective of Statistics?

The objective of statistics is to make inferences (predictions
and decisions) about a population based upon information
contained in a sample.
• Fundamental Elements of Statistics:
Statistical methods are particularly useful for studying,
analyzing, and learning about populations.
A population is a set of units (usually people, objects,
transactions, or events) that we are interested in studying.
For example, a population may include:
1) All employed workers in a company
2) All registered voters in an election
3) All the cars produced last year by a particular
assembly line
In studying a population, we focus on one or more
characteristics or properties of the units in the
population. We call such characteristics variables.
A variable is a characteristic or property of an
individual population unit.
When we measure a variable for every unit of a
population, the result is called a census of the
population.
For large populations, conducting a census would be
prohibitively time-consuming and/or costly.
A reasonable alternative would be to select and study a
subset (or portion) of the units in the population.
A sample is a subset of the units of a population.
A statistical inference is an estimate, prediction, or
some other generalization about a population based on
information contained in a sample.
Statistics is the science of data, and data are obtained by
measuring the values of one or more variables on the units
in the sample (or population). All data (and the variables
we measure) can be classified as one of two general types:
1. Quantitative data are measurements that are recorded
on a naturally occurring numerical scale.
2. Qualitative data are measurements that cannot be
measured on a natural numerical scale; they can only be
classified into one of a group of categories.
Ex:
Chemical and manufacturing plants sometimes discharge toxic-
waste materials such as DDT into nearby rivers and streams. These
toxins can adversely affect the plants and animals inhabiting the
river and river bank. The U.S. Army Corps of Engineers conducted a
study of fish in the Tennessee River (in Alabama) and its three
tributary creeks. A total of 144 fish were captured and the
following variables measured for each:
1. River/creek where fish was captured.
2. Species (catfish, largemouth bass, or smallmouth buffalo fish)
3. Length (centimeters)
4. Weight (grams)
5. DDT concentration (parts per million)
Classify each of the five variables measured as quantitative or
qualitative.
Solution:
The variables length, weight, and DDT are quantitative
because each is measured on a numerical scale:
Length in centimeters
Weight in grams
DDT in parts per million
In contrast, river/creek and species cannot be measured
quantitatively; they can only be classified into categories
(e.g., channel catfish, largemouth bass, and smallmouth
buffalofish for species). Consequently, data on river/creek and
species are qualitative.
Collecting Data:
Once you decide on the type of data, quantitative or qualitative, that is
appropriate for the problem at hand, you need to collect the data.
Generally, you can obtain the data in four different ways:
1. Data from a published source, such as a book, journal, or newspaper.
2. Data from a designed experiment, in which the researcher exerts strict
control over the units (people, objects, or events) in the study. For
example, a recent medical study investigated the potential of aspirin
in preventing heart attacks.
3. Data from a survey, in which the researcher samples a group of people,
asks one or more questions, and records the responses. Probably the
most familiar type of survey is the political poll, conducted by any one
of a number of organizations and designed to predict the outcome of
a political election.
4. Data collected observationally. In an observational study, the
researcher observes the experimental units in their natural setting and
records the variable(s) of interest. For example, a company
psychologist might observe and record the level of “Type A” behavior
of a sample of assembly line workers.
Regardless of the data collection method employed, it is likely
that the data will be a sample from some population. And
if we wish to apply inferential statistics, we must obtain a
representative sample.
A representative sample exhibits characteristics typical of
those possessed by the population of interest.

The most common way to satisfy the representative sample
requirement is to select a random sample. A random
sample ensures that every subset of fixed size in the
population has the same chance of being included in the
sample.
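The idea of a random sample can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 unit IDs, the sample size of 50, and the seed are assumptions made for the example and do not come from the lecture.

```python
import random

# Hypothetical population: ID numbers of 1,000 units (an assumption for illustration).
population = list(range(1, 1001))

# Fixed seed only so that the sketch is repeatable.
rng = random.Random(2024)

# Draw 50 distinct units without replacement; every subset of 50 units
# is equally likely to be chosen.
sample = rng.sample(population, k=50)

print(len(sample), "units selected; first five:", sample[:5])
```

Because the draw is made without replacement, no unit can appear twice in the sample.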
Frequency Distributions:
Definitions:
Raw Data:
It is collected data that have not been organized numerically.
Ex: The masses of 10 students are:
50 kg, 55 kg, 45 kg, 40 kg, 47 kg, 60 kg, 61 kg, 62 kg, 46 kg, 51 kg

Arrays:
An array is an arrangement of raw numerical data in ascending or descending
order of magnitude.

Range:
It is the difference between the largest and the smallest numbers in the data.
Ex: The range of the data set (35, 50, 55, 45, and 30) is:
55 - 30 = 25
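As a small sketch, the raw data, array, and range definitions can be applied to the student masses listed above. The variable names are illustrative only.

```python
# Raw data: masses of 10 students in kg, in the order collected.
masses_kg = [50, 55, 45, 40, 47, 60, 61, 62, 46, 51]

# Array: the same data arranged in ascending order of magnitude.
array_ascending = sorted(masses_kg)

# Range: largest value minus smallest value.
data_range = max(masses_kg) - min(masses_kg)

print(array_ascending)
print("Range =", data_range, "kg")
```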
When summarizing data, we distribute them into classes or categories and count the
number of individuals belonging to each class; this count is called the “class frequency”.

Frequency Distribution (or Frequency Table):

It is a tabular arrangement of data by classes; for example:
Class      Class Frequency
60-62      5
63-65      18
66-68      42
69-71      27
72-74      8
Total      100

* Class Intervals and class Limits:
The class of 60- 62 is called the “class interval”. The end numbers 60 and 62 are called
“class limits”; the smaller number 60 is the “lower class limit” and the larger number 62 is
the “upper class limit”.
* Open Class Interval:
It is a class interval which has either no upper class limit or no lower class limit. For example, the
age group “65 years and above” is an open class interval.
* Class mark:
It is the “midpoint” of the class interval and it is obtained by adding the lower and upper
class limits and dividing by 2.
Ex: the class mark (class midpoint) of the interval 60-62 is
(60 + 62)/2 = 61
* Histogram and Frequency Polygon:
The two graphical representations of frequency distributions are:
1. The Histogram (or Frequency Histogram) consists of a set of rectangles having bases on
a horizontal axis (x-axis), centered at the class marks, with widths equal to the class sizes and
heights equal to the class frequencies (when all classes have the same size).
2. The Frequency Polygon is a line graph of class frequency plotted against class mark.
• Relative Frequency:
It is the frequency of the class divided by the total frequency of all classes, and it is
generally expressed as a percentage.
Ex:
Class Interval   Class Frequency   Relative Frequency
60-62            5                 5%
63-65            18                18%
66-68            42                42%
69-71            27                27%
72-74            8                 8%
Total            100               100%
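• Cumulative Frequency Distribution (Ogives):
The cumulative frequency of a class interval is the total frequency of all values less than the
upper class boundary of that interval, i.e., the sum of the frequencies of that class and all
lower classes. A graph of the cumulative frequencies is called an “ogive”. The relative cumulative
frequency (percentage ogive) is the cumulative frequency divided by the total frequency.
Ex:
Class Interval   Frequency   Cumulative Frequency   Relative Cumulative Frequency
60-62            5           5                      5%
63-65            18          23                     23%
66-68            42          65                     65%
69-71            27          92                     92%
72-74            8           100                    100%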
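The relative and cumulative columns of the tables above can be reproduced with a short Python sketch. The class labels and frequencies are taken from the example; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["60-62", "63-65", "66-68", "69-71", "72-74"]
frequencies = [5, 18, 42, 27, 8]

total = sum(frequencies)

# Relative frequency: class frequency / total frequency, as a percentage.
relative_pct = [100 * f / total for f in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Relative cumulative frequency: cumulative frequency / total frequency, as a percentage.
cumulative_pct = [100 * c / total for c in cumulative]

for row in zip(classes, frequencies, relative_pct, cumulative, cumulative_pct):
    print("{:<7} {:>4} {:>6.1f}% {:>5} {:>6.1f}%".format(*row))
```

The last cumulative frequency always equals the total frequency, which is a quick check on the arithmetic.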
• Class Boundaries:
They are obtained by adding the upper limit of one class interval to the lower limit
of the next higher class interval and dividing by 2.
Ex:
For the class intervals 63-65 and 66-68, the boundary between them is (65 + 66)/2 = 65.5,
and the class boundaries are 62.5-65.5 and 65.5-68.5.
• Class Width (or Class Size):
It is the difference between the upper and lower class boundaries.
Ex: The class widths (class sizes) of the above classes are
65.5 - 62.5 = 3 and 68.5 - 65.5 = 3
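A minimal sketch of how class marks, class boundaries, and class widths follow from the class limits of the table used in this lecture. It assumes equally spaced classes, so the outermost boundaries (59.5 and 74.5) are obtained by extending the same half-gap; that extension is an assumption of this sketch.

```python
# Class limits (lower, upper) from the frequency table.
limits = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]

# Gap between the upper limit of one class and the lower limit of the next
# (assumed to be the same for all classes).
gap = limits[1][0] - limits[0][1]

# Class mark: midpoint of the class limits.
marks = [(lo + hi) / 2 for lo, hi in limits]

# Class boundaries: halfway between adjacent class limits.
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in limits]

# Class width: upper boundary minus lower boundary.
widths = [upper - lower for lower, upper in boundaries]

for (lo, hi), mark, (lb, ub), w in zip(limits, marks, boundaries, widths):
    print(f"{lo}-{hi}: mark={mark}, boundaries={lb}-{ub}, width={w}")
```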

• Mean, Median, Mode and Measures of Central Tendency:

An average is a value which is typical or representative of a set of data. Since such
typical values tend to lie centrally within a set of data arranged according to
magnitude, averages are also called “measures of central tendency”.
Several types of averages can be defined, the most common being:
* the Arithmetic Mean (or briefly the Mean), the Median, the Mode, the Geometric Mean, and the
Harmonic Mean.
• Arithmetic Mean (Mean):
For a set of N numbers X1, X2, …, XN it is denoted by $\bar{X}$ and defined by
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_N}{N} = \frac{\sum X}{N}$$
Ex: The mean for the set of numbers 8, 3, 5, 12, 10 is
$$\bar{X} = \frac{8 + 3 + 5 + 12 + 10}{5} = \frac{38}{5} = 7.6$$

If the numbers X1, X2, …, Xk occur with frequencies f1, f2, …, fk, the
arithmetic mean is:
$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_k X_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum fX}{\sum f} = \frac{\sum fX}{N}$$
where $N = \sum f$ = total frequency

Ex:
The final examination in a course is weighted three times as much as a quiz,
and a student has a final examination grade of 85 and quiz grades 70 and
90. Treating the weights 1, 1, and 3 as frequencies, the mean grade is
$$\bar{X} = \frac{(1)(70) + (1)(90) + (3)(85)}{1 + 1 + 3} = \frac{415}{5} = 83$$
Note
The algebraic sum of the deviations of a set of numbers from their
arithmetic mean is zero.
Ex:
The deviations of the numbers 8, 3, 5, 12, and 10 from their mean 7.6 are
8-7.6, 3-7.6, 5-7.6 , 12-7.6, 10-7.6 so that the algebraic sum is equal to:
0.4-4.6-2.6+4.4+2.4 = 0
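The arithmetic mean, its weighted (frequency) form, and the zero-sum property of the deviations can be checked numerically. The helper names `mean` and `weighted_mean` below are illustrative, not part of the lecture.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their number."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Mean of values that occur with the given frequencies (or weights)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


numbers = [8, 3, 5, 12, 10]
x_bar = mean(numbers)

# Quiz grades 70 and 90 with weight 1 each, final examination grade 85 with weight 3.
grade = weighted_mean([70, 90, 85], [1, 1, 3])

# Sum of deviations from the mean; zero up to floating-point rounding.
deviation_sum = sum(x - x_bar for x in numbers)

print(x_bar, grade, round(deviation_sum, 10))
```

In floating-point arithmetic the deviation sum may differ from zero by a tiny rounding error, which is why it is rounded before printing.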
• Median:
The median of a set of numbers arranged in order of magnitude (in an
array) is the “middle value” (for an odd number of values) or the arithmetic mean of the
two middle values (for an even number of values).
Ex:
The set of numbers 3, 4, 4, 5, 6, 8, 8, 8, 10 has median 6.
Ex:
The set of numbers 5, 5, 7, 9, 11, 12, 15, and 18 has median
(9+11)/2 = 10
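The rule for the median of ungrouped data can be sketched as a small function. The function name is illustrative; it is applied to the eight-number example above and, for the odd case, to the set 8, 3, 5, 12, 10 used earlier for the mean.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)          # arrange the data in an array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle values


print(median([5, 5, 7, 9, 11, 12, 15, 18]))  # even number of values
print(median([8, 3, 5, 12, 10]))             # odd number of values
```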
• For grouped data, the median is obtained by interpolation:
$$\text{Median} = L_1 + \left( \frac{N/2 - (\sum f)_1}{f_{\text{median}}} \right) c$$
where
L1 = lower class boundary of the median class
N = number of items in the data (total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
f median = frequency of median class
c = size of median class interval
Ex:
Find the median length of the 40 laurel leaves in the following table:
Length (mm)   Frequency
118-126       3
127-135       5
136-144       9
145-153       12
154-162       5
163-171       4
172-180       2
Total         40
Solutions
Method 1: By Interpolation
The median is the length below which half of the total frequency (40/2 = 20) lies.
The sum of the first three class frequencies is 3+5+9 = 17, so we require 3 more of the 12 cases
in the 4th class to reach the desired 20; the median therefore lies in the class interval 145-153.
The lower boundary of the 4th class is (144+145)/2 = 144.5, and its upper
boundary is (153+154)/2 = 153.5, so
Median = 144.5 + (3/12)(153.5 - 144.5) = 144.5 + (3/12)(9) = 146.75 ≈ 146.8 mm
Method 2: By using the formula

It is clear that the median lies in the 4th class interval since 3+5+9 = 17, while
3+5+9+12 = 29 and 40/2 = 20 is half of the total frequency, so that
L1 = 144.5, the lower class boundary of the median class
N = 40, the number of items in the data
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class = 3+5+9 = 17
f median = frequency of median class = 12
c = size (width) of median class interval = 153.5 - 144.5 = 9, so that
Median = 144.5 + [(20-17)/12](9) = 146.75 ≈ 146.8 mm
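The grouped-data calculation can be sketched as a function that locates the median class and interpolates within it. The function name and the list-of-boundaries input format are assumptions made for this illustration; the data are the laurel-leaf table above.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data by linear interpolation within the median class.

    boundaries: list of (lower, upper) class boundaries in ascending order.
    frequencies: class frequencies in the same order.
    """
    half = sum(frequencies) / 2            # N/2
    below = 0                              # sum of frequencies of lower classes
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and below + f >= half:    # this is the median class
            return lower + (half - below) / f * (upper - lower)
        below += f
    raise ValueError("frequencies must contain at least one positive value")


# Laurel-leaf lengths: class boundaries 117.5-126.5, 126.5-135.5, ..., 171.5-180.5 (mm).
leaf_boundaries = [(117.5 + 9 * k, 126.5 + 9 * k) for k in range(7)]
leaf_frequencies = [3, 5, 9, 12, 5, 4, 2]

leaf_median = grouped_median(leaf_boundaries, leaf_frequencies)
print(leaf_median, "mm")
```

The value agrees with both methods of the worked solution before rounding to one decimal place.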
Note: Geometrically, the median is the value of X
corresponding to the vertical line which divides the histogram into two
parts having equal areas.
H.W: Find the median age of the heads of families in the following table (Ans. 45.1)

Age of Head of Family (years)   Number × 10^6 (Frequency)
Under 25                        2.22
25-29                           4.05
30-34                           5.08
35-44                           10.45
45-54                           9.47
55-64                           6.63
65-74                           4.16
75 and over                     1.66
Total (N)                       43.42
• Mode:
It is the value of a set of numbers which occurs with the greatest frequency. The
mode may not exist, and if it does exist, it may not be unique.
Ex:
The set 2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18 has mode 9.
The set 3, 5, 8, 10, 12, 15, 16 has no mode.
The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9 has two modes, 4 and 7, and is called “Bimodal”.
A distribution having only one mode is called “Unimodal”.
So the mode is the number occurring most frequently.
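A short sketch of finding the mode or modes of ungrouped data. The function name `modes` is illustrative, and the convention that a set in which no value repeats has no mode is adopted here to match the second example above.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes; an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                      # no value repeats: treated as "no mode"
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18]))  # unimodal
print(modes([3, 5, 8, 10, 12, 15, 16]))                  # no mode
print(modes([2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9]))          # bimodal
```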

Note:
In the case of grouped data when a frequency curve has been constructed to fit the data,
the Mode will be the value of X corresponding to the maximum point (or points) on the
curve.
• For grouped data presented as a frequency distribution or histogram, the mode can be obtained by
the formula:
$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$
where
$L_1$ = lower class boundary of modal class (class containing the mode)
$\Delta_1$ = excess of modal frequency over frequency of next lower class
$\Delta_2$ = excess of modal frequency over frequency of next higher class
$c$ = size of modal class interval
Ex: The following table shows a frequency distribution of the monthly wages, in dollars, of 65
employees. Find the modal wage:
Wages ($)      Number of Employees (f)
50-59.99       8
60-69.99       10
70-79.99       16
80-89.99       14
90-99.99       10
100-109.99     5
110-119.99     2
Total          65
Sol.
The class containing the mode (modal class) is the 3rd class (70-79.99), with frequency 16.
$L_1$ = lower class boundary of modal class = 69.995
Frequency of next lower class = 10, so $\Delta_1$ = 16 - 10 = 6
Frequency of next higher class = 14, so $\Delta_2$ = 16 - 14 = 2
$c$ = size of modal class = 79.995 - 69.995 = 10
So the
Mode = 69.995 + (6/(6+2))(10) = 77.495 ≈ $77.50
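The modal-wage calculation can be sketched as a function. The function name and input format are illustrative. One assumption goes beyond the lecture: if the modal class is the first or last class, the missing neighbouring frequency is taken as 0.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of grouped data: L1 + (d1 / (d1 + d2)) * c for the modal class."""
    # Index of the modal class (first class with the largest frequency).
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f_modal = frequencies[i]
    # Assumption of this sketch: a missing neighbouring class has frequency 0.
    f_lower = frequencies[i - 1] if i > 0 else 0
    f_higher = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = f_modal - f_lower            # excess over the next lower class
    d2 = f_modal - f_higher           # excess over the next higher class
    lower, upper = boundaries[i]
    return lower + d1 / (d1 + d2) * (upper - lower)


# Wage classes 50-59.99, ..., 110-119.99 with boundaries 49.995-59.995, and so on.
wage_boundaries = [(49.995 + 10 * k, 59.995 + 10 * k) for k in range(7)]
wage_frequencies = [8, 10, 16, 14, 10, 5, 2]

modal_wage = grouped_mode(wage_boundaries, wage_frequencies)
print(round(modal_wage, 2))
```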
• An Empirical Relation between Mean, Median, and Mode:
For unimodal frequency curves which are moderately skewed (asymmetrical),
either to the right or to the left, we have the empirical relation:
Mean - Mode = 3(Mean - Median)
Ex:
Use the empirical relation between mean, mode, and median to find the modal wage of the
65 employees in the previous example, and compare it with the result obtained from the formula for the mode.
Sol:
Since
Mean - Mode = 3(Mean - Median)
Mode = Mean - 3(Mean - Median)

Mean = $\frac{\sum fX}{N}$, where X = class mark

Class          f     Class Mark (X)
50-59.99       8     54.995
60-69.99       10    64.995
70-79.99       16    74.995
80-89.99       14    84.995
90-99.99       10    94.995
100-109.99     5     104.995
110-119.99     2     114.995
Total          65
Mean = ((8)(54.995) + (10)(64.995) + … + (2)(114.995))/65 = 79.764

N/2 = 65/2 = 32.5

8+10+16 = 34 ≥ 32.5,
so the median lies in the 3rd class interval (70-79.99),
where the sum of the frequencies of the lower classes is 8+10 = 18,
L1 = 69.995, f median = 16, c = 10
Median = 69.995 + ((32.5-18)/16)(10) = 79.06
So the Mode = 79.764 - 3(79.764 - 79.06) = 77.652
The difference between the two results is
77.65 - 77.5 = 0.15, so
there is good agreement between the empirical relation and the formula for the mode in this case.
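The empirical estimate can be recomputed without intermediate rounding. The sketch below assumes equal class widths of 10 and uses the wage table from the example; all variable names are illustrative. Because the median is not rounded to 79.06 first, the estimate comes out near 77.64 rather than 77.65, which does not change the conclusion.

```python
frequencies = [8, 10, 16, 14, 10, 5, 2]
width = 10
lower_boundaries = [49.995 + width * k for k in range(len(frequencies))]
class_marks = [lb + width / 2 for lb in lower_boundaries]

n = sum(frequencies)

# Mean of grouped data: sum of f * X over N, with X the class mark.
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Median of grouped data: interpolate inside the class containing N/2.
half = n / 2
below = 0
for lb, f in zip(lower_boundaries, frequencies):
    if below + f >= half:
        median = lb + (half - below) / f * width
        break
    below += f

# Empirical relation: Mode = Mean - 3(Mean - Median).
empirical_mode = mean - 3 * (mean - median)

print(round(mean, 3), round(median, 3), round(empirical_mode, 3))
```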
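Geometric Mean (G):
For a set of N numbers X1, X2, …, XN, G is the Nth root of the product of the numbers:
$$G = \sqrt[N]{X_1 X_2 \cdots X_N}$$
Ex: For the numbers 2, 4, 8
$$G = \sqrt[3]{(2)(4)(8)} = \sqrt[3]{64} = 4$$
• For grouped data
X1, X2, …, Xk with frequencies f1, f2, …, fk,
where f1 + f2 + … + fk = N,
the geometric mean is
$$G = \sqrt[N]{X_1^{f_1} X_2^{f_2} \cdots X_k^{f_k}}$$

• Harmonic Mean (H)

For a set of N numbers X1, X2, …, XN the harmonic mean is the reciprocal of the arithmetic mean
of the reciprocals of the numbers:
$$H = \frac{1}{\frac{1}{N} \sum \frac{1}{X}} = \frac{N}{\sum \frac{1}{X}}$$
Ex: The harmonic mean of 2, 4, 8 is
$$H = \frac{3}{\frac{1}{2} + \frac{1}{4} + \frac{1}{8}} = \frac{3}{7/8} = \frac{24}{7} \approx 3.43$$

H.W: Find $\bar{X}$, G, H for 5, 10, 18.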

• Root Mean Square (R.M.S):
The R.M.S or the “Quadratic Mean” of a set of numbers X1, X2, …, XN is defined by:
$$\text{R.M.S} = \sqrt{\frac{X_1^2 + X_2^2 + \dots + X_N^2}{N}} = \sqrt{\frac{\sum X^2}{N}}$$
This average is used in physical applications.
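The geometric mean, harmonic mean, and root mean square can be compared on the numbers 2, 4, 8 from the examples above. The function names are illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """Nth root of the product of N positive numbers."""
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(1 / x for x in values)


def root_mean_square(values):
    """Square root of the arithmetic mean of the squares."""
    return math.sqrt(sum(x * x for x in values) / len(values))


data = [2, 4, 8]
g = geometric_mean(data)
h = harmonic_mean(data)
rms = root_mean_square(data)
arithmetic = sum(data) / len(data)

print(round(h, 3), round(g, 3), round(arithmetic, 3), round(rms, 3))
```

For positive numbers that are not all equal, the values come out in the order H < G < arithmetic mean < R.M.S, which the printed line illustrates.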

• Quantiles: The values which divide a set of data, arranged in order of magnitude, into equal parts, such as:

- The value which divides the data into two equal parts is the “Median”
- The values which divide the data into four equal parts are called “Quartiles”
- The values which divide the data into ten equal parts are called “Deciles”
- The values which divide the data into one hundred equal parts are called “Percentiles”
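As a rough illustration, Python's standard `statistics.quantiles` function (Python 3.8 or later) returns the cut points that divide data into n equal parts. It is applied here to the student masses from the raw-data example. The function interpolates between data values, and other textbooks or packages use slightly different interpolation conventions, so individual quartile or decile values may differ a little from a hand calculation.

```python
import statistics

masses_kg = [50, 55, 45, 40, 47, 60, 61, 62, 46, 51]

# Four equal parts need 3 cut points (the quartiles Q1, Q2, Q3).
quartiles = statistics.quantiles(masses_kg, n=4)

# Ten equal parts need 9 cut points (the deciles D1, ..., D9).
deciles = statistics.quantiles(masses_kg, n=10)

print("Quartiles:", quartiles)
print("Deciles:  ", deciles)
print("Median:   ", statistics.median(masses_kg))
```

The second quartile coincides with the median, as the list of subdivisions above implies.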
