Chapter 7

Chapter 7 covers the fundamentals of statistics, including its definition, importance in data science, and various methods of data collection. It explains key concepts such as population vs. sample, measures of central tendency (mean, median, mode), and measures of variation (variance, standard deviation). Additionally, it discusses correlation, percentiles, quartiles, and normal distribution, emphasizing their significance in analyzing and interpreting data.

Uploaded by

balamurugan

Chapter 7: Basic Statistics

1. Statistics – Meaning (Detailed Explanation)

Statistics is a branch of mathematics that deals with data, but it is not only about numbers. It
is about understanding what numbers are telling us.

Statistics involves four main steps:

1. Collecting data – gathering information

2. Organizing data – arranging data in tables or charts
3. Analyzing data – finding averages, spread, relationships
4. Interpreting data – drawing conclusions and decisions

In simple words:
Statistics helps us convert raw data into useful information.

Real-life example:
A teacher collects marks of students (data), calculates average and pass percentage (analysis),
and decides whether students understood the subject (interpretation).
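
The teacher example can be sketched in a few lines of Python. This is only an illustration: the marks, the pass mark of 40, and the decision thresholds are assumed values, not taken from the text.

```python
# Collect: hypothetical marks for ten students (illustrative data).
marks = [35, 42, 58, 61, 67, 70, 74, 80, 88, 95]
PASS_MARK = 40  # assumed pass mark

# Organize: arrange the raw data in order.
marks_sorted = sorted(marks)

# Analyze: average and pass percentage.
average = sum(marks) / len(marks)
pass_percentage = 100 * sum(m >= PASS_MARK for m in marks) / len(marks)

# Interpret: a simple decision rule (both thresholds are assumptions).
understood = average >= 60 and pass_percentage >= 80

print(f"Average = {average}, pass % = {pass_percentage}, understood = {understood}")
```

Each comment maps to one of the four steps: collecting, organizing, analyzing, and interpreting.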

2. Importance of Statistics in Data Science (Detailed Explanation)

In data science, we work with huge amounts of data. Statistics helps us manage and
understand this data.

Statistics is important because it:

• Reduces large data into simple numbers (mean, percentage)
• Helps compare groups (Class A vs Class B)
• Identifies patterns and trends
• Helps in prediction and decision-making

Example:
Netflix uses statistics to recommend movies based on user behavior.

Statistics is one of the foundations of data science: without it, patterns in data cannot be measured or tested.

3. Types of Data Collection (Detailed Explanation)

Data collection is the first step in statistics. Data can be collected in two major ways,
depending on how much control we have.

(a) Observational Data

In observational data:

• We only observe what is happening
• We do not interfere or control

Examples:

• Conducting surveys
• Census data
• Observing customer purchases

👉 Used when experiments are not possible.

(b) Experimental Data

In experimental data:

• We conduct experiments
• We control variables

Examples:

• Giving different medicines to two groups
• Testing two different teaching methods

Because the researcher controls the variables, experiments give stronger evidence of cause and effect.

4. Population and Sample (Detailed Explanation)

In statistics, studying the entire population is often difficult.

Population

• Complete group under study
• Often very large

Example:
All voters in a country

Sample

• Small part selected from population
• Used to represent population

Example:
1000 voters selected for survey

A good sample is representative of the population, so results from it give reliable estimates for the population.
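
A small simulation can show how a sample stands in for a population. This is a sketch with made-up numbers: the population size, the 55% support figure, and the fixed seed are all assumptions for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 voters, 55% of whom support candidate A.
# 1 = supports candidate A, 0 = does not.
population = [1] * 55_000 + [0] * 45_000

# Sample: 1000 voters selected at random, without replacement.
sample = random.sample(population, 1000)

population_share = sum(population) / len(population)
sample_share = sum(sample) / len(sample)

print(f"Population share: {population_share:.3f}")
print(f"Sample share:     {sample_share:.3f}")
```

The sample share should land close to the population share, although it will rarely match it exactly.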

5. Sampling Methods (Detailed Explanation)

Sampling is the method of selecting individuals from a population.

Random Sampling

• Every individual has an equal chance of being selected
• Reduces selection bias

Example:
Lottery method

Unequal Probability Sampling

• Some individuals have a higher chance of being selected
• Used when groups are unequal

Example:
Selecting more people from cities than villages

Proper sampling gives reliable results.
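
The two sampling methods can be sketched with Python's standard `random` module. The list of people and the weights are hypothetical values chosen only for illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical list of individuals (names are placeholders).
people = ["city_1", "city_2", "city_3", "village_1", "village_2", "village_3"]

# Random sampling: every individual has the same chance, and nobody
# is picked twice (like drawing lottery tickets).
simple_sample = random.sample(people, 3)

# Unequal probability sampling: on each draw, a city resident is three
# times as likely to be picked as a village resident (assumed weights).
# Note that random.choices draws WITH replacement, so repeats can occur.
weights = [3, 3, 3, 1, 1, 1]
weighted_sample = random.choices(people, weights=weights, k=3)

print("Random sample:  ", simple_sample)
print("Weighted sample:", weighted_sample)
```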

6. Measures of Central Tendency (Detailed Explanation)

Measures of central tendency help us find a single value that represents the whole data.

They help answer:

👉 What is the typical value?

Three measures are used:

• Mean
• Median
• Mode

7. Mean (Average) – Detailed Explanation

Mean is the most commonly used average.

Formula:
Mean = Sum of all values / Number of values

Example:
Marks = 60, 70, 80
Mean = (60+70+80)/3 = 70

Advantage:
Easy to calculate

Disadvantage:
Affected by extreme values

Example:
Salaries = 10,000, 15,000, 20,000, 1,00,000 → Mean = 36,250, which is higher than three of the four salaries, so the mean is misleading
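
Both mean examples can be checked with a short Python sketch. The salary of 1,00,000 (one lakh) is written as `100_000` in the code.

```python
# Mean = sum of all values / number of values
marks = [60, 70, 80]
mean_marks = sum(marks) / len(marks)

# One extreme salary pulls the mean far above the typical value.
salaries = [10_000, 15_000, 20_000, 100_000]
mean_salary = sum(salaries) / len(salaries)

print(mean_marks)   # 70.0
print(mean_salary)  # 36250.0
```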

8. Median – Detailed Explanation

Median is the middle value when data is arranged in order.

Steps:

1. Arrange data in ascending order

2. Find the middle value (if the number of values is even, take the average of the two middle values)

Example:
Marks = 50, 60, 90
Median = 60

Advantage:
Not affected by extreme values

Best used for skewed data such as income or salary
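
A minimal sketch using Python's standard `statistics` module shows the median for an odd and an even number of values. The four-value marks list is an added illustration, and the salary figures reuse the earlier salary example to compare mean and median.

```python
from statistics import mean, median

# Odd number of values: the middle value after sorting.
marks = [90, 50, 60]              # median() sorts the data internally
median_odd = median(marks)        # 60

# Even number of values: average of the two middle values (60 and 70).
even_marks = [50, 60, 70, 90]     # illustrative data
median_even = median(even_marks)  # 65.0

# Extreme value: the mean moves a lot, the median does not.
salaries = [10_000, 15_000, 20_000, 100_000]
mean_salary = mean(salaries)      # 36250
median_salary = median(salaries)  # 17500.0

print(median_odd, median_even, mean_salary, median_salary)
```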

9. Mode – Detailed Explanation

Mode is the value that occurs most frequently.

Example:
Marks = 60, 70, 70, 80
Mode = 70

Useful when data is categorical

Example:
Most preferred mobile brand
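
The mode works for both numbers and categories. In this sketch the brand names are placeholders for hypothetical survey answers.

```python
from collections import Counter
from statistics import mode

# Numerical data
marks = [60, 70, 70, 80]
mode_marks = mode(marks)            # 70

# Categorical data: hypothetical survey answers (placeholder brand names).
brands = ["A", "B", "A", "C", "A", "B"]
mode_brand = mode(brands)           # 'A'

# Counter also reports how often the most common value occurs.
top_brand = Counter(brands).most_common(1)  # [('A', 3)]

print(mode_marks, mode_brand, top_brand)
```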

10. Measures of Variation (Detailed Explanation)

Measures of variation tell us how data values differ from each other.

They help answer:

Are values close together or spread out?

Main measures:

• Range
• Variance
• Standard Deviation

11. Variance and Standard Deviation (Detailed Explanation)

Variance

Variance measures the average squared distance from the mean.

Higher variance means data is more spread out.

Standard Deviation (SD)

Standard deviation is the square root of variance.

Why SD is important:

• Same unit as data
• Easy to interpret

Example:
Low SD → consistent marks
High SD → inconsistent marks
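
The sketch below compares two hypothetical sets of marks that have the same mean but a different spread. It uses the population formulas (divide by the number of values), which match the "average squared distance" definition above. A sample variance divides by n − 1 instead; that is what `statistics.variance` computes.

```python
from math import sqrt
from statistics import pstdev, pvariance

def population_variance(values):
    """Average squared distance of the values from their mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)

# Two hypothetical students, both with a mean of 70.
consistent = [68, 70, 72]
inconsistent = [40, 70, 100]

for name, data in [("consistent", consistent), ("inconsistent", inconsistent)]:
    var = population_variance(data)
    sd = sqrt(var)  # standard deviation = square root of variance
    print(f"{name}: variance = {var:.2f}, SD = {sd:.2f}")

# The standard library gives the same results.
var_check = pvariance(inconsistent)
sd_check = pstdev(inconsistent)
```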

12. Correlation (Detailed Explanation)

Correlation measures the strength and direction of the linear relationship between two variables.

Types:

• Positive correlation: both increase together
• Negative correlation: one increases while the other decreases
• No correlation: no linear relationship

Example:
Temperature ↑ → Ice cream sales ↑ (positive)

Correlation does NOT mean causation.
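
One common way to put a number on correlation is the Pearson correlation coefficient r, which ranges from -1 to +1. The text does not name a specific coefficient, so treating correlation as Pearson's r is an assumption here, and the temperature and sales figures are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical daily figures.
temperature = [20, 25, 30, 35, 40]
ice_cream_sales = [110, 135, 160, 170, 200]  # rises with temperature
heater_sales = [90, 70, 55, 30, 10]          # falls with temperature

r_positive = pearson_r(temperature, ice_cream_sales)
r_negative = pearson_r(temperature, heater_sales)

print(f"Temperature vs ice cream sales: r = {r_positive:.2f}")
print(f"Temperature vs heater sales:    r = {r_negative:.2f}")
```

A value of r near +1 or -1 shows a strong linear relationship, and a value near 0 shows little or none. Even a strong r does not show that one variable causes the other.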

13. Percentiles (Detailed Explanation)

Percentiles show the relative position of a value in a dataset.

• Data is divided into 100 equal parts
• Used to compare performance

Example:
If a student is in the 90th percentile, it means the student scored better than 90% of students.

Used in competitive exams, rankings, and performance analysis.
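
A percentile rank can be sketched as the percentage of scores below a given score. This follows the "scored better than" wording above. Other definitions exist (for example, counting scores at or below the value), and the scores here are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are strictly below the given score."""
    below = sum(s < score for s in scores)
    return 100 * below / len(scores)

# Hypothetical exam scores for ten students.
scores = [35, 42, 48, 55, 61, 67, 73, 78, 84, 91]

rank_top = percentile_rank(scores, 91)  # 9 of 10 scores are lower
rank_mid = percentile_rank(scores, 61)  # 4 of 10 scores are lower

print(rank_top)  # 90.0
print(rank_mid)  # 40.0
```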

14. Quartiles (Detailed Explanation)

Quartiles divide data into four equal parts.

• Q1 (25%) – Lower quartile
• Q2 (50%) – Median
• Q3 (75%) – Upper quartile

Use:
Helps understand data distribution and detect outliers.
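
Quartiles can be computed with `statistics.quantiles` from the Python standard library. The outlier check below uses the common 1.5 × IQR convention, where IQR = Q3 − Q1. That convention is an assumption added for illustration and is not stated in the text. Different quartile methods give slightly different values, and this sketch uses the `"inclusive"` method.

```python
from statistics import quantiles

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range and the 1.5 * IQR fences (assumed convention).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3)  # 3.25 5.5 7.75
print(outliers)    # [50]
```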

15. Normal Distribution & Empirical Rule (Detailed Explanation)

Normal Distribution

• Bell-shaped curve
• Most values are near the mean

Empirical Rule (68–95–99.7 Rule)

For data that follows a normal distribution:

• About 68% of data lies within 1 SD of the mean
• About 95% of data lies within 2 SD of the mean
• About 99.7% of data lies within 3 SD of the mean

Used to understand how data is spread around the mean.
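
The percentages in the empirical rule can be checked with `statistics.NormalDist` from the Python standard library. The sketch uses a standard normal distribution (mean 0, SD 1); the same shares hold for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of values within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"Within {k} SD: {share:.1%}")
```

The printed shares are approximately 68.3%, 95.4%, and 99.7%, which the rule rounds to 68, 95, and 99.7.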
