
Understanding Data Types and Analysis

The document outlines two types of data: categorical and quantitative, emphasizing the focus on quantitative data analysis. It introduces descriptive statistics, statistical inference, and various methods for summarizing data, including frequency distributions and histograms. Additionally, it discusses measures of central tendency, variability, and relationships between variables using scatter diagrams and trendlines.

Uploaded by

snrpd2pqwh


Data

Two types of data encountered:

1. Categorical Data = data that can be grouped by specific, non-numerical categories

2. Quantitative data = data that use numeric values to indicate how much or how many

There are limitations as to the extent of analysis that can be performed on categorical data. Therefore, we will concentrate our efforts primarily on the analysis of quantitative data.

3. Descriptive Statistics = summaries of data in a form that is easier for the reader to understand. These include tabular, graphical, or numerical presentations of the data.
Where does the data come from?

4. Population = the set of all elements of interest in a particular study

5. Sample = a subset of the population

How is the data used?--Introduction to Statistical Inference

6. Statistical Inference = the use of data from a sample to make estimates and predictions about the characteristics of a population
Descriptive Statistics: Using Tables and Graphical Displays
7. Frequency Distribution = a tabular summary of data showing the number (or frequency)
of observations in each of several non-overlapping categories
(or classes)

8. Relative Frequency = the fraction or percentage of observations belonging to a class

FORMULA:

$\text{Relative frequency of a class} = \frac{\text{frequency of the class}}{n}$

Where: n = the number of observations
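
As a small illustration, relative frequency can be computed by dividing each class count by n. The sketch below uses a made-up list of drink purchases; the data and variable names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical categorical sample (made-up purchases, not from the notes).
purchases = ["Cola", "Diet", "Cola", "Lemon", "Cola",
             "Diet", "Cola", "Lemon", "Diet", "Cola"]

n = len(purchases)                 # n = the number of observations
frequency = Counter(purchases)     # class -> number of observations in that class

# Relative frequency of a class = frequency of the class / n
relative_frequency = {label: count / n for label, count in frequency.items()}

for label, count in frequency.items():
    share = relative_frequency[label]
    print(f"{label:<6} frequency={count}  relative={share:.2f}  percent={share * 100:.0f}%")
```

The relative frequencies always add up to 1 (or 100%), which is a quick check on the arithmetic.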

Steps for creating a Frequency Distribution for Quantitative Data:
A. Determine the number of non-overlapping classes, called k

Rule: Use the smallest k such that $2^k \ge n$

B. Determine the width of each class

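Rule: $\text{class width} \ge \frac{\text{largest value} - \text{smallest value}}{k}$, usually rounded up to a convenient number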

C. Determine the class limits

Rule: Class limits must be chosen so that each data item belongs to one and only one class.
Use both LOWER CLASS LIMITS and UPPER CLASS LIMITS
Work Chapter 2, Example #1 on Excel
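
Steps A through C can be sketched in Python as a complement to the Excel exercise. This is a minimal sketch under stated assumptions: the data are made-up whole numbers, the function name `frequency_distribution` is hypothetical, and the class width is taken as a whole number strictly greater than (largest value − smallest value) / k so that the largest value always falls inside the last class.

```python
def frequency_distribution(data):
    """Build a frequency distribution for whole-number data.

    Returns (k, width, classes), where classes is a list of
    (lower_limit, upper_limit, frequency) tuples.
    """
    n = len(data)

    # Step A: smallest k such that 2**k >= n.
    k = 1
    while 2 ** k < n:
        k += 1

    # Step B (assumed rounding): a whole-number width strictly greater than
    # (max - min) / k, so that k classes cover every value.
    low, high = min(data), max(data)
    width = (high - low) // k + 1

    # Step C: non-overlapping lower and upper class limits (both inclusive).
    classes = []
    for i in range(k):
        lower = low + i * width
        upper = lower + width - 1
        count = sum(lower <= x <= upper for x in data)
        classes.append((lower, upper, count))
    return k, width, classes


# Hypothetical sample of 20 whole-number observations.
data = [23, 31, 18, 45, 27, 36, 29, 52, 33, 41,
        25, 38, 30, 47, 22, 35, 28, 40, 34, 26]

k, width, classes = frequency_distribution(data)
print(f"k = {k}, class width = {width}")
for lower, upper, count in classes:
    print(f"{lower}-{upper}: {count}")
```

Because each upper limit is one less than the next lower limit, every whole-number observation belongs to one and only one class.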
Using Histograms to summarize data:
8. Histogram = a bar graph in which the classes are marked on the horizontal axis
and the class frequencies on the vertical axis. The class frequencies
are represented by the heights of the bars with the bars being drawn
adjacent to each other.
SPECIAL NOTE:

One of the most important uses of a histogram is to provide information about the shape, or form, of a distribution.

Create Histogram using Chapter 2, Exercise #1 data

9. Skewness = the tendency of a histogram to be “off centered”

A. Skewed to the right = the histogram’s “tail” extends farther to the right
B. Skewed to the left = the histogram’s tail extends farther to the left
C. Symmetric = the histogram is neither skewed left nor right
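
The three shapes can be seen in a rough text histogram. The sketch below is only an illustration: the class labels and frequencies are made up, the function name `text_histogram` is hypothetical, and the chart is turned on its side (classes run top to bottom), so a tail that "extends to the right" in a normal histogram shows up as short bars at the bottom.

```python
def text_histogram(labels, frequencies):
    """Return one text line per class; the bar length equals the class frequency."""
    return [f"{label:>6} | {'*' * freq} ({freq})"
            for label, freq in zip(labels, frequencies)]


# Hypothetical class labels and frequencies for the three shapes.
labels = ["10-19", "20-29", "30-39", "40-49", "50-59"]
shapes = {
    "skewed to the right": [8, 5, 3, 2, 1],   # tail toward the higher classes
    "skewed to the left": [1, 2, 3, 5, 8],    # tail toward the lower classes
    "symmetric": [2, 5, 8, 5, 2],             # mirror image around the middle class
}

for name, freqs in shapes.items():
    print(name)
    print("\n".join(text_histogram(labels, freqs)))
    print()
```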
Summarizing Data for Two Variables Using Graphical Displays: Scatter Diagram and Trendline

10. Scatter diagram = a graphical display of the relationship between two quantitative variables

11. Trendline = a line that provides an approximation of the relationship between two variables
Types of Relationships Depicted by Scatter Diagrams

12. Positive relationship = the scatter diagram suggests an “upward” pattern. A positive relationship indicates that the two variables move in the SAME DIRECTION.

13. Negative relationship = the scatter diagram suggests a “downward” pattern. A negative relationship indicates that the two variables move in OPPOSITE DIRECTIONS.

14. No apparent relationship = the scatter diagram suggests a “random” pattern.
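
One common way to draw a trendline is a least-squares straight line; that choice is an assumption here, since the notes do not say which method is used. In the sketch below, the data are made up and the function names `trendline` and `direction` are hypothetical. The sign of the slope matches the "upward" and "downward" patterns described above; a slope near zero is only a hint of no apparent linear relationship.

```python
def trendline(xs, ys):
    """Least-squares line y = intercept + slope * x for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def direction(slope, tolerance=1e-9):
    """Describe the pattern suggested by the sign of the slope."""
    if slope > tolerance:
        return "positive relationship (upward pattern)"
    if slope < -tolerance:
        return "negative relationship (downward pattern)"
    return "no apparent linear relationship"


# Hypothetical paired data: x = hours studied, y = quiz score.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = trendline(xs, ys)
print(f"trendline: y = {intercept:.1f} + {slope:.1f}x -> {direction(slope)}")
```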
Descriptive Statistics: Using Numerical Measurements
Measures of Location
15. Mean = the average value for a variable. It provides a measure of the
central location for the data.

Two different means:

16. Population mean = the average value for all the observations from the POPULATION.

It is denoted by the Greek letter µ.

17. Sample mean = the average value for all the observations from the SAMPLE.

It is denoted by x̄ (read “x-bar”).

SPECIAL NOTE:

18. Parameter = a characteristic of a population

Remember the “P’s” go together

19. Statistic = a characteristic of a sample

Remember the “S’s” go together

FORMULAS for Mean:

Population mean: $\mu = \frac{\sum x}{N}$

where: Σ = the summation operator

N = total number of observations in the population

Sample mean: $\bar{x} = \frac{\sum x}{n}$

where: Σ = the summation operator

n = total number of observations in the sample

Work Sample Mean Problem using Excel
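
Outside Excel, the same calculation is a sum divided by a count. The sample values below are made up for illustration.

```python
# Hypothetical sample of n = 5 observations.
sample = [12, 15, 11, 18, 14]

n = len(sample)            # total number of observations in the sample
x_bar = sum(sample) / n    # sample mean = (sum of x) / n

print(f"n = {n}, sum = {sum(sample)}, sample mean = {x_bar}")
```

The population mean uses the same arithmetic, with N (every observation in the population) in place of n.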

20. Median = another measure of central location. It is the value in the middle when
the data are arranged in ascending order (smallest value to largest value).

How To Find the Median of a Data Set:

A. Arrange the data in ascending order (smallest value to largest value)

B. Determine ODD or EVEN data:

1. For an ODD number of observations, the median is the middle value

2. For an EVEN number of observations, the median is the average of the two middle values
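
The ODD/EVEN procedure above can be sketched as a short function. The name `median` and the example values are illustrative assumptions.

```python
def median(values):
    """Median by the two-step procedure: sort, then pick or average the middle."""
    ordered = sorted(values)          # Step A: ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # Step B.1: ODD count -> the middle value
        return ordered[mid]
    # Step B.2: EVEN count -> average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 3, 9, 5, 1]))   # odd number of observations
print(median([7, 3, 9, 5]))      # even number of observations
```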

21. Mode = another measure of central location. It is the value that occurs with the greatest
frequency
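
A minimal sketch of finding the mode by counting occurrences follows. The function name `modes` is hypothetical, and returning every tied value when several share the greatest frequency is an assumption that goes beyond the definition above.

```python
from collections import Counter


def modes(values):
    """Return the value(s) that occur with the greatest frequency, sorted."""
    counts = Counter(values)
    greatest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == greatest)


print(modes([3, 5, 5, 7, 9]))      # a single mode
print(modes([1, 1, 2, 2, 3]))      # two values tie for the greatest frequency
```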
Measures of Variability
22. Range = largest value – smallest value

23. Variance = a measure of variability that utilizes all the data.

Two Different Variance Formulas:

POPULATION VARIANCE: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

SAMPLE VARIANCE: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
24. Standard deviation = the positive square root of the variance

Two Different Standard Deviation Formulas:

POPULATION STANDARD DEVIATION: $\sigma = \sqrt{\sigma^2}$

SAMPLE STANDARD DEVIATION: $s = \sqrt{s^2}$

Work Range, Var. and St. Dev. Problems on Excel
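
The range, variance, and standard deviation can also be checked in Python. In this sketch the data are made up and the helper names are hypothetical. The population version divides the sum of squared deviations by N, the sample version divides by n − 1, and the standard library's `statistics` module is used only as a cross-check.

```python
import math
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


# Hypothetical data set.
data = [12, 15, 11, 18, 14]

data_range = max(data) - min(data)            # largest value - smallest value
pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = math.sqrt(pop_var)                   # positive square root of the variance
samp_sd = math.sqrt(samp_var)

print(f"range = {data_range}")
print(f"population variance = {pop_var}, population std dev = {pop_sd:.4f}")
print(f"sample variance = {samp_var}, sample std dev = {samp_sd:.4f}")

# Cross-check against the standard library.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_sd, statistics.stdev(data))
```

For the same data the sample variance is always larger than the population variance, because the same sum is divided by a smaller number.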

Measures of Relative Location
25. z-Scores = often called the standardized value. It is the number of standard
deviations a data value (x) is from the sample mean

FORMULA FOR CALCULATING A z-Score:

$z = \frac{x - \bar{x}}{s}$

where: x̄ = the sample mean and s = the sample standard deviation
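
As an illustration, z-scores for a whole sample can be computed by subtracting the sample mean from each value and dividing by the sample standard deviation. The data and the function name `z_scores` are assumptions for this sketch.

```python
import math


def z_scores(sample):
    """z = (x - sample mean) / sample standard deviation, for each x."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    return [(x - x_bar) / s for x in sample]


# Hypothetical sample.
sample = [12, 15, 11, 18, 14]

for x, z in zip(sample, z_scores(sample)):
    print(f"x = {x:>2}  z = {z:+.2f}")
```

A value equal to the sample mean has z = 0, values above the mean have positive z-scores, and values below it have negative z-scores.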
26. Chebyshev’s Theorem = enables us to make statements about the
percentage of data values that must be
within a specified number of standard
deviations of the mean

FORMULA FOR CHEBYSHEV’S THEOREM

At least $\left(1 - \frac{1}{c^2}\right)$ of the data values must be within c standard deviations of the mean, where c is any value greater than 1
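
Chebyshev's bound, 1 − 1/c², is easy to evaluate for a few values of c. The function name below is hypothetical, and the chosen values of c are just examples.

```python
def chebyshev_lower_bound(c):
    """Minimum fraction of data within c standard deviations of the mean."""
    if c <= 1:
        raise ValueError("Chebyshev's theorem requires c > 1")
    return 1 - 1 / c ** 2


for c in (1.5, 2, 3, 4):
    print(f"c = {c}: at least {chebyshev_lower_bound(c) * 100:.1f}% of the data values")
```

The bound holds for any data set, whatever the shape of its distribution, which is why it is weaker than the percentages that apply to bell-shaped data.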
27. Empirical Rule = used to determine the approximate percentage of data values that fall within a specified number of standard deviations of the mean, for data that have a symmetric, bell-shaped distribution.

FORMULA FOR EMPIRICAL RULE

A. ~68% of the data values will be within ONE standard deviation of the mean
B. ~95% of the data values will be within TWO standard deviations of the mean
C. ~99.7% (almost ALL) of the data values will be within THREE standard deviations of the mean
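
To apply the Empirical Rule, compute the intervals mean ± 1, 2, and 3 standard deviations. The sketch below assumes a bell-shaped distribution with a made-up mean of 100 and standard deviation of 15; the function name is hypothetical.

```python
def empirical_rule_intervals(mean, std_dev):
    """Map k = 1, 2, 3 to (lower, upper, approximate share of data within k std devs)."""
    approximate_share = {1: 0.68, 2: 0.95, 3: 0.997}
    return {k: (mean - k * std_dev, mean + k * std_dev, share)
            for k, share in approximate_share.items()}


# Hypothetical bell-shaped data: mean 100, standard deviation 15.
intervals = empirical_rule_intervals(100, 15)
for k, (lower, upper, share) in intervals.items():
    print(f"about {share * 100:g}% of values fall between {lower} and {upper} (within {k} std dev)")
```

These percentages are approximations that apply only to symmetric, bell-shaped data.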
END OF TEST #1 NOTES

