Understanding Measures of Central Tendency

The document provides an overview of measures of central tendency, including mean, median, and mode, explaining their definitions, calculations, and appropriate usage based on data distribution. It also discusses measures of variability such as range, interquartile range, and standard deviation, highlighting their importance in understanding data dispersion. Additionally, the document covers skewness and its implications for data interpretation.

Uploaded by

Rafid Rahman

Measures of Central Tendency

• Naima Nigar
• Assistant Professor
• Department of Psychology
• University of Dhaka
Introduction

Definition: Measures of Central Tendency
A statistic indicating the midpoint or average score in a distribution.
Purpose: Understanding the central location of data.
Mean
Mean (Arithmetic Mean)
Commonly known as the
"average."
Formula: μ = (∑X) / N
Explained: Sum of all values
divided by the number of
values.
Median
If there are an odd number of observations in a data set, then the
median can be calculated as below:
Step 1: Arrange the data either in ascending or descending order.
Step 2: If the number of observations (say n) is odd, then the
middlemost observation is the median of the given data.

If there is an even number of observations in a data set, then the
median can be calculated as below:
Step 1: Arrange the data either in ascending or descending order.
Step 2: If the number of observations (say n) is even, then identify
(n/2)th and [(n/2) + 1]th observations.
Step 3: The average of the above two observations (which are
identified in step 2) is the median of the given data.
Calculating Median

Series 1 (odd number of cases): 12, 6, 7, 5, 3
Ascending order: 3, 5, 6, 7, 12
Median: 6

Series 2 (even number of cases), ascending order: 5, 6, 7, 12
Median = (6 + 7)/2 = 6.5
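The odd and even procedures can be sketched as one small function. This is a minimal illustration, not a definitive implementation; the function name `median` is chosen here for convenience. Sorting 12, 6, 7, 5, 3 gives 3, 5, 6, 7, 12, so the middlemost observation is 6.

```python
def median(values):
    """Return the median by sorting and picking the middle observation(s)."""
    ordered = sorted(values)          # Step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the middlemost observation is the median
        return ordered[mid]
    # Even n: average the (n/2)th and (n/2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_series = [12, 6, 7, 5, 3]
even_series = [5, 6, 7, 12]
print(median(odd_series))   # 6
print(median(even_series))  # 6.5
```

Sorting first is the step most often skipped by hand; taking the middle of the unsorted list would give a different, incorrect answer.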
Mode

Most frequently occurring

value in a dataset.
Mode = Value with the
highest frequency.
When to Use Different
Measures

Choosing the Right Measure

Mean: Suitable for normally distributed
data.
Median: Useful when dealing with
outliers or skewed data.
Mode: Effective for identifying the most
common value.
Summation Notation

Summation Notation
Symbol: ∑ (Sigma)
Meaning: Represents "sum of."
Usage: ∑X implies "add all the values in X."
Conclusion

• Measures of central
tendency help describe
where data clusters.
• Mean, median, and mode
are standard measures.
• Choose the appropriate
measure based on the
data’s distribution.
The Arithmetic Mean
Definition: Arithmetic Mean
Denoted by x̄ (pronounced "X bar").
Formula: x̄ = ∑(X) / n
Explained: Sum of observations divided by the
number of observations.
When to Use
• Typically used for interval or ratio data.
• Suitable when data follows an approximately normal distribution.

Arithmetic Mean
Frequency
Distribution
Arithmetic Mean from
Frequency Distribution
Formula: x̄ = (∑fX) / n
Explanation: Multiply the
frequency of each score by
its corresponding score and
then sum.
Grouped Frequency Distribution

Mean Calculation from Grouped Frequency Distribution
Formula: x̄ = (∑fX) / n
Note: X represents the midpoint of the
class interval.
Example: Calculation using grouped data
yields a mean of 72.12.
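The formula x̄ = (∑fX) / n can be sketched for both cases: scores with frequencies, and class intervals represented by their midpoints. The data below are hypothetical and are not the data behind the 72.12 result, which is not reproduced in the text; the function names are also assumptions made for this sketch.

```python
def mean_from_frequencies(pairs):
    """Mean from (score, frequency) pairs: sum of f*X divided by n."""
    pairs = list(pairs)
    n = sum(f for _, f in pairs)              # n is the total frequency
    if n == 0:
        raise ValueError("no observations")
    return sum(x * f for x, f in pairs) / n


def mean_from_grouped(intervals):
    """Mean from ((lower, upper), frequency) pairs, using interval midpoints as X."""
    return mean_from_frequencies(
        ((lower + upper) / 2, f) for (lower, upper), f in intervals
    )


# Hypothetical frequency distribution: score 1 occurs twice, 2 three times, 3 five times
ungrouped = [(1, 2), (2, 3), (3, 5)]
print(mean_from_frequencies(ungrouped))   # 2.3

# Hypothetical grouped distribution with class intervals 60-69, 70-79, 80-89
grouped = [((60, 69), 4), ((70, 79), 10), ((80, 89), 6)]
print(mean_from_grouped(grouped))         # 75.5
```

Because every score in an interval is treated as if it sat at the midpoint, the grouped result is an approximation of the mean of the raw scores.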
• Rounding Results
• After calculation, 71.8 may be
rounded to 72.
• Similarly, 72.12 may also be
rounded to 72.
• Precision depends on context.
Precision in Measurement

Choosing the Right Statistic: The decision depends on the required precision in measurement.
Arithmetic Mean: Offers a balance between simplicity and accuracy.
Conclusion

Arithmetic Mean is a fundamental measure of central tendency.

Versatile - widely applicable to interval and ratio data.

Precision of the mean depends on rounding and context.
The Median
• Definition: The Median
• Middle score in a distribution.
• Determining the Median
• Order scores by magnitude.
• Odd number of scores: Middle score.
• Even number of scores: Average of two
middle scores.
Median Example
• Median Calculation Example
• Scores: 66, 65, 61, 59, 53, 52,
41, 36, 35, 32
• Median = (53 + 52) / 2 = 52.5
• Applicability of Median
• Suitable for ordinal, interval,
and ratio data.
Advanced Median
Calculation
• Large Data Sets
• For large datasets,
advanced methods are
used.
• Complex techniques for
finding the median.
The Mode
• Definition: The Mode
• Most frequently
occurring score.
• Example: Bruce's Word
Processing Scores
• Scores: 43, 34, 45, 51,
42, 31, 51
• Mode = 51 (most
frequent score).
Decision Making with
Median and Mode
• Personnel Decision Example
• Hiring Bruce based on the mode (51) or the
mean (about 42.4, which is below 50).
• Emphasizes the importance of context
in decision-making.
Multiple Modes
• Distributions with Multiple Modes
• Bimodal Distribution Example
• Scores: 51, 49, 51, 50, 66, 52, 53,
38, 17, 66, 33, 44, 73, 13, 21, 91,
87, 92, 47, 3
• Two modes: 51 and 66 (both
occur twice).
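Finding one or several modes amounts to counting how often each score occurs and keeping every score tied for the highest count. The following is a minimal sketch using the two score lists from the slides; the helper name `modes` is an assumption made here.

```python
from collections import Counter


def modes(scores):
    """Return every score tied for the highest frequency, in ascending order."""
    if not scores:
        raise ValueError("mode of an empty data set is undefined")
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


bruce = [43, 34, 45, 51, 42, 31, 51]
bimodal = [51, 49, 51, 50, 66, 52, 53, 38, 17, 66,
           33, 44, 73, 13, 21, 91, 87, 92, 47, 3]

print(modes(bruce))     # [51]
print(modes(bimodal))   # [51, 66]
```

Returning a list rather than a single value keeps bimodal and multimodal distributions visible instead of silently picking one mode.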
Utility of the Mode

Mode's Usefulness: Qualitative and verbal analyses.
Example: Consumer recall of commercial content.
Mode conveys information on word frequency.
Mode vs. Mean

Mode vs. Mean Example: Estimating journal articles published by clinical psychologists.
Mean indicates the average, while the mode shows the most common value.
Conclusion

Mean: Indicates the average.
Median: Represents the middle score.
Mode: Identifies the most frequent score.
Measures of Variability

Variability measures how scores in a distribution are dispersed.
Mean alone does not capture the whole story; dispersion matters.
We'll explore measures of variability and their importance.
Understanding Variability

Distributions with the same mean can have different dispersions.

Example: Test scores ranging from 0 to 100 in two distributions (A and B) with the same mean of 50.
Distribution A: Wide dispersion.
Distribution B: Narrow dispersion.

Measures of
Variability

• Statistics that describe the variation in a distribution are
referred to as measures of variability.
• Key measures include:
• The Range
• Interquartile Range
• Semi-Interquartile Range
• Average Deviation
• Standard Deviation
• Variance
The Range

Range = Difference between highest and lowest scores.
Example: Distribution B with a range of 20 (60 - 40).
Limited use: Sensitive to extreme values.
Caveat: One extreme score can skew the range.
Quartiles and
Quarters
• Quartiles divide a distribution into
four equal parts.
• Quartiles: Q1 (25th percentile), Q2
(Median), Q3 (75th percentile).
• Quartile = Specific point; Quarter =
Interval.
The Interquartile
Range (IQR)

• IQR = Difference between Q3 (75th percentile) and Q1 (25th percentile).
• More robust than the range.
• Describes the middle 50% of data.
• Less sensitive to outliers.
• Like the median, it is an ordinal
statistic.
The Semi-
Interquartile Range
• Semi-IQR = IQR divided by 2.
• Represents half of the IQR.
• Provides a measure of variability that
is less affected by outliers.
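The range, IQR, and semi-IQR can be computed together. This is a sketch on hypothetical scores, and it assumes Python's `statistics.quantiles` with the "inclusive" method; textbooks and software use several slightly different quartile conventions, so Q1 and Q3 may differ a little from hand calculations.

```python
from statistics import quantiles


def spread_summary(scores):
    """Range, quartiles, IQR, and semi-IQR for a list of at least two scores."""
    ordered = sorted(scores)
    # n=4 splits the data into quarters, returning the three cut points Q1, Q2, Q3
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "range": ordered[-1] - ordered[0],   # highest minus lowest score
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,                          # spread of the middle 50%
        "semi_iqr": iqr / 2,                 # half of the IQR
    }


# Hypothetical scores
summary = spread_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

Replacing the 9 with 90 in this list changes the range from 8 to 89 but leaves the IQR and semi-IQR untouched, which is the robustness the slides describe.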
Interpreting
Quartiles
• The median (Q2) is the
midpoint.
• Q1 and Q3 are the quarter-points
on either side of the median.
• Q1 and Q3 provide
insight into distribution
shape.
Skewness
• In a symmetrical
distribution, Q1 and Q3
are equidistant from Q2.
• Skewness affects data
interpretation.
• Skewness indicates a
lack of symmetry in the
distribution.
Conclusion

• Variability measures are crucial to understanding data dispersion.
• The range, IQR, and semi-IQR offer
valuable insights.
• Skewness can impact data
interpretation.
• Use the appropriate measure based on
your data's characteristics.
Exploring Average
Deviation in Data
Analysis
• Average Deviation (AD) Definition: A tool
to describe variability in data distribution
• Formula: AD = Σ|x| / n
• Importance: Foundation for Understanding
Standard Deviation
• Breakdown of the Formula:
• x = X - mean: Deviation of a score from the mean; |x| is its absolute value
• X: Individual score
• Mean: Mean of all scores
• n: Number of scores
• Explanation: Calculate the absolute deviation for each
score, sum, and divide by n
Calculation Example

• Example Distribution: 85 100 90 95 80

• Calculate Mean: (85 + 100 + 90 + 95 + 80) / 5 = 90
• Deviation Scores:
• |85 - 90| = 5
• |100 - 90| = 10
• |90 - 90| = 0
• |95 - 90| = 5
• |80 - 90| = 10
• Sum of Deviation Scores: 5 + 10 + 0 + 5 + 10 = 30
• Average Deviation: 30 / 5 = 6
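The worked example translates directly into a few lines of code. This is a minimal sketch; the function name `average_deviation` is chosen here for illustration.

```python
def average_deviation(scores):
    """Average of the absolute deviations of each score from the mean."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores")
    mean = sum(scores) / n
    # Absolute value drops the sign of each deviation before summing
    return sum(abs(x - mean) for x in scores) / n


scores = [85, 100, 90, 95, 80]
print(average_deviation(scores))   # 6.0
```

Without `abs`, the deviations from the mean would sum to zero for any data set, which is why the signs have to be removed (here) or squared away (in the variance).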
Interpretation of Average Deviation

Interpretation: An AD of 6 means scores vary, on average, by 6 points from the mean.
Note: AD ignores algebraic signs (positive/negative).
Significance: Provides insight into data variability.
Limitations of Average Deviation

Rarely Used: Not commonly employed in data analysis

Limitation: Deletion of algebraic signs makes it less useful

Purpose: Understanding AD helps in grasping the concept of Standard Deviation.
Conclusion

Recap: Average Deviation measures variability in data.
Importance: Lays the foundation for understanding Standard Deviation.
Transition: Let's delve into Standard Deviation in our next discussion.
Understanding the
Standard Deviation

Introduction
First, let's get acquainted with the standard deviation (SD). It is a measure that helps us understand the
variability in a given data distribution. By analyzing the standard deviation, we gain insights into how
spread out the data points are from the average.

Before we proceed further, it's important to note that the standard deviation differs from the average
deviation. The standard deviation involves squaring the deviation scores and then taking the square root.
This process gives us a measure that captures both the positive and negative deviations in the data.
Calculating the Variance
In order to better grasp the concept of standard deviation,
it's crucial to understand how variance is calculated.
Variance is characterized as the average squared deviation
from the mean.

The formula for calculating variance is:

Variance (s^2) = Σ(x - x̄)^2 / n,
• where Σ denotes the sum of the squared deviations,
• x stands for individual data points,
• x̄ represents the mean, and n signifies the number of
scores.
Hands-On Calculation (Part 1)

Now, let's put our knowledge into practice by performing a hands-on calculation of the standard deviation. We will use a practical exercise to demonstrate the process.

For this exercise, we will refer to the data provided in Table 3-1. We will calculate the standard deviation using deviation scores.
Hands-On Calculation (Part 2)

Continuing with our hands-on calculation, we will now explore an alternative method: using the raw scores formula to calculate the standard deviation.

• In both cases, the standard deviation is the square root of the variance (s^2).
• By using this method, we can obtain the same result: SD = 14.10.
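The agreement between the two methods can be checked in code. Table 3-1 is not reproduced in the text, so this sketch reuses the five scores from the average deviation example (85, 100, 90, 95, 80) and will not reproduce SD = 14.10. The raw scores formula is assumed here to be the common computational form s^2 = (ΣX^2 / n) - x̄^2.

```python
import math


def variance_from_deviations(scores):
    """Average squared deviation from the mean, with n in the denominator."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def variance_from_raw_scores(scores):
    """Assumed raw-score form: mean of the squared scores minus the squared mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(x * x for x in scores) / n - mean ** 2


scores = [85, 100, 90, 95, 80]        # not the Table 3-1 data
v_dev = variance_from_deviations(scores)
v_raw = variance_from_raw_scores(scores)
print(v_dev, v_raw)                   # 50.0 50.0
print(round(math.sqrt(v_dev), 2))     # 7.07
```

The two forms are algebraically identical; the raw-score version avoids computing each deviation by hand but is more exposed to rounding error when scores are large and the spread is small.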
Interpretation of Standard Deviation

Understanding the interpretation of standard deviation is crucial. The standard deviation serves as a measure of the spread of data points throughout a distribution.

Additionally, it's important to note that the standard deviation is closely related to variance. In fact, the standard deviation is simply the square root of variance. We will explore this relationship further in the next section.

Lastly, we will delve into the concept of positively skewed data and its impact on the standard deviation, as well as the concept of skewness.
Symbols for Standard Deviation

Various symbols commonly represent standard deviation, including s, S, SD, and σ. It's important to clarify the distinction between the symbols s and σ, which refer to sample and population standard deviations, respectively.

Additionally, there is a debate on whether to use n or n - 1 for the denominator in calculations. Lindgren's argument advocates for the use of n - 1. As for convention, it's essential to understand when to use n or n - 1 as a denominator.
Population Standard
Deviation Formula

Let's explore the formula for population standard deviation. The population standard deviation (σ) is calculated using the formula: σ = √(Σ(x - μ)^2 / N), where μ represents the population mean and N the number of scores in the population.

It's important to differentiate between the sample mean (x̄) and the population mean (μ) when calculating the population standard deviation.
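The practical effect of the denominator choice can be seen by computing both versions on the same scores. This is a sketch under stated assumptions: the `sample` flag and the function name are inventions for illustration, and the scores are the five values from the average deviation example.

```python
import math


def standard_deviation(scores, sample=False):
    """Square root of the summed squared deviations over n (or n - 1 if sample=True)."""
    n = len(scores)
    if n < 2 and sample:
        raise ValueError("the n - 1 form needs at least two scores")
    if n == 0:
        raise ValueError("no scores")
    mean = sum(scores) / n
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    denominator = n - 1 if sample else n
    return math.sqrt(sum_of_squares / denominator)


scores = [85, 100, 90, 95, 80]
print(round(standard_deviation(scores), 3))               # 7.071 (denominator n)
print(round(standard_deviation(scores, sample=True), 3))  # 7.906 (denominator n - 1)
```

The n - 1 version is always slightly larger, and the gap shrinks as the number of scores grows.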
Benefits of Standard Deviation

The standard deviation offers several significant benefits. Firstly, it measures the variation in a given dataset, allowing us to better understand the data's dispersion.

Moreover, the standard deviation takes into account the distance of each data point from the mean, making it an important tool in statistics, research, and data analysis.

Psychology research and measurement commonly make use of the standard deviation due to its relevance and importance in understanding data variability.
Conclusion
In conclusion, the standard deviation is a powerful tool in
statistics that helps us measure the variability and spread of
data in a given distribution. We can effectively analyze and
interpret data by fully understanding the standard deviation.

The next section will explore additional related concepts, expanding our knowledge beyond standard deviation. Stay tuned!
Skewness and Kurtosis
Understanding Skewness in Distributions
Distributions can be characterized by their skewness.
Skewness indicates the nature and extent of asymmetry in a distribution.
It helps us understand how measurements are distributed within a dataset.
Positive Skewness
A distribution with positive skew means relatively few scores are at the high end.
Positively skewed exam results suggest the test was too difficult.
More easy items are needed to discriminate better at the lower end of the scores.
Negative Skewness
A distribution with negative skew means relatively few scores are at the low end.
Negatively skewed exam results suggest the test was too easy.
More difficult items are needed to discriminate better at the upper end of the scores.
Skewed vs. Abnormal
Skewed doesn't necessarily mean abnormal.
It's a way to describe the symmetry of a distribution.
Consider the example of the Marine Corps Ability and Endurance Screening Test.
Marine Corps Test
Example
Which graph represents the Marine
Corps Ability and Endurance Screening
Test?
Positively skewed distribution (Graph C)
might fit because few would score high.
The Nature of Skewness

Is skewness a good thing? A bad thing? Abnormal?
In truth, skewness is just a characteristic; it's not inherently good or bad.
Measuring
Skewness
Various formulas exist for
measuring skewness.
One way is to examine the
distances of quartiles from the
median.
Positive skew: Q3 - Q2 > Q2 -
Q1. Negative skew: Q3 - Q2 <
Q2 - Q1.
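The quartile comparison can be expressed as a short check. This is an illustrative sketch only: the function name and the quartile values are hypothetical, and the rule reports the direction of skew, not its size.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew by comparing the distances of Q3 and Q1 from the median (Q2)."""
    upper = q3 - q2   # distance from the median up to Q3
    lower = q2 - q1   # distance from Q1 up to the median
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "symmetrical"


# Hypothetical quartile values
print(quartile_skew(40, 50, 70))   # positive
print(quartile_skew(30, 50, 60))   # negative
print(quartile_skew(40, 50, 60))   # symmetrical
```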
Symmetrical
Distribution
In a symmetrical
distribution, distances
from Q1 and Q3 to the
median are the same.
Conclusion
Skewness is a valuable tool for understanding
the symmetry of distributions.
It helps interpret test results, data, and various
scenarios.
Remember, skewness is descriptive, not
inherently good or bad.
Understanding Kurtosis in Distributions
Kurtosis measures the steepness of a distribution in its center.
Prefixes like platy-, lepto-, or meso- describe the peakedness or flatness.
Distributions are classified as platykurtic, leptokurtic, or mesokurtic.
Platykurtic Distributions
Platykurtic distributions are relatively flat.
These distributions have less pronounced peaks in the center.
Leptokurtic Distributions
Leptokurtic distributions are relatively peaked.
They have more pronounced and sharper peaks in the center.
Mesokurtic Distributions
Mesokurtic distributions fall somewhere in the middle.
They have moderate central steepness and peakedness.
Measuring Kurtosis
Various methods exist for measuring kurtosis.
Some computer programs use an index of kurtosis ranging from -3 to +3.
Controversies in Kurtosis
The measurement and interpretation of kurtosis can be controversial.
Opinions on technical matters related to kurtosis differ among measurement specialists.
The Normal Curve: History
• Development of the concept of a normal curve began in the
middle of the eighteenth century with the work of Abraham
DeMoivre.
• Later, the Marquis de Laplace took up the work, and Karl Friedrich Gauss made
some substantial contributions at the beginning of the
nineteenth century.
• Through the early nineteenth century, scientists called it the
“Laplace-Gaussian curve.”
• Karl Pearson is credited with being the first to refer to the
curve as the normal curve, perhaps to be diplomatic to all of
the people who helped develop it.
Properties of the Normal
Curve
• The normal curve is perfectly symmetrical.
• The mean, median, and mode all have the same value because the curve is
perfectly symmetrical.
• The normal curve can be divided into different areas defined by standard
deviation units.
• In theory, the normal curve's distribution ranges from negative to positive
infinity.
• The curve is highest at its center, tapering on both sides approaching the X-
axis asymptotically.
• A normal curve has two tails. The area on the normal curve between 2
and 3 standard deviations above the mean is called a tail. The area
between 2 and 3 standard deviations below the mean (that is, between -2
and -3) is also called a tail.
The Area under the Normal Curve
• The normal curve can be
conveniently divided into areas
defined in standard deviation
units.
• A hypothetical distribution of National Spelling Test scores with a mean of 50 and a standard deviation of 15 is illustrated in the accompanying figure.
• The graph tells us that 99.74% of all scores in these normally distributed spelling-test data lie within 3 standard deviations above and below the mean.
Areas under the Normal Curve

50% - Above & Below the Mean
50% of the scores occur above the mean and 50% of the scores occur below the mean.

68% - Within 1 Standard Deviation
Approximately 68% of all scores occur within 1 standard deviation of the mean (about 34% on each side).

95% - Within 2 Standard Deviations
Approximately 95% of all scores occur within 2 standard deviations of the mean.

99.74% - Within 3 Standard Deviations
99.74% of all scores occur between 3 standard deviations above and below the mean.
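These areas can be reproduced numerically. The sketch below assumes Python's `statistics.NormalDist` and uses the spelling-test parameters (mean 50, standard deviation 15); the helper name is hypothetical. The computed area within 3 standard deviations is closer to 99.73%, so the 99.74% figure appears to reflect rounding in tabled values.

```python
from statistics import NormalDist


def percent_within(k, mean=0.0, sd=1.0):
    """Percentage of a normal distribution lying within k SDs of the mean."""
    dist = NormalDist(mean, sd)
    return 100 * (dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd))


for k in (1, 2, 3):
    low, high = 50 - k * 15, 50 + k * 15   # spelling-test score limits
    print(f"within {k} SD ({low} to {high}): {percent_within(k, 50, 15):.2f}%")
```

The same percentages apply to any normal distribution, whatever its mean and standard deviation, because the areas depend only on distance from the mean in standard deviation units.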
Tale of the Tails

1 Deviance from the Norm
Intelligence test scores that fall within the limits of either tail can have momentous consequences in terms of the tale of one’s life.

2 Mental Ability
Approximately two standard deviations from the mean is one key element in the identification of intellectual disability or giftedness.

3 Adjustments Needed
Out-of-sync individuals require substantial adjustments in parental expectations, educational settings, and social and leisure activities.
Implications of the Normal Curve

Useful Interpretation
Knowledge of the areas under the normal curve can be quite useful to the interpreter of test data.

Conveying Information
Knowledge of the areas under the normal curve can convey useful information about a test score in relation to other test scores.

Standard Scores
Standard scores provide information about how impressive, average, or lackluster an individual is with respect to a particular discipline or ability.
Standard Scores
• Definition: A standard score is a transformed raw score that
is converted from one scale to another.

• Purpose: Standard scores provide a common scale with a
set mean and standard deviation, making scores easily
interpretable.
Importance and Benefits of
Standard Scores
• Importance
• Raw scores can be difficult to interpret on their own.
• Standard scores provide a precise reference point.
• Allow comparison of an individual's performance to
others.
• Benefits
• Easy Interpretation: Standard scores are more easily
interpretable than raw scores.
• Relative Position: They show a test-taker's performance
relative to others.
• Universal Application: Useful in various fields, such as
education and psychology.
Z Scores

• Z scores are a way of measuring how far away a raw score is from the mean of a
distribution. They tell us how many standard deviations a particular score is
above or below the mean. In effect, a z score locates a raw score relative to the
rest of the data.
• Mean set at 0, standard deviation set at 1.
• Raw scores converted to z scores on this scale.
A z score is calculated as the difference between a raw score and the mean, divided by the standard
deviation.
It provides context and meaning to a score, allowing comparison with others.

Knowing a z score can tell you the percentage of test-takers who scored higher.

Raw scores lack context and are not as informative as z scores.

Z scores help compare scores on different tests, providing a common context.

Example: Crystal's raw scores for reading and arithmetic were 24 and 42, respectively.

Her z scores reveal that she performed above average in reading (z = 1.32) and below average in
arithmetic (z = -0.75).
Reference to normal curve tables can provide more detailed insights into performance relative to the
population.
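The z score calculation, and the normal-curve lookup that a table would provide, can be sketched as follows. The means and standard deviations of Crystal's tests are not given in the text, so the raw-score example uses hypothetical values and does not reproduce z = 1.32; the percentages assume the scores are normally distributed.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Difference between a raw score and the mean, in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


def percent_scoring_higher(z):
    """Percentage of a normal distribution above the given z score."""
    return 100 * (1 - NormalDist().cdf(z))


# Hypothetical test with mean 20 and SD 5
print(z_score(24, 20, 5))                        # 0.8

# The z scores reported for Crystal
print(round(percent_scoring_higher(1.32), 1))    # reading
print(round(percent_scoring_higher(-0.75), 1))   # arithmetic
```

Under the normality assumption, roughly 9% of test-takers would score above z = 1.32 and roughly 77% above z = -0.75.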
T Scores
• T scores are another type of standard score that is commonly used. Z
scores are computed on a "zero plus or minus one scale."
• T scores are computed on a "fifty plus or minus ten scale."
• T scores have a mean of 50 and a standard deviation of 10.
• T scores were devised by W. A. McCall and named in honor of E. L.
Thorndike.
• T scores range from 5 standard deviations below the mean to 5 standard
deviations above the mean.
• A raw score at -5 standard deviations is a T score of 0, the mean is a T
score of 50, and at +5 standard deviations is a T score of 100.
• T scores have the advantage of not being negative, unlike z scores.
• Z scores can be both positive and negative, making calculations more
clumsy in some cases.
• This system makes working with and interpreting scores easier,
especially when negative values aren't helpful.
• We can compare performances across a wide range of data points with T
scores.
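Since a T score is a z score moved onto the "fifty plus or minus ten" scale, the conversion is a single line: T = 50 + 10z. The sketch below is illustrative and the function name is an assumption.

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, standard deviation 10)."""
    return 50 + 10 * z


print(t_score(-5))   # 0
print(t_score(0))    # 50
print(t_score(5))    # 100
```

Any z score between -5 and +5 lands between 0 and 100, which is why T scores avoid negative values in practice.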
Other Standard Scores
• Various standard scoring systems exist, including
stanines, SAT/GRE scores, and IQ scores.
• Stanines have a mean of 5 and a standard
deviation of approximately 2, divided into nine
units from 1 to 9.
• The 5th stanine represents average performance,
covering the middle 20% of scores in a normal
distribution.
• SAT and GRE scores have a mean of 500 and a
standard deviation of 100.
• Deviation IQ scores have a mean of 100 and a
standard deviation of 15, with typical scores
ranging from 70 to 130.
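Each of these systems can be reached from a z score. The sketch below uses a general linear rescaling for the deviation IQ and SAT/GRE-style scales. For stanines it assumes a common approximation that is not stated in the text: round 2z + 5 to the nearest whole number and limit the result to 1 through 9.

```python
import math


def rescale(z, mean, sd):
    """Linear transformation of a z score to a scale with the given mean and SD."""
    return mean + sd * z


def stanine(z):
    """Approximate stanine: nearest whole number to 2z + 5, limited to 1..9 (assumed rule)."""
    return min(9, max(1, math.floor(2 * z + 5.5)))


z = 1.0   # one standard deviation above the mean
print(rescale(z, 100, 15))    # deviation IQ scale: 115.0
print(rescale(z, 500, 100))   # SAT/GRE-style scale: 600.0
print(stanine(z))             # 7
print(stanine(0.0))           # 5
```

Under this rule the 5th stanine spans z scores from -0.25 up to 0.25, which covers about 20% of a normal distribution and matches the description of the middle stanine.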
Linear and Nonlinear Transformations

• Different standard scoring systems may involve linear or nonlinear transformations of raw scores.

• Linear transformations maintain a direct numerical relationship to the original raw score.

• Nonlinear transformations are used when data are not normally distributed, and the resulting standard score doesn't have a direct numerical relationship to the original raw score.

Understanding Correlation

Correlation is a fundamental concept in

statistics used to determine how variables
are related and to what extent they are
related. It's essential to understand how to
measure correlation if you want to make
meaningful inferences from data.
Types of
Correlation
• Positive Correlation
• Occurs when two variables
increase or decrease together.

• Negative Correlation
• Happens when one variable
increases as the other decreases.

• Zero Correlation
• Means there is no relationship
between the variables.
Perfect Correlation: The Ideal
Scenario
A perfect correlation is either -1 or 1, which
indicates a flawless relationship between the
variables. In practice, perfect correlations are rare,
as are correlations of exactly zero.
Correlation vs. Causation
Correlation: Indicates a relationship between variables.
Causation: Implies that one variable causes the other.

Example
If you were told, for example, that from birth to age 9
there is a high positive correlation between hat size
and spelling ability, would it be appropriate to
conclude that hat size causes spelling ability?
The Potential for Prediction

Regression Analysis: Uses correlation to predict future values.
Machine Learning: Uses correlation to train prediction models.
The Pearson r
The Pearson r is a widely used technique for measuring correlation between variables. It is also known as the Pearson correlation coefficient and the Pearson product-moment coefficient of correlation.

Linear Relationships
It is most suitable when the relationship between variables is linear.

Continuous Variables
It works best with continuous variables rather than with categorical or ordinal variables.
Understanding the Formula

Standard Scores
To calculate the Pearson r, we convert raw scores to standard scores and multiply them. This helps us analyze the relative position of each score within the distribution.

Formula Variations
Here N represents the number of paired scores; ∑XY is the sum of the product of the paired X and Y scores; ∑X is the sum of the X scores; ∑Y is the sum of the Y scores; ∑X^2 is the sum of the squared X scores; and ∑Y^2 is the sum of the squared Y scores. Similar results are obtained with the use of each formula.
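Both routes to the Pearson r can be sketched side by side. The raw-score formula itself is not printed in the text; the second function assumes the standard computational form r = (N∑XY - ∑X∑Y) / √([N∑X^2 - (∑X)^2][N∑Y^2 - (∑Y)^2]), which uses exactly the quantities defined above. The paired scores are hypothetical.

```python
import math


def pearson_r_standard_scores(xs, ys):
    """Mean of the products of paired z scores (SDs computed with N)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    products = (((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys))
    return sum(products) / n


def pearson_r_raw_scores(xs, ys):
    """Assumed raw-score form built from N, sum XY, sum X, sum Y, sum X^2, sum Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired scores (both variables must vary, or the SD is zero)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r_standard_scores(x, y), 4))   # 0.7746
print(round(pearson_r_raw_scores(x, y), 4))        # 0.7746
```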
Interpreting Statistical Significance

1 Tables of Significance
To determine the statistical significance of the Pearson r, we consult tables. These tables help us understand the probability of the correlation occurring by chance alone.

2 Level of Significance
A Pearson r value can be considered statistically significant at various levels, such as .01 or .05. These levels indicate the likelihood of the correlation occurring by chance.

3 Interpreting Results
Statistical significance at the .01 level is a more rigorous standard, while significance at the .05 level provides a less rigorous basis for concluding a correlation exists. Significance describes how unlikely the result is under chance alone, not how strong the correlation is.
The Coefficient of Determination

1 Explaining r^2
To understand the percentage of variance the variables share, we calculate the coefficient of determination, r^2. This involves squaring the correlation coefficient and multiplying by 100.

2 Interpretation of r^2
If r is .9, then r^2 would be .81. The variables share 81% of the variance, while the remaining 19% could be due to chance or unmeasured factors.
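The arithmetic is small enough to show in full. This is a minimal sketch; the function name is chosen here for illustration.

```python
def percent_shared_variance(r):
    """Coefficient of determination expressed as a percentage: r squared times 100."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    return r * r * 100


shared = percent_shared_variance(0.9)
print(f"shared: {shared:.0f}%, unexplained: {100 - shared:.0f}%")   # shared: 81%, unexplained: 19%
```

Because r is squared, a correlation of -.9 accounts for the same 81% as a correlation of +.9, and a correlation of .5 accounts for only 25%.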
Unraveling the Terminology
1 The "Product-Moment" Connection
In psychometrics, a "moment" refers to a deviation about the mean of a distribution. The Pearson r involves multiplying corresponding standard scores, which are the first moments of a distribution. Hence the term "product-moment correlation."

2 Adding Depth to Application
Understanding the terminology behind the Pearson r adds an interesting layer of knowledge to its application. It sheds light on the mathematical foundations of this correlation coefficient.
Applications in Research

1 Data Analysis
Researchers use the Pearson r to analyze and
interpret data in various fields of study.

2 Correlation Studies
It is commonly employed in correlation studies
to determine the strength and direction of
relationships.

3 Population Studies
The Pearson r helps researchers uncover
patterns and associations within populations of
interest.
Conclusion

• The Pearson r is a valuable tool for

measuring correlation, particularly in cases
of linear relationships between continuous
variables.
• Its calculation involves analyzing standard
scores and interpreting statistical
significance.
• Additionally, the coefficient of
determination provides insights into the
shared variance between variables.
• Understanding the terminology behind the
Pearson r enhances our appreciation of its
significance and application.
The Spearman Rho

• The Spearman Rho, also known as Spearman's rank-order correlation coefficient, is an alternative statistic to the Pearson r for measuring correlation.
• Charles Spearman, a British psychologist, developed this coefficient.
• Spearman's rho is commonly used when dealing with small sample sizes (fewer than 30 pairs of measurements) and when both sets of measurements are in ordinal or rank-order form.
• Special tables are used to assess the significance of the obtained rho coefficient in Spearman's rank-order correlation.
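The text does not give a formula for rho. A common one, assumed here, is rho = 1 - 6∑d^2 / (n(n^2 - 1)), where d is the difference between the two ranks of each pair; it holds when there are no tied ranks. The ranks below are hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho from two sets of untied ranks (assumed d-squared formula)."""
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rank lists of equal length, at least 2 pairs")
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks given to five people by two judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(round(spearman_rho(judge_a, judge_b), 2))   # 0.8
```

Identical rankings give rho = 1 and exactly reversed rankings give rho = -1, mirroring the range of the Pearson r.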
Graphic Representations of
Correlation
• Many names, such as bivariate distribution, a scatter diagram, a
scattergram, and a scatterplot refer to graphic representations of
correlation.
• A scatterplot is a simple graphing of the coordinate points for values of
the X-variable (placed along the graph’s horizontal axis) and the Y-
variable (placed along the graph’s vertical axis).
• Scatterplots are helpful because they provide a quick indication of the
direction and magnitude of the relationship, if any, between the two
variables.
• Scatterplots help reveal the presence of curvilinearity in a relationship.
• As you may have guessed, curvilinearity in this context refers to an
“eyeball gauge” of how curved a graph is.
Regression
Regression: In statistics, regression refers to analyzing relationships
between variables to understand how one variable can predict another.
It is often used to model the relationship between a predictor variable
(X) and an outcome variable (Y), resulting in a regression line equation.

Simple Regression: Simple regression involves one predictor variable (X) and one outcome variable (Y). It results in an equation for a regression line of best fit on a scatterplot of X and Y.

Regression Line Equation: The equation for a regression line is typically represented as Y = a + bX, where a is the intercept (the Y-axis crossing point), and b is the slope of the line. The line is fitted to minimize the sum of squared vertical distances between the data points and the line.

Regression Coefficients: The coefficients a and b are calculated through algebraic methods. 'a' represents the intercept, and 'b' represents the slope. They determine the precise positioning of the regression line on the scatterplot.
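The algebra for a and b is not spelled out in the text. A minimal sketch, assuming the usual least-squares solution b = ∑(X - x̄)(Y - ȳ) / ∑(X - x̄)^2 and a = ȳ - b·x̄, applied to hypothetical paired scores:

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired scores")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary to define a slope")
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept: the line passes through (mean_x, mean_y)
    return a, b


# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(round(a, 2), round(b, 2))   # 2.2 0.6
```

The fitted line always passes through the point (x̄, ȳ), which is a quick sanity check on any hand calculation.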
Predictive Use: Regression equations are commonly used for predicting one variable (Y) based on another (X). For example, they can be used to predict a student's GPA (Y) based on their entrance exam score (X).

Prediction Process: To predict Y using the regression line, you plug a specific X value into the equation. This allows for predicting Y based on the value of X. For example, an entrance exam score of 50 may predict a GPA of 2.3, while a score of 85 may predict a GPA of 3.7.

Error in Prediction: Despite the predictions made by the regression line, individual data points may deviate from the predictions. This is the error in prediction. The standard error of the estimate quantifies this error in predicting Y from X.

Correlation and Prediction Accuracy: The accuracy of predictions is influenced by the correlation between X and Y. A higher correlation indicates greater accuracy and a smaller standard error of the estimate, meaning the regression line is a better predictor of Y based on X.
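Prediction and its error can be sketched together. The two predictions quoted (50 → 2.3 and 85 → 3.7) both lie on the regression line, so they imply a slope of 0.04 and an intercept of 0.3. The standard error of the estimate is computed on separate hypothetical data and assumes the form √(∑(Y - Y')^2 / (N - 2)); some texts divide by N instead.

```python
import math


def predict(a, b, x):
    """Predicted Y for a given X on the line Y = a + bX."""
    return a + b * x


def standard_error_of_estimate(xs, ys):
    """Typical size of prediction errors around the least-squares line (N - 2 form)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three paired scores")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    squared_errors = sum((y - predict(a, b, x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared_errors / (n - 2))


# Line implied by the GPA example in the text: a = 0.3, b = 0.04
print(round(predict(0.3, 0.04, 50), 1))   # 2.3
print(round(predict(0.3, 0.04, 85), 1))   # 3.7

# Hypothetical paired scores for the error calculation
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(standard_error_of_estimate(x, y), 3))   # 0.894
```

If every point fell exactly on the line, the standard error of the estimate would be zero, which corresponds to a perfect correlation.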
Multiple Regression

Multiple Regression: Multiple regression is used when the prediction of a variable (Y),
such as GPA, is expected to be improved by using multiple predictor variables
simultaneously. This approach takes into account the intercorrelations among all the
predictor variables.
Multiple Regression Equation: The multiple regression equation considers the
correlations between the predictor scores and the variable being predicted. It assigns
weights to each predictor, and predictors with higher correlations with the predicted
variable are given more weight, leading to larger regression coefficients (b-values).
Weighted Predictors: Predictors that strongly correlate with the variable being predicted
are given more weight in the multiple regression equation, as they are expected to
contribute more to the prediction.
Correlations Among Predictors: The multiple regression
equation also accounts for correlations among the predictor
variables themselves. If predictors are highly correlated with
each other, they provide partly redundant information, and
their weights are adjusted to reflect only what each predictor
adds beyond the others.
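
For the special case of two predictors, the weighting can be sketched directly from the three correlations involved. The sketch uses the standard formulas for standardized regression weights (beta weights), which is a narrower case than the general equation described above; the function name and the correlation values are hypothetical.

```python
def beta_weights(r_y1, r_y2, r_12):
    """Standardized weights for two predictors.

    r_y1, r_y2: correlation of each predictor with the predicted variable.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly correlated; weights are undefined")
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Uncorrelated predictors: each weight equals its correlation with Y.
print(beta_weights(0.6, 0.3, 0.0))

# Correlated predictors: the weaker predictor's weight shrinks because
# part of what it offers is already supplied by the stronger one.
print(beta_weights(0.6, 0.3, 0.5))
```

In the second call the weaker predictor ends up with a weight of essentially zero: once the stronger predictor is in the equation, it adds nothing further.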

Efficiency Considerations: When using multiple predictors, it's
essential to consider their value in enhancing the prediction. If
two predictors provide similar information, it may be more
efficient to use only one to avoid redundancy.
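
One way to judge whether a second predictor is worth keeping is to compare the proportion of variance in Y explained with and without it. The sketch below assumes the standard two-predictor formula for R squared written in terms of correlations; the function name and the correlation values are hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Proportion of variance in Y explained by two predictors together."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly correlated")
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12
    return numerator / (1.0 - r_12 ** 2)


r_y1, r_y2 = 0.6, 0.5          # hypothetical correlations with Y
one_predictor = r_y1 ** 2      # variance explained by predictor 1 alone

for r_12 in (0.1, 0.8):        # nearly independent vs. highly redundant
    both = r_squared_two_predictors(r_y1, r_y2, r_12)
    gain = both - one_predictor
    print(f"r_12 = {r_12}: R^2 with both = {both:.3f}, gain = {gain:.3f}")
```

When the two predictors are nearly independent, adding the second one raises the explained variance noticeably; when they correlate at 0.8, the gain is close to zero, which is the situation where using only one predictor is more efficient.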

Practical Applications: Knowledge of correlation, regression,
and related statistical tools can be valuable in various fields,
including unexpected ones like professional sports. For
instance, the use of regression equations may be beneficial to
an NBA team.
