Pearson Correlation in SPSS Guide

This document provides a comprehensive guide on conducting Pearson's Product-Moment Correlation using SPSS Statistics, detailing its purpose, assumptions, and procedures. It emphasizes the importance of meeting specific assumptions for valid results, including the measurement level of variables, linear relationships, absence of outliers, and normal distribution. The guide also includes an example and instructions for interpreting the output from SPSS after performing the correlation analysis.

Uploaded by

Ashita Savsani
Copyright
© All Rights Reserved

Pearson's Product-Moment Correlation using SPSS Statistics
Introduction
The Pearson product-moment correlation coefficient (Pearson’s correlation, for short) is a
measure of the strength and direction of association that exists between two variables measured
on at least an interval scale. For example, you could use a Pearson’s correlation to understand
whether there is an association between exam performance and time spent revising; whether
there is an association between depression and length of unemployment; and so forth. A
Pearson’s correlation attempts to draw a line of best fit through the data of two variables, and the
Pearson correlation coefficient, r, indicates how far away all these data points are from this line of
best fit (i.e., how well the data points fit this new model/line of best fit). You can learn more
here, which we recommend if you are not familiar with this test. If one of your variables is
dichotomous, you can use a point-biserial correlation instead, or, if you have one or more control
variables, you can run a partial correlation.
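
As a rough illustration of what the coefficient measures, the sketch below computes r directly from its textbook definition: the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the revision/exam figures are hypothetical, chosen only for this example; they are not taken from the guide or from SPSS Statistics.

```python
import math

def pearson_r(x, y):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours spent revising and exam score for five students
revision_hours = [2, 4, 6, 8, 10]
exam_score = [50, 55, 65, 70, 80]

r = pearson_r(revision_hours, exam_score)
print(f"r = {r:.3f}")
```

A value close to +1 or -1 means the points lie close to a straight line; a value near 0 means they do not.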

This "quick start" guide shows you how to carry out a Pearson's correlation using SPSS
Statistics, as well as interpret and report the results from this test. However, before we introduce
you to this procedure, you need to understand the different assumptions that your data must meet
in order for a Pearson's correlation to give you a valid result. We discuss these assumptions next.

Assumptions
When you choose to analyse your data using Pearson’s correlation, part of the process involves
checking to make sure that the data you want to analyse can actually be analysed using Pearson’s
correlation. You need to do this because it is only appropriate to use Pearson’s correlation if your
data "passes" four assumptions that are required for Pearson’s correlation to give you a valid
result. In practice, checking for these four assumptions just adds a little bit more time to your
analysis, requiring you to click a few more buttons in SPSS Statistics when performing your
analysis, as well as think a little bit more about your data, but it is not a difficult task.

Before we introduce you to these four assumptions, do not be surprised if, when analysing your
own data using SPSS Statistics, one or more of these assumptions is violated (i.e., is not met).
This is not uncommon when working with real-world data rather than textbook examples, which
often only show you how to carry out Pearson’s correlation when everything goes well!
However, don’t worry. Even when your data fails certain assumptions, there is often a solution to
overcome this. First, let’s take a look at these four assumptions:

• Assumption #1: Your two variables should be measured at the interval or ratio level
(i.e., they are continuous). Examples of variables that meet this criterion include revision
time (measured in hours), intelligence (measured using IQ score), exam performance
(measured from 0 to 100), weight (measured in kg), and so forth. You can learn more
about interval and ratio variables in our article: Types of Variable.
• Assumption #2: There needs to be a linear relationship between the two variables.
Whilst there are a number of ways to check whether a linear relationship exists between
your two variables, we suggest creating a scatterplot using SPSS Statistics, where you
can plot one variable against the other, and then visually
inspect the scatterplot to check for linearity. Your scatterplot may look something like
one of the following:

If the relationship displayed in your scatterplot is not linear, you will have to either run a
non-parametric equivalent to Pearson’s correlation or transform your data, which you can
do using SPSS Statistics. In our enhanced guides, we show you how to: (a) create a
scatterplot to check for linearity when carrying out Pearson’s correlation using SPSS
Statistics; (b) interpret different scatterplot results; and (c) transform your data using
SPSS Statistics if there is not a linear relationship between your two variables.

• Assumption #3: There should be no significant outliers. Outliers are simply single data
points within your data that do not follow the usual pattern (e.g., in a study of 100
students’ IQ scores, where the mean score was 108 with only a small variation between
students, one student had a score of 156, which is very unusual, and may even put her in
the top 1% of IQ scores globally). The following scatterplots highlight the potential
impact of outliers:
Pearson’s r is sensitive to outliers, which can have a very large effect on the line of best
fit and the Pearson correlation coefficient, leading to very different conclusions regarding
your data. Therefore, it is best if there are no outliers or they are kept to a minimum.
Fortunately, when using SPSS Statistics to run Pearson’s correlation on your data, you
can easily include criteria to help you detect possible outliers. In our enhanced Pearson’s
correlation guide, we: (a) show you how to detect outliers using "casewise diagnostics",
which is a simple process when using SPSS Statistics; and (b) discuss some of the
options you have in order to deal with outliers.

• Assumption #4: Your variables should be approximately normally distributed. In
order to assess the statistical significance of the Pearson correlation, you need to have
bivariate normality, but this assumption is difficult to assess, so a simpler method is more
commonly used. This is known as the Shapiro-Wilk test of normality, which is easy to run
using SPSS Statistics. In addition to showing you how to do this in our enhanced
Pearson’s correlation guide, we also explain what you can do if your data fails this
assumption.
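
To see why the linearity assumption (#2) matters, the following minimal sketch compares a perfectly linear relationship with a perfectly curved one. The helper pearson_r and both data sets are hypothetical and exist only for this illustration. The point is that r summarises linear association only, so a strong non-linear relationship can still produce a coefficient of zero.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # points lie exactly on a straight line
y_curved = [v ** 2 for v in x]      # points lie exactly on a parabola

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(f"linear relationship: r = {r_linear:.3f}")
print(f"curved relationship: r = {r_curved:.3f}")
```

Here the curved data are perfectly predictable from x, yet r does not reflect that, which is why inspecting a scatterplot before running the test is worthwhile.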

You can check assumptions #2, #3 and #4 using SPSS Statistics. We suggest testing these
assumptions in this order because it represents an order where, if a violation of the assumption is
not correctable, you will no longer be able to use Pearson’s correlation (although you may be
able to run another statistical test on your data instead). Just remember that if you do not run the
statistical tests on these assumptions correctly, the results you get when running a Pearson's
correlation might not be valid. This is why we dedicate a number of sections of our enhanced
Pearson's correlation guide to help you get this right. You can find out about our enhanced
content as a whole here, or more specifically, learn how we help with testing assumptions here.
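
The sensitivity to outliers described under assumption #3 can be sketched with a small, made-up data set. Everything below (the helper pearson_r, the eight base points and the single extreme point) is a hypothetical illustration, not output from SPSS Statistics.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Eight hypothetical points with a clear upward trend
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]
r_without = pearson_r(x, y)

# The same points plus one extreme case that does not follow the pattern
x_out = x + [20]
y_out = y + [0]
r_with = pearson_r(x_out, y_out)

print(f"without outlier: r = {r_without:.3f}")
print(f"with outlier:    r = {r_with:.3f}")
```

In this example a single unusual case is enough to turn a strong positive coefficient into a weak negative one.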

In the section, Procedure, we illustrate the SPSS Statistics procedure to perform a Pearson’s
correlation assuming that no assumptions have been violated. First, we set out the example we
use to explain the Pearson’s correlation procedure in SPSS Statistics.
Example
A researcher wants to know whether a person's height is related to how well they perform in a
long jump. The researcher recruited untrained individuals from the general population, measured
their height and had them perform a long jump. The researcher then investigated whether there is
an association between height and long jump performance.

Setup in SPSS Statistics


In SPSS Statistics, we created two variables so that we could enter our data: Height (i.e., the
person's height) and Jump_Dist (i.e., long jump distance). In our enhanced Pearson's correlation
guide, we show you how to correctly enter data in SPSS Statistics to run a Pearson's correlation.
You can learn about our enhanced data setup content here. Alternatively, we have a generic, "quick
start" guide to show you how to enter data into SPSS Statistics, available here.

Test Procedure in SPSS Statistics


The six steps below show you how to analyse your data using Pearson’s correlation in SPSS
Statistics when none of the four assumptions in the previous section, Assumptions, have been
violated. At the end of these six steps, we show you how to interpret the results from this test. If
you are looking for help to make sure your data meets assumptions #2, #3 and #4, which are
required when using Pearson’s correlations, and can be tested using SPSS Statistics, you can
learn more about our enhanced guides here.

• Click Analyze > Correlate > Bivariate... on the menu system as shown below:
Published with written permission from SPSS Statistics, IBM Corporation.

You will be presented with the following screen:


Published with written permission from SPSS Statistics, IBM Corporation.

• Transfer the variables Height and Jump_Dist into the Variables: box by dragging-and-dropping
or by clicking the arrow button. You will end up with a screen similar to the one
below:
Published with written permission from SPSS Statistics, IBM Corporation.

Note: If your study involves calculating more than one correlation and you want to carry
out these correlations at the same time, we show you how to do this in our enhanced
Pearson’s correlation guide. We also show you how to write up the results from multiple
correlations.

• Make sure that the Pearson tickbox is checked under the -Correlation Coefficients- area
(although it is selected by default in SPSS Statistics).
• Click the Options... button. If you wish to generate some descriptives, you can do it here
by clicking on the relevant tickbox under the -Statistics- area.
Published with written permission from SPSS Statistics, IBM Corporation.

• Click the Continue button.

• Click the OK button.

Output
SPSS Statistics generates quite a few tables for a Pearson’s correlation, but only one for the main
Pearson’s correlation procedure that you ran in the previous section. If your data passed
assumptions #2 (linear relationship), #3 (no outliers) and #4 (normality), which we explained
earlier in the Assumptions section, you will only need to interpret this one table. However, since
you should have tested your data for these assumptions, you will also need to interpret the SPSS
Statistics output that was produced when you tested for them (i.e., you will have to interpret (a)
the scatterplot you used to check for a linear relationship between your two variables, (b) the
"casewise diagnostics" table that highlights if you had any significant outliers, and (c) the output
SPSS Statistics produces for your Shapiro-Wilk test of normality). If you do not know how to do
this, we show you in our enhanced Pearson’s correlation guide. Remember that if your data
failed any of these assumptions, the output that you get from the Pearson’s correlation procedure
(i.e., the table we discuss below), will no longer be correct.

However, in this "quick start" guide, we focus on the results from the Pearson’s correlation
procedure only, assuming that your data met all the relevant assumptions. Therefore, when
running the Pearson’s correlation procedure, you will be presented with the Correlations table in
the output viewer as shown below:
Published with written permission from SPSS Statistics, IBM Corporation.

The results are presented in a matrix such that, as can be seen above, the correlations are
replicated. Nevertheless, the table presents the Pearson correlation coefficient, the significance
value and the sample size that the calculation is based on.
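
The reason the correlations appear twice is that the table is a symmetric matrix: the correlation of Height with Jump_Dist equals the correlation of Jump_Dist with Height, and each variable correlates perfectly with itself. The sketch below builds such a matrix. The measurements are hypothetical values invented for this illustration (they are not the guide's data set), and pearson_r is an illustrative helper.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights and long jump distances, both in metres
data = {
    "Height": [1.63, 1.70, 1.75, 1.80, 1.88],
    "Jump_Dist": [2.10, 2.25, 2.30, 2.45, 2.60],
}

names = list(data)
# Correlate every variable with every variable, including itself
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

for a in names:
    row = "  ".join(f"{matrix[a][b]:.3f}" for b in names)
    print(a.ljust(10), row)
```

Only one of the two off-diagonal cells needs to be read; the other repeats the same value.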

In this example, we can see that the Pearson correlation coefficient, r, is 0.777, and that this is
statistically significant (p < 0.0005). For interpreting multiple correlations, see our enhanced
Pearson’s guide.
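
As a worked illustration of where the significance value comes from, the standard test for a Pearson correlation converts r into a t statistic with n - 2 degrees of freedom, t = r * sqrt((n - 2) / (1 - r^2)). The sketch below applies this to the coefficient above, taking n = 27 from the write-up later in this guide. It stops at the t statistic, because Python's standard library has no t distribution from which to obtain the p-value, and it also reports r squared, the proportion of variance the two variables share.

```python
import math

r = 0.777   # Pearson correlation coefficient from the Correlations table
n = 27      # sample size reported in this guide's example write-up

df = n - 2                              # degrees of freedom for the test
t = r * math.sqrt(df / (1 - r ** 2))    # t statistic for H0: no correlation
r_squared = r ** 2                      # proportion of shared variance

print(f"t({df}) = {t:.2f}")
print(f"r squared = {r_squared:.3f}")
```

A t statistic this large on 25 degrees of freedom is consistent with the very small p-value that SPSS Statistics reports.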

Understanding the Output


In our example above, you might present the results as follows:

• General

A Pearson product-moment correlation was run to determine the relationship between an
individual's height and their performance in a long jump (distance jumped). The data showed no
violation of normality, linearity or homoscedasticity (you will need to have checked for these).
There was a strong, positive correlation between height and distance jumped, which was
statistically significant (r = .777, n = 27, p < .0005).
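
If you produce many such write-ups, the bracketed statistics can be assembled programmatically. The sketch below is one possible way to do it; the helper names format_r and report_correlation are hypothetical, and the p-value is passed in as ready-made text because the guide reports it as a threshold rather than an exact figure.

```python
def format_r(value):
    """Format a correlation to three decimals without a leading zero."""
    text = f"{abs(value):.3f}"
    if text.startswith("0"):
        text = text[1:]              # "0.777" becomes ".777"
    return ("-" if value < 0 else "") + text

def report_correlation(r, n, p_text):
    """Build the bracketed statistics string used in the write-up."""
    return f"(r = {format_r(r)}, n = {n}, p {p_text})"

summary = report_correlation(0.777, 27, "< .0005")
print(summary)
```

Dropping the leading zero follows the convention used in the example write-up above for statistics that cannot exceed 1.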

In our enhanced Pearson’s correlation guide, we also show you how to write up the results from
your assumptions tests and Pearson’s correlation output if you need to report this in a
dissertation/thesis, assignment or research report. We do this using the Harvard and APA styles.
We also show you how to write up your results if you have performed multiple Pearson’s
correlations. You can learn more about our enhanced content here.
