SPSS Beginner's Guide for Data Analysis

The document serves as a beginner's guide to using SPSS, covering the interface, data entry, and variable management. It includes instructions on importing data from Excel and CSV files, cleaning data, transforming variables, and performing basic statistical analysis. Key concepts such as descriptive statistics, skewness, and kurtosis are also introduced to help users understand their data better.

Uploaded by

Soumendra Patra
Copyright
© All Rights Reserved

SPSS for Beginners

Sr. Assistant Professor (QT & Decision Science)
Department of Business Administration (DBA),
Ravenshaw University
Cuttack-753003, Odisha

SPSS interface
Data view
The place to enter data
Columns: variables
Rows: records
Variable view
The place to enter variables
List of all variables
Characteristics of all variables

Enter variables
1. Click Variable View.
2. Type the variable name under the Name column (e.g. Q01).
3. Type: numeric, string, etc.
4. Label: description of the variable.
NOTE: A variable name can be up to 64 bytes long, and the first character must be a letter or one of the characters @, #, or $.

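The naming rule in the NOTE above can be checked before variables are typed into Variable View. The following is a minimal Python sketch that tests only the two conditions stated on the slide (length and first character); the function name is hypothetical, and SPSS applies further naming rules that are not covered here.

```python
def is_valid_spss_name(name: str) -> bool:
    """Check the two naming rules from the slide (hypothetical helper)."""
    if not name:
        return False
    # Rule 1: the name can be at most 64 bytes long.
    if len(name.encode("utf-8")) > 64:
        return False
    # Rule 2: the first character must be a letter or one of @, #, $.
    first = name[0]
    return first.isalpha() or first in "@#$"


for candidate in ["Q01", "Code", "1stItem", "@temp"]:
    print(candidate, is_valid_spss_name(candidate))
```

Names such as Q01 and Code pass, while a name that starts with a digit does not.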
Enter data in SPSS
Under Data View:
Columns: variables
Rows: cases

Enter variables
Based on your codebook!

Enter cases (under Data View)
1. There are two variables in the data set.
2. They are Code and Q01.
3. Code is an ID variable, used to identify individual cases (NOT people's real IDs).
4. Q01 is about participants' ages: 1 = 12 years or younger, 2 = 13 years, 3 = 14 years…

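A codebook like the one described above is essentially a lookup from stored codes to their meanings. The sketch below shows the idea in Python, using only the three Q01 labels given on the slide; the case values are made up for illustration.

```python
# Value labels for Q01, limited to the three codes listed on the slide.
Q01_LABELS = {
    1: "12 years or younger",
    2: "13 years",
    3: "14 years",
}

# Hypothetical cases: Code is an arbitrary ID, Q01 is the stored age code.
cases = [
    {"Code": 101, "Q01": 2},
    {"Code": 102, "Q01": 1},
    {"Code": 103, "Q01": 3},
]


def label_for(value):
    """Return the codebook label for a Q01 code, or a marker if none is defined."""
    return Q01_LABELS.get(value, "(no label)")


for case in cases:
    print(case["Code"], case["Q01"], label_for(case["Q01"]))
```

The data file stores only the numbers; the labels live in the codebook (in SPSS, in the Values column of Variable View).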
Import data from Excel
Select File > Open > Data
Choose Excel as the file type
Select the file you want to import
Then click Open

Open Excel files in SPSS

Import data from CSV file
CSV is a comma-separated values file.
If you use Qualtrics to collect data (online survey), you will get a CSV data file.
Select File > Open > Data
Choose All files as the file type
Select the file you want to import
Then click Open

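Before importing, it can help to see what a CSV file contains: the first row holds the variable names and each later row is one case. The following Python sketch reads a small, made-up CSV text with the standard csv module; the sample content and the function name are assumptions for illustration.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by three cases.
SAMPLE_CSV = "Code,Q01\n1,2\n2,3\n3,1\n"


def read_cases(text):
    """Parse CSV text into a list of dictionaries, one per case."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


cases = read_cases(SAMPLE_CSV)
variables = list(cases[0].keys())

print("Variables:", variables)
print("Number of cases:", len(cases))
```

Note that every value arrives as text, so numeric variables still need their type set after import.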
Continue

Save this file as SPSS data

Clean data after importing data files
Key in values and labels for each variable.
Run frequencies for each variable.
Check the outputs to see if any variables have wrong values.
Check missing values against the physical surveys if you use paper surveys, and make sure they are truly missing.
Sometimes, you need to recode string variables into numeric variables.

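The frequency check described above can be sketched in Python as follows. The data values and the set of valid codes are assumptions for illustration; in practice the valid codes come from the codebook.

```python
from collections import Counter

# Assumed valid codes for Q01; take the real list from your codebook.
VALID_CODES = {1, 2, 3, 4, 5, 6, 7}

# Hypothetical entries: 22 is a typing error, None is a missing value.
q01 = [1, 2, 2, 3, 22, None, 3, 2]


def frequency_table(values):
    """Count how often each value occurs, including missing (None)."""
    return Counter(values)


def wrong_entries(values, valid):
    """Return the distinct non-missing values that are not valid codes."""
    return sorted({v for v in values if v is not None and v not in valid})


for value, count in frequency_table(q01).items():
    print(value, count)
print("Wrong entries:", wrong_entries(q01, VALID_CODES))
```

Any value that appears in the frequency table but not in the codebook is a candidate for correction against the original survey.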
Continue
Wrong entries

Variable transformation
Recode variables
1. Select Transform > Recode into Different Variables.
2. Select the variable that you want to transform (e.g. Q20): we want 1 = Yes and 0 = No.
3. Click the Arrow button to put your variable into the right window.
4. Under Output Variable, type the name and label for the new variable, then click Change.
5. Click Old and New Values.

Continue
6. Type 1 under Old Value and 1 under New Value, then click Add. Then type 2 under Old Value and 0 under New Value, and click Add.
7. Click Continue after finishing all the changes.
8. Click OK.

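The recode above is a mapping from old values to new values. A minimal Python sketch of the same idea follows; the sample answers are made up, and leaving unlisted values as None is a choice made for this sketch.

```python
# Old value -> new value, as entered under Old and New Values.
RECODE_MAP = {1: 1, 2: 0}

# Hypothetical Q20 answers: 1 = Yes, 2 = No, None = missing.
q20 = [1, 2, 2, None, 1]


def recode(values, mapping):
    """Recode into a new list; values not in the mapping become None."""
    return [mapping.get(v) for v in values]


q20r = recode(q20, RECODE_MAP)
print(q20r)
```

The original list is left untouched, which mirrors recoding into a different variable rather than overwriting the old one.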
Variable transformation
Compute variable (use YRBSS 2009 data)
Example 1. Create a new variable: drug_use. During the past 30 days, any use of cigarettes, alcohol, or marijuana is defined as use; otherwise it is non-use. The new variable has two categories (use vs. non-use). Coding: 1 = Use and 0 = Non-use.
1. Use Q30, Q41, and Q47 from the 2009 YRBSS survey.
2. Non-users are those who answered 0 days/times to all three questions.
3. Go to Transform > Compute Variable.

Continue
4. Type "drug_use" under Target Variable.
5. Type "0" under Numeric Expression. 0 means Non-use.
6. Click the If button.

Continue
7. With the help of the Arrow button, type Q30 = 1 & Q41 = 1 & Q47 = 1 (AND: code 1 means 0 days/times on every question), then click Continue.
8. Do the same thing for Use, but with 1 as the numeric expression and a different condition (OR): Q30 > 1 | Q41 > 1 | Q47 > 1

Continue
9. Click OK.
10. After you click OK, a small window asks whether you want to change the existing variable, because drug_use was already created when you first defined non-use.
11. Click OK.

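The two conditional computes in Example 1 can be summarized as one small function. This is a Python sketch, not SPSS syntax; it assumes, as in the steps above, that code 1 means 0 days/times and any higher code means some use, and it leaves the result missing (None) when neither condition can be confirmed.

```python
def drug_use(q30, q41, q47):
    """Return 1 (use), 0 (non-use), or None when the answer is undetermined."""
    answers = (q30, q41, q47)
    # Use: Q30 > 1 | Q41 > 1 | Q47 > 1 (any one substance is enough).
    if any(a is not None and a > 1 for a in answers):
        return 1
    # Non-use: Q30 = 1 & Q41 = 1 & Q47 = 1 (all three must be code 1).
    if all(a == 1 for a in answers):
        return 0
    # Some answers are missing and none of the others shows use.
    return None


print(drug_use(1, 1, 1))     # non-use
print(drug_use(1, 3, 1))     # use
print(drug_use(1, None, 1))  # undetermined
```

A case with a missing answer is counted as a user only if at least one of the remaining answers shows use.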
Continue
Compute variables
Example 2. Create a new variable drug_N that assesses the total number of drugs that adolescents used during the last 30 days.
1. Use Q30 (cigarettes), Q41 (alcohol), Q47 (marijuana), and Q50 (cocaine). The number of drugs used should be between 0 and 4.
2. First, recode all four variables into two categories: 0 = non-use (0 days), 1 = use (at least 1 day/time).
3. The four variables have 6 or 7 categories.

Continue
4. Recode the four variables: 1 (old) = 0 (new), 2-6/7 (old) = 1 (new).
5. Then select Transform > Compute Variable.

Continue
6. Type drug_N under Target Variable.
7. Numeric Expression: SUM(Q30r, Q41r, Q47r, Q50r)
8. Click OK.

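Example 2 combines a recode with a sum. The Python sketch below follows the same two steps; the function names are hypothetical, codes outside 1 to 7 are treated as missing by assumption, and missing items are skipped when summing.

```python
def recode_use(value):
    """Recode an item: 1 (old) -> 0, codes 2 to 7 (old) -> 1, anything else -> None."""
    if value == 1:
        return 0
    if value is not None and 2 <= value <= 7:
        return 1
    return None


def drug_n(q30, q41, q47, q50):
    """Count how many of the four substances were used, skipping missing items."""
    recoded = [recode_use(v) for v in (q30, q41, q47, q50)]
    valid = [r for r in recoded if r is not None]
    # If every item is missing, the total is missing too.
    return sum(valid) if valid else None


print(drug_n(1, 1, 1, 1))
print(drug_n(2, 5, 1, 7))
```

The result always falls between 0 and 4, which is a quick check that the recode worked.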
Continue
Compute variables
Example 3: Convert a string variable into a numeric variable
1. Enter 1 at Numeric Expression.
2. Click the If button and type Q2 = 'Female'.
3. Then click OK.
4. Enter 2 at Numeric Expression.
5. Click the If button and type Q2 = 'Male'.
6. Then click OK.

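The same conversion can be pictured as matching each text answer against a list of known strings. In this Python sketch the sample answers are made up; matching is exact, so a differently spelled or capitalized answer is left as None and shows up as something to check by hand.

```python
# Numeric codes for the two strings used in Example 3.
GENDER_CODES = {"Female": 1, "Male": 2}

# Hypothetical text answers, including one with different capitalization.
q2_text = ["Female", "Male", "female", "Female"]


def to_numeric(values, codes):
    """Replace each string with its numeric code; unmatched strings become None."""
    return [codes.get(v) for v in values]


q2 = to_numeric(q2_text, GENDER_CODES)
print(q2)
```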
Sort and select cases
Sort cases by variables: Data > Sort Cases
You can use Sort Cases to find missing values.

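Sorting helps with missing values because it groups them together at one end of the file. The Python sketch below sorts made-up cases in ascending order and places missing values (None) first, which is an assumption about the ordering made for this illustration.

```python
# Hypothetical cases; Q01 is missing for the case with Code 2.
cases = [
    {"Code": 1, "Q01": 3},
    {"Code": 2, "Q01": None},
    {"Code": 3, "Q01": 1},
]


def sort_cases(rows, var):
    """Sort ascending by var, with missing values placed first."""
    return sorted(
        rows,
        key=lambda row: (row[var] is not None, row[var] if row[var] is not None else 0),
    )


for row in sort_cases(cases, "Q01"):
    print(row)
```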
Sort and select cases
Select cases
Example 1. Select Females for analysis.
1. Go to Data > Select Cases.
2. Under Select, check the second option (If condition is satisfied).
3. Click the If button.

Continue
4. Type Q2 (gender) = 1; 1 means Female.
5. Click Continue.
6. Click OK.
Unselected cases: Q2 = 2

Sort and select cases
7. You will see a new variable: filter_$ (Variable View).

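The filter_$ variable marks which cases satisfy the condition. A minimal Python sketch of that idea follows, using made-up cases and a hypothetical function name; 1 marks a selected case and 0 an unselected one.

```python
# Hypothetical cases: Q2 = 1 is Female, Q2 = 2 is Male.
cases = [
    {"Code": 1, "Q2": 1},
    {"Code": 2, "Q2": 2},
    {"Code": 3, "Q2": 1},
]


def select_cases(rows, condition):
    """Add a filter_$ flag to each case: 1 if the condition holds, else 0."""
    return [{**row, "filter_$": 1 if condition(row) else 0} for row in rows]


flagged = select_cases(cases, lambda row: row["Q2"] == 1)
selected = [row for row in flagged if row["filter_$"] == 1]

print(flagged)
print("Selected:", len(selected))
```

Unselected cases stay in the data set; they are only flagged so that analyses can skip them.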
Sort and select cases
Select cases
Example 2. Select cases who used any of cigarettes, alcohol, or marijuana during the last 30 days.
1. Go to Data > Select Cases.
2. Click the If button.
3. Type Q30 > 1 | Q41 > 1 | Q47 > 1, then click Continue.

Basic statistical analysis
Descriptive statistics
Purposes:
1. Find wrong entries
2. Gain basic knowledge about the sample and targeted variables in a study
3. Summarize data

Analyze > Descriptive Statistics > Frequencies

Continue

Frequency table

1. Skewness: a measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of zero.
Positive skewness: a long right tail.
Negative skewness: a long left tail.
Departure from symmetry: a skewness value more than twice its standard error.
2. Kurtosis: a measure of the extent to which observations cluster around a central point. For a normal distribution, the value of the kurtosis statistic is zero. Leptokurtic data values are more peaked, whereas platykurtic data values are flatter.
(Figure: Normal Curve)

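To make the two statistics concrete, the Python sketch below computes sample skewness, its standard error, and excess kurtosis. The formulas are the bias-adjusted sample versions commonly reported by statistics packages; that they match SPSS output exactly is an assumption here, and the data are made up.

```python
import math
import statistics


def skewness(values):
    """Bias-adjusted sample skewness (0 for a symmetric sample)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    total = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total


def skewness_se(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis(values):
    """Bias-adjusted excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    total = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction


def departs_from_symmetry(values):
    """Rule of thumb from the slide: skewness more than twice its standard error."""
    return abs(skewness(values)) > 2 * skewness_se(len(values))


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), departs_from_symmetry(right_tailed))
```

The symmetric sample has zero skewness and negative kurtosis (flatter than normal), while the sample with one large value has positive skewness, that is, a long right tail.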
