1. Introduction to SPSS
What is SPSS? (May be asked as a definition or a purpose question; can be asked of anyone)
SPSS stands for Statistical Package for the Social Sciences.
It is a software program used for statistical analysis in social science, business, education, and
health research.
It allows researchers to:
Enter and organize data easily
Perform descriptive and inferential statistics
Create graphs, charts, and tables automatically
Interpret research results accurately
In simple words, SPSS is a tool that helps researchers analyze data scientifically and quickly.
Main Purpose of Using SPSS in Social Research
The main purpose is to make data analysis easier, faster, and more accurate.
SPSS helps:
Summarize survey or field data.
Test hypotheses and relationships between variables.
Identify trends and patterns in society.
Present findings in graphs and tables.
🔹 Example: A sociologist can use SPSS to find out if education level affects employment status.
What Type of Data Can Be Analyzed Using SPSS?
SPSS can analyze many types of data:
Survey data (questionnaires, Likert scales)
Experimental data
Observation data
Demographic data (age, gender, income, education)
Survey Data: Information collected from people through questionnaires or interviews to
study their opinions or behaviors.
Demographic Data: Information about people’s basic characteristics like age, gender,
education, and income.
It supports both qualitative (categorical) and quantitative (numerical) data.
2. SPSS Interface & Files (VIP) (Ask anyone)
Data View and Variable View
SPSS has two main screens in the Data Editor window:
1. Variable View:
o Each row represents one variable (e.g., Gender, Age, Income).
o Columns define properties such as:
Name: Short name of variable (e.g., gender)
Label: Full description (e.g., Respondent’s gender)
Values: Coding of responses (e.g., 1 = Male, 2 = Female)
Measure: Type of data (Nominal, Ordinal, Scale)
o 🔹 Used to define variables before entering data.
2. Data View:
o Each row = one respondent (case)
o Each column = one variable
o 🔹 Used to enter actual data values.
Example:
ID Gender Age
1 1 22
2 2 25
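The row-and-column layout of Data View can be sketched outside SPSS as well. The following is a minimal Python illustration, not SPSS itself; the name data_view is hypothetical and the values are the two cases from the example above.

```python
# Illustrative only: the Data View example above as a list of rows.
# Each dict is one case (respondent); each key is one variable (column).
data_view = [
    {"ID": 1, "Gender": 1, "Age": 22},
    {"ID": 2, "Gender": 2, "Age": 25},
]

# Reading down one "column" collects that variable's value for every case.
ages = [case["Age"] for case in data_view]

print(len(data_view))  # number of cases: 2
print(ages)            # [22, 25]
```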
Label and Values
Label: Full name or description of the variable.
🔹 Example: Variable name = “edu”; Label = “Education Level”
Values: Numeric codes for categories.
🔹 Example: 1 = Primary, 2 = Secondary, 3 = Graduate
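How a name, a label, and value codes fit together can be sketched in Python. This is only an illustration of the idea, assuming a plain dictionary stands in for one row of Variable View; the names edu_variable and describe are hypothetical.

```python
# Illustrative only: one Variable View row held as a dictionary.
edu_variable = {
    "name": "edu",                     # short variable name
    "label": "Education Level",        # full description shown in output
    "values": {1: "Primary", 2: "Secondary", 3: "Graduate"},  # value labels
}

def describe(code, variable):
    """Turn a stored numeric code into 'Label: value label' text."""
    return f"{variable['label']}: {variable['values'][code]}"

print(describe(2, edu_variable))  # Education Level: Secondary
```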
3. Variable Types and Measurement Levels
Levels of Measurement (VIP) In SPSS: Must Learn
SPSS classifies variables based on how data is measured:
Level                    Meaning                               Examples
Nominal                  Categories without order              Gender, Religion, Nationality
Ordinal                  Ordered categories                    Education Level (Primary, Secondary, Higher)
Scale (Interval/Ratio)   Numeric values with equal distance    Age, Income, Test Scores
Explanation
1. Nominal Level:
o Data is categorized without any rank or order.
o Example: Male/Female, Rural/Urban.
2. Ordinal Level:
o Data is ranked, but the differences between ranks are not equal.
o Example: Agree, Neutral, Disagree (Likert Scale).
3. Scale Level:
o Numeric data with measurable and equal intervals.
o Example: Age = 20, 25, 30.
4. Data Entry and Coding
What is Data Coding?
Data coding means converting qualitative answers into numerical form for easy analysis.
🔹 Example:
Gender → 1 = Male, 2 = Female
Education → 1 = Primary, 2 = Secondary, 3 = Graduate
👉 This makes data readable for SPSS.
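The coding step can be sketched in Python as a simple lookup. This is a minimal illustration using the codes from the example above; the raw answers and the names raw_responses and code_response are hypothetical.

```python
# Codebook taken from the example above.
GENDER_CODES = {"Male": 1, "Female": 2}
EDUCATION_CODES = {"Primary": 1, "Secondary": 2, "Graduate": 3}

# Hypothetical raw questionnaire answers (made up for illustration).
raw_responses = [
    {"gender": "Female", "education": "Graduate"},
    {"gender": "Male", "education": "Primary"},
]

def code_response(response):
    """Replace each text answer with its numeric code."""
    return {
        "gender": GENDER_CODES[response["gender"]],
        "education": EDUCATION_CODES[response["education"]],
    }

coded = [code_response(r) for r in raw_responses]
print(coded)  # [{'gender': 2, 'education': 3}, {'gender': 1, 'education': 1}]
```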
Why is Coding Necessary?
Helps SPSS process data efficiently.
Maintains uniformity and accuracy.
Allows comparison and statistical testing.
Handling Missing Values
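If a respondent leaves a question blank, SPSS treats the empty cell as missing data (for numeric variables it is shown as a dot and called system-missing).
Researchers can:
Leave it blank, or
Assign a code (e.g., 99 = missing) and declare that code in the Missing column of Variable View,
so that it is excluded from analysis.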
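Why a missing code has to be excluded can be seen in a small Python sketch. It is an illustration only: the ages are made up, None stands in for a blank cell, and 99 is the assumed missing code.

```python
from statistics import mean

MISSING_CODE = 99                  # assumed code for "no answer"
ages = [22, 25, 99, 30, None]      # hypothetical data; None = blank cell

# Keep only real answers: drop blanks and the declared missing code.
valid_ages = [a for a in ages if a is not None and a != MISSING_CODE]

# If 99 were treated as a real age, it would inflate the average.
ages_with_code_kept = [a for a in ages if a is not None]

print(round(mean(valid_ages), 2))   # 25.67
print(mean(ages_with_code_kept))    # 44
```

The second average is far too high because 99 was counted as a real age, which is the error a declared missing code prevents.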
Importance of Assigning Variable Labels
Labels make data more understandable and professional.
Instead of reading “v1” or “q5,” the output shows meaningful labels like “Gender” or “Education
Level.”
👉 It helps in clear interpretation during analysis.
5. Descriptive Statistics
Meaning of Descriptive Statistics
Descriptive statistics in SPSS are used to summarize and describe the main features of a dataset.
They don’t test hypotheses — they only describe what the data shows.
It includes:
Mean, Median, Mode
Frequency and Percentage
Graphs (Histogram, Pie Chart, Bar Chart)
Mean, Median, and Mode
Mean: Average value = (Sum of all values ÷ Total number of values).
🔹 Example: (10+20+30)/3 = 20
Median: Middle value when data is arranged in order.
🔹 Example: 3, 5, 7 → Median = 5
Mode: Most frequent value in the data.
🔹 Example: 2, 3, 3, 4 → Mode = 3
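The three worked examples above can be checked with Python's standard statistics module. This is just a quick verification sketch, not how SPSS computes them internally.

```python
from statistics import mean, median, mode

mean_value = mean([10, 20, 30])    # (10 + 20 + 30) / 3
median_value = median([3, 5, 7])   # middle value of the sorted data
mode_value = mode([2, 3, 3, 4])    # most frequent value

print(mean_value)    # 20
print(median_value)  # 5
print(mode_value)    # 3
```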
Difference Between Frequency and Percentage
Frequency: Number of times a value appears.
🔹 Example: 15 students chose “Yes.”
Percentage: Frequency divided by total × 100.
🔹 Example: (15/50) × 100 = 30% chose “Yes.”
Frequency = Count, Percentage = Share of the total (proportion × 100)
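The frequency and percentage example can be reproduced in a short Python sketch. It assumes the remaining 35 of the 50 students answered “No”, which the example above does not state.

```python
from collections import Counter

# 15 "Yes" answers out of 50; the 35 "No" answers are an assumption.
responses = ["Yes"] * 15 + ["No"] * 35

frequencies = Counter(responses)   # count of each answer
total = len(responses)

# Percentage = frequency divided by total, times 100.
percentages = {answer: count * 100 / total for answer, count in frequencies.items()}

print(frequencies["Yes"])  # 15
print(percentages["Yes"])  # 30.0
```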
Use of Histogram
A histogram is a graph showing frequency distribution for continuous variables.
Bars represent intervals, and their heights show frequency.
Purpose:
To visualize data distribution.
To detect patterns like normal or skewed distribution.
🔹 Example: Histogram of Age shows how respondents are distributed by age group.
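What a histogram does behind the scenes (grouping values into intervals and counting them) can be sketched in Python. The ages and the 10-year interval width below are made up for illustration; SPSS chooses its own intervals unless told otherwise.

```python
from collections import Counter

ages = [18, 21, 22, 25, 27, 29, 31, 34, 35, 42]  # hypothetical respondents
BIN_WIDTH = 10                                    # assumed interval width

def bin_start(age):
    """Lower edge of the interval an age falls into (e.g. 27 -> 20)."""
    return (age // BIN_WIDTH) * BIN_WIDTH

bin_counts = Counter(bin_start(age) for age in ages)

# Text version of the histogram: one row per interval, one '#' per respondent.
for start in sorted(bin_counts):
    print(f"{start}-{start + BIN_WIDTH - 1}: {'#' * bin_counts[start]}")
```

The longest row marks the interval where most respondents fall, which is the same information the height of a bar gives.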
(Optional) Short Overview: Data Analysis Basics (t-Test)
Correlation Test
Measures the strength and direction of the relationship between two variables.
👉 Example: Education ↑ → Income ↑ (positive correlation)
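A correlation coefficient can be computed by hand to see what it measures. The sketch below assumes Pearson's r, one common choice; the notes do not name a specific coefficient. The education and income figures are made up, and the function name pearson_r is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled to lie in [-1, 1]."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(spread_x * spread_y)

# Hypothetical data: years of education and monthly income.
education_years = [8, 10, 12, 14, 16]
income = [200, 260, 310, 400, 450]

r = pearson_r(education_years, income)
print(round(r, 2))  # close to +1: income rises as education rises
```

A value near +1 means a strong positive relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.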
t-Test
Compares the mean scores of two groups.
👉 Example: Male vs Female academic performance.
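The idea behind the t-test can be sketched in Python. This is a minimal illustration of the independent-samples t statistic, assuming equal variances in the two groups; it does not compute the significance value that SPSS also reports. The scores and the name independent_t are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """t statistic for two independent groups (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical exam scores for two groups of students.
male_scores = [60, 65, 70, 75, 80]
female_scores = [65, 70, 75, 80, 85]

t = independent_t(male_scores, female_scores)
print(round(t, 2))  # -1.0
```

The further t is from zero, the larger the difference between the two group means relative to the variation inside the groups.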
Dependent vs Independent Variables
Independent variable: Cause or predictor (e.g., education level).
Dependent variable: Effect or outcome (e.g., income).
👉 “Education affects income.”
6. Application in Research (Theoretical)
How SPSS Helps in Sociological Research
Organizes survey data efficiently.
Tests relationships between variables.
Identifies patterns in society (like gender inequality, education gap).
Produces graphs for research reports.
Why SPSS is Preferred Over Manual Calculation
Saves time and avoids human error.
Can analyze large datasets.
Automatically generates results.
Provides accurate and scientific outcomes.
Example of Sociological Study
A research study titled:
“Impact of Parental Education on Students’ Academic Performance.”
Independent variable: Parents’ Education
Dependent variable: Students’ Grades
SPSS is used to find the correlation between the two.