R Data Analysis Lab: Iris & mtcars Datasets

Uploaded by

SpiZz

WEEK 2 LAB EXERCISE – Exploring built-in datasets

Duration: 2 hours
Mode: Guided and Hands-on Practice
Module: CT127-3-2 Programming for Data Analysis
Lecturer: Dr. Kulothunkan Palasundram (Dr. Kulo)

Objective
By the end of this lab, students will be able to:
1. Write basic R commands to perform simple exploratory data analysis on two
built-in datasets: iris and mtcars

Dataset: Iris dataset


Part A — Explore iris
Load and inspect Fisher's famous iris dataset by running the commands below
one by one.
data(iris)            # Load the built-in iris dataset
dim(iris)             # Number of rows and columns
names(iris)           # Column names
str(iris)             # Structure: type of each column
head(iris, 3)         # First 3 rows
summary(iris)         # Summary stats for numeric columns; counts for the factor
table(iris$Species)   # Frequency table for Species

Part B — Perform basic manipulations - Select, Filter, Sort


Create subsets of the data frame and order rows by a variable.
# Select two columns into a new object sl
sl <- iris[, c("Sepal.Length", "Species")]
head(sl)

# Filter: setosa rows with Sepal.Length greater than 5
setosa_big <- subset(iris, Species == "setosa" & Sepal.Length > 5)
nrow(setosa_big)

# Reorder rows by a numeric column, largest first (Sepal.Length used here)
sorted <- iris[order(iris$Sepal.Length, decreasing = TRUE), ]

# Show top 5 rows and first two columns
head(sorted, 5)[, 1:2]

Notes:
• subset() uses a logical condition to filter rows.
• order() returns row indices for sorting; use inside [ ] to reorder the data frame.
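
As a rough illustration of the two notes above, the sketch below repeats the filter with plain logical indexing and prints the index vector that order() returns. The object names keep, setosa_big2 and idx are made up for this example, and it assumes the standard iris column name Sepal.Length.

```r
data(iris)

# Logical vector: TRUE for rows that satisfy the condition
keep <- iris$Species == "setosa" & iris$Sepal.Length > 5

# Selects the same rows as subset(iris, Species == "setosa" & Sepal.Length > 5)
setosa_big2 <- iris[keep, ]
nrow(setosa_big2) == sum(keep)

# order() returns row positions, not the sorted values
idx <- order(iris$Sepal.Length, decreasing = TRUE)
head(idx, 3)                      # positions of the 3 largest values
head(iris$Sepal.Length[idx], 3)   # the values at those positions
```

Putting idx inside [ ] as the row index is what reorders the whole data frame.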

Part C — New Variables & Grouping
Create a ratio variable and bin a numeric variable into categories; compute grouped
means.
# New numeric column: a ratio of two numeric columns (petal length to petal width here)
iris$Petal.Ratio <- iris$Petal.Length / iris$Petal.Width

summary(iris$Petal.Ratio)

# Bin Sepal.Length into three categories
iris$SepalLenCat <- cut(
  iris$Sepal.Length,
  breaks = c(-Inf, 5.5, 6.5, Inf),
  labels = c("short", "medium", "long")
)

table(iris$SepalLenCat)

# Mean of a numeric column per Species (Sepal.Length used here)
tapply(iris$Sepal.Length, iris$Species, mean)

Notes:
• cut() converts a continuous variable into categorical bins (factor).
• tapply(x, g, f) applies f to x within each group g.
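
One detail the notes above do not spell out is how cut() treats values that fall exactly on a break. The sketch below, using a small made-up vector x, shows the default right-closed intervals next to right = FALSE, and compares tapply() with aggregate() as an alternative way to get group means. The names x, a, b and means are illustrative only.

```r
data(iris)

# Toy values, some exactly on the breaks 5.5 and 6.5
x <- c(5.4, 5.5, 5.6, 6.5, 6.6)

# Default: right-closed intervals (-Inf, 5.5], (5.5, 6.5], (6.5, Inf]
a <- cut(x, breaks = c(-Inf, 5.5, 6.5, Inf),
         labels = c("short", "medium", "long"))
a

# right = FALSE: left-closed intervals [-Inf, 5.5), [5.5, 6.5), [6.5, Inf)
b <- cut(x, breaks = c(-Inf, 5.5, 6.5, Inf),
         labels = c("short", "medium", "long"), right = FALSE)
b

# tapply() returns a named vector with one value per group
means <- tapply(iris$Sepal.Length, iris$Species, mean)
means

# aggregate() computes the same group means but returns a data frame
aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
```

With the defaults, 5.5 lands in "short" and 6.5 in "medium"; with right = FALSE they move up to "medium" and "long".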

Part D — Quick Visuals (base R)


Produce a histogram, a grouped boxplot, and a scatterplot.
hist(iris$Sepal.Length,                     # Numeric vector for histogram
     main = "Histogram of Sepal Length",    # Title
     xlab = "Sepal.Length")                 # X-axis label

boxplot(Sepal.Length ~ Species, data = iris,   # Formula: y ~ group
        main = "Sepal Length by Species",      # Title
        ylab = "Sepal.Length")                 # Y-axis label

# Any pair of numeric columns works; petal length and width are used here
plot(iris$Petal.Length, iris$Petal.Width,      # Scatter: x then y
     xlab = "Petal.Length", ylab = "Petal.Width",
     pch = 19)                                 # Solid points

Notes:
• A histogram shows the distribution of one variable; a boxplot compares groups;
a scatterplot shows the relationship between two variables.
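
Each of the three plots has a numeric counterpart that can help when describing what the picture shows. The following is a minimal sketch, assuming the standard iris column names; the choice of Petal.Length and Petal.Width for the correlation is only an example, and h and r are illustrative names.

```r
data(iris)

# Bin edges and counts behind a histogram, computed without drawing it
h <- hist(iris$Sepal.Length, plot = FALSE)
h$breaks
h$counts

# fivenum() per group: minimum, lower hinge, median, upper hinge, maximum.
# A boxplot draws the hinges and the median; its whiskers may stop short
# of the minimum or maximum when there are outliers.
tapply(iris$Sepal.Length, iris$Species, fivenum)

# Pearson correlation: a one-number summary of a linear relationship
r <- cor(iris$Petal.Length, iris$Petal.Width)
r
```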

Dataset: mtcars
Part A — Explore mtcars

# Load the dataset
data(mtcars)

1. Write an R command that returns the number of rows and columns in the dataset.

2. List the column names of the mtcars dataset.

3. Identify the data types of all the columns.

Part B — Checking the data distribution


4. Write R commands to get the mean (average) and the median (middle value) of a
column (choose any column).

Part C — Visuals
5. Plot a histogram for mpg. Explain what you see.

6. Plot a grouped boxplot. What is a boxplot used for?

7. Plot a scatterplot of hp and mpg. Explain the relationship between the two
variables.

Part F — Data Wrangling (Base R: New Variables, Cross-Tabs, Reshape)

1. Create a new variable called mpg_band with the values defined below:

mpg         mpg_band
< 18        low
18 to 25    medium
> 25        high

2. Write a command to find out how many cars there are in each band

3. Write a command to split the cars based on their weight (wt). Any car weighing
more than the median weight should be categorized as heavy, otherwise as light.
How many cars are there in each category?
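
As a hint for the median split, here is the same pattern sketched on the iris dataset rather than mtcars. It is one possible approach, not the required answer; the names med and petal_size and the labels "long" and "short" are chosen only for this illustration.

```r
data(iris)

# Median of the numeric column used for the split
med <- median(iris$Petal.Length)

# ifelse() labels each row: above the median, or at or below it
petal_size <- ifelse(iris$Petal.Length > med, "long", "short")

# Count the rows that received each label
table(petal_size)
```

Values equal to the median fall into the second label because the condition uses a strict >.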
