Data Visualization Lab

The document provides a comprehensive guide on visualizing various datasets in R, including VADeaths, airquality, AirPassengers, iris, diamonds, HairEyeColor, mtcars, and others. It covers different types of visualizations such as histograms, line charts, bar charts, box plots, scatter plots, hexbin plots, mosaic plots, heat maps, and map visualizations using the leaflet library. Each section includes code snippets and explanations for creating the visualizations.

1. a) Load the VADeaths (Death Rates in Virginia) dataset in R and visualize the data using different histograms.

Code :
# Load the VADeaths dataset
data("VADeaths")
VADeaths

# Convert the matrix to a vector for histogram plotting
death_rates <- as.vector(VADeaths)

# 1. Simple Histogram
hist(death_rates,
main = "Histogram of VADeaths",
xlab = "Death Rates",
col = "lightblue",
border = "black")

# 2. Histogram with more breaks


hist(death_rates,
breaks = 10,
main = "Histogram with More Breaks",
xlab = "Death Rates",
col = "lightgreen",
border = "black")
# 3. Histogram with different colors
hist(death_rates,
col = "orange",
main = "Colored Histogram of VADeaths",
xlab = "Rates")

Explanation :

• VADeaths is a built-in dataset of death rates (per 1000 population) for different groups in Virginia.
• as.vector() converts the matrix into a single numeric vector so that it can be passed to hist().
• hist() draws the histogram.
• We draw:
o A simple histogram
o A histogram with more breaks
o A colored histogram
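
A further common variant overlays a smooth density curve on the histogram. The sketch below is one way to do this in base R; it rebuilds death_rates so that it runs on its own, and the colors and line width are arbitrary choices rather than part of the original exercise.

```r
# Rebuild the vector of death rates so the example is self-contained
data("VADeaths")
death_rates <- as.vector(VADeaths)

# freq = FALSE puts the y-axis on the density scale,
# so the bars and the density curve share the same units
hist(death_rates,
     freq = FALSE,
     main = "Histogram of VADeaths with Density Curve",
     xlab = "Death Rates",
     col = "lightgray",
     border = "black")

# Kernel density estimate drawn on top of the bars
dens <- density(death_rates)
lines(dens, col = "red", lwd = 2)
```

With only 20 values the curve is quite rough, so it is best read as a guide to the overall shape rather than a precise estimate.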
1. b) Load the airquality dataset in R and visualize La Guardia Airport's daily maximum temperature using a histogram.

The airquality dataset is built into R. The Temperature (Temp) column represents the daily
maximum temperature recorded at La Guardia Airport, New York.

Code :

# Load the airquality dataset


data("airquality")
head(airquality)

# Extract the daily maximum temperature column


temp <- airquality$Temp

# Histogram of La Guardia Airport's daily maximum temperature


hist(temp,
main = "Daily Maximum Temperature at La Guardia Airport",
xlab = "Temperature (°F)",
col = "skyblue",
border = "black")

Explanation :

• The airquality dataset contains daily air quality measurements from La Guardia Airport.
• The Temp column holds the maximum temperature for each day.
• hist(temp) draws a histogram to show how the temperatures are distributed.
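
To see what "distributed" means in numbers, the binning that hist() performs can be inspected without drawing anything. The following is a minimal sketch using only base R; the 5-degree bin width in the second plot is an illustrative assumption, not something prescribed by the exercise.

```r
data("airquality")
temp <- airquality$Temp

# plot = FALSE returns the binning instead of drawing it
h <- hist(temp, plot = FALSE)
h$breaks   # bin edges chosen by hist()
h$counts   # number of days that fall in each bin

# Custom 5-degree bins that cover the whole observed range
bins <- seq(floor(min(temp) / 5) * 5,
            ceiling(max(temp) / 5) * 5,
            by = 5)

hist(temp,
     breaks = bins,
     main = "Daily Maximum Temperature (5-degree bins)",
     xlab = "Temperature (°F)",
     col = "skyblue",
     border = "black")
```

Passing an explicit vector to breaks gives full control over the bin edges, whereas a single number is only treated as a suggestion by hist().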
2. Load the AirPassengers dataset in R and visualize the data using a line chart that shows the increase in air passengers over the given time period.

The AirPassengers dataset contains monthly totals of international airline passengers from 1949 to 1960.

Code :

# Load AirPassengers dataset


data("AirPassengers")
AirPassengers

# Line chart of Air Passengers


plot(AirPassengers,
type = "l", # l = line chart
col = "blue",
lwd = 2,
main = "Increase in Air Passengers Over Time",
xlab = "Year",
ylab = "Number of Passengers")

Explanation :

• AirPassengers is a time-series dataset in R.
• plot(..., type = "l") draws a line chart.
• The line chart clearly shows:
o Growth in air passengers from year to year
o Seasonal patterns (higher values in some months)
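
The two patterns can also be looked at separately. The sketch below, which uses only base R, sums the series by year to isolate the growth and averages it by calendar month to isolate the seasonality; the variable names annual_total and monthly_mean are introduced here for illustration.

```r
data("AirPassengers")

# Yearly totals: aggregating a monthly ts to frequency 1 sums each year's 12 months
annual_total <- aggregate(AirPassengers, nfrequency = 1, FUN = sum)

plot(annual_total,
     type = "o",
     col = "blue",
     lwd = 2,
     main = "Total Air Passengers per Year",
     xlab = "Year",
     ylab = "Passengers per Year")

# Average for each calendar month across all years;
# cycle() gives the month position (1 to 12) of every observation
monthly_mean <- tapply(AirPassengers, cycle(AirPassengers), mean)
names(monthly_mean) <- month.abb

barplot(monthly_mean,
        main = "Average Air Passengers by Month",
        xlab = "Month",
        ylab = "Average Number of Passengers",
        col = "lightblue")
```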
3. a) Load the iris dataset in R, visualize the data using different bar charts, and demonstrate the use of stacked plots.

The iris dataset contains measurements of flowers from 3 species: setosa, versicolor and virginica.

Code :

data("iris")
head(iris)

BAR CHARTS :

Bar Chart 1: Count of Each Species

species_count <- table(iris$Species)

barplot(species_count,
main = "Count of Each Species",
xlab = "Species",
ylab = "Count",
col = c("red", "green", "blue"))

Bar Chart 2: Mean Sepal Length for Each Species

mean_sepal_length <- tapply(iris$Sepal.Length, iris$Species, mean)

barplot(mean_sepal_length,
main = "Mean Sepal Length by Species",
xlab = "Species",
ylab = "Mean Sepal Length",
col = c("pink", "lightgreen", "lightblue"))
Bar Chart 3: Mean Petal Length for Each Species

mean_petal_length <- tapply(iris$Petal.Length, iris$Species, mean)

barplot(mean_petal_length,
main = "Mean Petal Length by Species",
xlab = "Species",
ylab = "Mean Petal Length",
col = c("orange", "purple", "skyblue"))
Bar chart 4 : STACKED BAR PLOT

We create a stacked plot of mean Sepal Length + mean Petal Length for each species.

Prepare data :

stack_data <- rbind(mean_sepal_length, mean_petal_length)


rownames(stack_data) <- c("Sepal Length", "Petal Length")
stack_data

Stacked Bar Plot:


barplot(stack_data,
main = "Stacked Bar Plot: Sepal & Petal Length",
xlab = "Species",
ylab = "Lengths",
col = c("lightblue", "orange"),
legend = TRUE)

Explanation :

• table() gives the count of each species.
• tapply() calculates the mean values for each species.
• barplot() draws the bar charts.
• The stacked bar plot shows two measurements together (sepal and petal lengths) for comparison across the species.
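
A stacked plot makes the combined height easy to read but the upper segment harder to compare. As a possible alternative, the sketch below draws the same matrix as grouped (side-by-side) bars using the beside argument of barplot(); the legend position is an arbitrary choice.

```r
data("iris")

# Same summary matrix as in the stacked plot: one row per measurement, one column per species
mean_sepal_length <- tapply(iris$Sepal.Length, iris$Species, mean)
mean_petal_length <- tapply(iris$Petal.Length, iris$Species, mean)
stack_data <- rbind("Sepal Length" = mean_sepal_length,
                    "Petal Length" = mean_petal_length)

# beside = TRUE places the bars next to each other instead of stacking them
barplot(stack_data,
        beside = TRUE,
        main = "Grouped Bar Plot: Sepal & Petal Length",
        xlab = "Species",
        ylab = "Mean Length",
        col = c("lightblue", "orange"),
        legend.text = TRUE,
        args.legend = list(x = "topleft"))
```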

3. b) Load the airquality dataset in R and visualize the ozone concentration in air.

The airquality dataset contains daily air quality measurements in New York.
The Ozone column gives the daily ozone concentration (parts per billion).
Code :

# Load airquality dataset


data("airquality")
head(airquality)

# Extract Ozone column


oz <- airquality$Ozone

# Remove missing values (NA)
oz <- na.omit(oz)

# Histogram of Ozone concentration


hist(oz,
main = "Ozone Concentration in Air",
xlab = "Ozone (ppb)",
col = "lightblue",
border = "black")

Optional: Simple Line Plot of Ozone Levels Over Days

plot(oz,
type = "l",
main = "Daily Ozone Concentration",
xlab = "Observation (days with missing readings removed)",
ylab = "Ozone (ppb)",
col = "blue",
lwd = 2)
Explanation :

• airquality$Ozone gives the ozone level for each day.
• na.omit() removes the missing values (NA) before plotting.
• hist() shows how the ozone concentrations are distributed.
• The optional line plot shows how ozone changes over the recorded days.
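
Dropping the missing values also drops the link between a reading and its calendar day. As a sketch of one alternative, the code below keeps the NA values and plots ozone against real dates, so that missing days appear as gaps in the line. The dates variable is built here for illustration from the Month and Day columns, with the year 1973 taken from the dataset's documentation.

```r
data("airquality")

# How many days have no ozone reading?
n_missing <- sum(is.na(airquality$Ozone))
n_missing

# Build a date for every row (the measurements run from May to September 1973)
dates <- as.Date(paste(1973, airquality$Month, airquality$Day, sep = "-"))

# With type = "l", NA values break the line instead of being skipped
plot(dates, airquality$Ozone,
     type = "l",
     main = "Daily Ozone Concentration (gaps = missing readings)",
     xlab = "Date",
     ylab = "Ozone (ppb)",
     col = "blue",
     lwd = 2)
```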

4. a) Load the iris dataset in R, visualize the data using different box plots, including the group-by option, and use a color palette to represent species.

The iris dataset contains measurements of sepals and petals (length and width) for three species.

1. Load the dataset

data("iris")
head(iris)

BOX PLOTS

Box Plot 1: Sepal Length for All Species (Grouped by Species)

boxplot(Sepal.Length ~ Species,
data = iris,
main = "Sepal Length by Species",
xlab = "Species",
ylab = "Sepal Length",
col = c("red", "green", "blue")) # Color palette
Box Plot 2: Sepal Width for All Species

boxplot(Sepal.Width ~ Species,
data = iris,
main = "Sepal Width by Species",
xlab = "Species",
ylab = "Sepal Width",
col = c("orange", "lightgreen", "skyblue"))

Box Plot 3: Petal Length for All Species

boxplot(Petal.Length ~ Species,
data = iris,
main = "Petal Length by Species",
xlab = "Species",
ylab = "Petal Length",
col = c("pink", "yellow", "purple"))
Box Plot 4: Petal Width for All Species

boxplot(Petal.Width ~ Species,
data = iris,
main = "Petal Width by Species",
xlab = "Species",
ylab = "Petal Width",
col = c("cyan", "magenta", "lightgray"))

Using a Color Palette (RColorBrewer)

You can use a palette instead of manually selecting colors.

library(RColorBrewer)
palette <- brewer.pal(3, "Set2")

boxplot(Sepal.Length ~ Species,
data = iris,
main = "Sepal Length by Species (Color Palette)",
xlab = "Species",
ylab = "Sepal Length",
col = palette)

Simple Explanation :

• Sepal.Length ~ Species means "group Sepal.Length by Species".
• Box plots show the median, quartiles and spread of each measurement.
• col = c(...) adds colors to visually separate the species.
• RColorBrewer provides ready-made color palettes for clean visuals.
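
The numbers behind each box can be read directly from the object that boxplot() returns. The sketch below extracts them for Sepal.Length and also shows a palette drawn from base R's hcl.colors() (available from R 3.6), which is an assumed substitute for when RColorBrewer is not installed; the row labels are added here for readability.

```r
data("iris")

# plot = FALSE returns the statistics without drawing
bp <- boxplot(Sepal.Length ~ Species, data = iris, plot = FALSE)

# Each column holds: lower whisker, lower hinge, median, upper hinge, upper whisker
five_num <- bp$stats
dimnames(five_num) <- list(
  c("lower whisker", "lower hinge", "median", "upper hinge", "upper whisker"),
  bp$names
)
five_num

# A qualitative palette from base R, one color per species
species_cols <- hcl.colors(3, palette = "Set 2")

boxplot(Sepal.Length ~ Species,
        data = iris,
        main = "Sepal Length by Species (base R palette)",
        xlab = "Species",
        ylab = "Sepal Length",
        col = species_cols)
```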

4. b) Load the airquality dataset in R and visualize the air quality parameters using box plots.

The airquality dataset contains the following air quality parameters:

• Ozone
• Solar.R (Solar Radiation)
• Wind
• Temp (Temperature)

We will create box plots for each parameter.

1. Load Dataset

data("airquality")
head(airquality)
BOX PLOTS FOR AIR QUALITY PARAMETERS

Box Plot 1: Ozone Concentration


boxplot(airquality$Ozone,
main = "Ozone Concentration",
ylab = "Ozone (ppb)",
col = "lightblue")

Box Plot 2: Solar Radiation


boxplot(airquality$Solar.R,
main = "Solar Radiation",
ylab = "Solar.R",
col = "lightgreen")

Box Plot 3: Wind Speed


boxplot(airquality$Wind,
main = "Wind Speed",
ylab = "Wind (mph)",
col = "orange")
Box Plot 4: Temperature
boxplot(airquality$Temp,
main = "Daily Temperature",
ylab = "Temperature (°F)",
col = "pink")

Combined Box Plot (All Parameters Together)

To compare all parameters side by side:

boxplot(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")],


main = "Air Quality Parameters",
col = c("skyblue", "lightgreen", "orange", "pink"),
ylab = "Values")
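
Because the four parameters are measured in different units, the combined plot mainly compares their magnitudes. A grouped box plot of a single parameter is often more informative. The sketch below groups ozone by month using the formula interface; the choice of Ozone and the variable name median_by_month are illustrative assumptions.

```r
data("airquality")

# Month is coded 5 to 9 (May to September); the formula interface drops NA rows
boxplot(Ozone ~ Month,
        data = airquality,
        names = month.abb[5:9],
        main = "Ozone Concentration by Month",
        xlab = "Month",
        ylab = "Ozone (ppb)",
        col = c("skyblue", "lightgreen", "orange", "pink", "lightgray"))

# Median ozone per month, ignoring missing readings
median_by_month <- tapply(airquality$Ozone, airquality$Month, median, na.rm = TRUE)
median_by_month
```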
5. Visualize the iris dataset using a simple scatter plot and a multivariate scatter plot, and use a scatter plot matrix to compare multiple variables against each other.

The iris dataset contains 4 numerical features:

• Sepal.Length
• Sepal.Width
• Petal.Length
• Petal.Width

and the categorical variable Species.

We will visualize the data using:

1. Simple Scatter Plot
2. Multivariate Scatter Plot (color by species)
3. Scatter Plot Matrix (multiple variables vs each other)

Load Dataset :

data("iris")
head(iris)

1. Simple Scatter Plot (Sepal Length vs Sepal Width)

Code :

plot(iris$Sepal.Length,
iris$Sepal.Width,
main = "Simple Scatter Plot: Sepal Length vs Sepal Width",
xlab = "Sepal Length",
ylab = "Sepal Width",
col = "blue",
pch = 19)

2. Multivariate Scatter Plot (Color by Species)

Code :

plot(iris$Petal.Length,
iris$Petal.Width,
col = iris$Species,
pch = 19,
main = "Multivariate Scatter Plot: Petal Length vs Petal Width",
xlab = "Petal Length",
ylab = "Petal Width")
legend("topleft",
legend = levels(iris$Species),
col = 1:3,
pch = 19)

3. Scatter Plot Matrix (Compare all numeric variables)

Code :

pairs(iris[1:4],
main = "Scatter Plot Matrix for Iris Dataset",
col = iris$Species,
pch = 19)
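
Passing col = iris$Species works because R converts the factor to its integer codes (1, 2, 3) and looks them up in the current color palette. When specific colors are wanted, an explicit lookup vector is one option. The sketch below shows this; the color choices and the names species_cols and point_cols are introduced here for illustration.

```r
data("iris")

# One named color per species
species_cols <- c(setosa = "darkorange",
                  versicolor = "purple",
                  virginica = "forestgreen")

# Look up the color of every flower by its species name
point_cols <- species_cols[as.character(iris$Species)]

plot(iris$Petal.Length,
     iris$Petal.Width,
     col = point_cols,
     pch = 19,
     main = "Petal Length vs Petal Width (explicit species colors)",
     xlab = "Petal Length",
     ylab = "Petal Width")

# The legend reuses the same vector, so it always matches the points
legend("topleft",
       legend = names(species_cols),
       col = species_cols,
       pch = 19)
```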

6) Load the diamonds dataset in R and visualize the structure in datasets with large data points using hexagon binning and also add color palette then use the

Code :

# Load libraries
library(ggplot2) # diamonds dataset + plotting
library(hexbin) # required by geom_hex
library(viridis) # pleasant color palettes (optional)
# install.packages("hexbin"); install.packages("viridis") if needed

# Data: diamonds (comes with ggplot2)


data("diamonds")

# Basic hex-binned plot: price vs carat


p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_hex(bins = 50) + # increase bins for finer detail; try 30, 50, 80
  # Use a perceptually uniform palette (viridis)
  scale_fill_viridis(option = "magma", trans = "log", name = "count\n(log)") +
  labs(
    title = "Hexbin of diamonds: price vs carat",
    subtitle = "Hexagon binning used to visualize dense scatter",
    x = "Carat",
    y = "Price (USD)"
  ) +
  theme_minimal(base_size = 14)

# Print the plot


print(p)

# Save the plot (optional)


ggsave("diamonds_hexbin.png", plot = p, width = 8, height = 5, dpi = 300)

7) Load HairEyeColor dataset in R and plot categorical data using mosaic plot.

Code :

Load the dataset :


data(HairEyeColor)
HairEyeColor

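1. Basic Mosaic Plot (Hair vs Eye Color) :

# HairEyeColor is a 3-way table (Hair x Eye x Sex); sum over Sex to keep Hair and Eye only
hair_eye <- margin.table(HairEyeColor, c(1, 2))

mosaicplot(hair_eye,
main = "Mosaic Plot of Hair and Eye Color",
xlab = "Hair Color",
ylab = "Eye Color",
color = TRUE)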

2. Mosaic Plot grouped by Sex


mosaicplot(HairEyeColor,
main = "Mosaic Plot of Hair, Eye Color, and Sex",
xlab = "Hair Color",
ylab = "Eye Color",
color = c("lightblue", "lightgreen", "lightpink"))
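
A mosaic plot can also indicate which combinations occur more or less often than expected if the variables were independent. The sketch below uses the shade argument of mosaicplot() for this on the two-way Hair by Eye table; the variable name hair_eye is chosen here for illustration.

```r
data(HairEyeColor)

# Collapse the 3-way table (Hair x Eye x Sex) to a 2-way Hair x Eye table
hair_eye <- margin.table(HairEyeColor, margin = c(1, 2))
hair_eye

# shade = TRUE colors each tile by its residual from the independence model:
# tiles are shaded differently for positive and negative residuals
mosaicplot(hair_eye,
           shade = TRUE,
           main = "Hair vs Eye Color (shaded by residuals)",
           xlab = "Hair Color",
           ylab = "Eye Color")
```

Strongly shaded tiles mark hair and eye color combinations whose counts differ most from what independence would predict.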
8) Load mtcars dataset in R and visualize data using heat map.

Code:

Load the dataset

data(mtcars)
mtcars

1. Heat Map (Basic Heatmap of all variables)

We first convert the data into a matrix.

mtcars_matrix <- as.matrix(mtcars)

heatmap(mtcars_matrix,
main = "Heatmap of mtcars Dataset",
xlab = "Car Features",
ylab = "Car Models",
col = heat.colors(256)) # heat.colors() is an assumed palette choice

2. Heat Map using color palette (RColorBrewer)

library(RColorBrewer)
palette <- brewer.pal(9, "YlOrRd")

heatmap(mtcars_matrix,
main = "Heatmap of mtcars with Color Palette",
col = palette,
xlab = "Variables",
ylab = "Cars")
3. Heatmap of Correlation Matrix (Very Common in Labs)

cor_matrix <- cor(mtcars)

heatmap(cor_matrix,
main = "Correlation Heatmap of mtcars",
col = heat.colors(256), # heat.colors() is an assumed palette choice
xlab = "Variables",
ylab = "Variables")
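
By default heatmap() standardizes each row, which for mtcars means each car. Since the columns are measured in very different units, scaling by column is often the more useful view, and a correlation matrix usually needs no scaling at all. The sketch below shows both settings; the blue-white-red palette and the margins values are illustrative assumptions.

```r
data(mtcars)
mtcars_matrix <- as.matrix(mtcars)

# scale = "column" standardizes each variable, so no single unit dominates the colors
heatmap(mtcars_matrix,
        scale = "column",
        col = heat.colors(256),
        margins = c(5, 8),
        main = "Heatmap of mtcars (scaled by column)")

# Correlations already lie between -1 and 1, so scaling is switched off;
# symm = TRUE treats rows and columns of the square matrix alike
cor_matrix <- cor(mtcars)

heatmap(cor_matrix,
        symm = TRUE,
        scale = "none",
        col = colorRampPalette(c("blue", "white", "red"))(256),
        zlim = c(-1, 1),
        main = "Correlation Heatmap of mtcars (unscaled)")
```

Fixing zlim to c(-1, 1) centers the palette on zero, so white corresponds to no correlation.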

9) Install leaflet library in R and perform different map visualizations.

Install and Load Leaflet


# Install (only once)
# install.packages("leaflet")

library(leaflet)
MAP VISUALIZATIONS

Map 1: Basic World Map

leaflet() %>%
addTiles() %>%
setView(lng = 0, lat = 20, zoom = 2)

Map 2: Add a Marker (Example: New York)

leaflet() %>%
addTiles() %>%
addMarkers(lng = -74.006, lat = 40.7128,
popup = "New York City")

Map 3: Add Multiple Markers

cities <- data.frame(
name = c("London", "Tokyo", "Sydney"),
lat = c(51.5074, 35.6895, -33.8688),
lng = c(-0.1278, 139.6917, 151.2093)
)
leaflet(cities) %>%
addTiles() %>%
addMarkers(~lng, ~lat, popup = ~name)

Map 4: Add Circles (Radius Visualization)

Example: Circle around New Delhi

leaflet() %>%
addTiles() %>%
addCircles(lng = 77.1025, lat = 28.7041,
radius = 50000, # 50 km
color = "red",
popup = "New Delhi Circle")

Map 5: Add Polygons (Area Visualization)

Example: triangle region


coords <- list(
c(28.7, 77.1),
c(28.8, 77.2),
c(28.6, 77.25)
)

leaflet() %>%
addTiles() %>%
addPolygons(lng = sapply(coords, "[", 2),
lat = sapply(coords, "[", 1),
color = "blue",
fillColor = "lightblue",
popup = "Polygon Example")

Map 6: Using Different Tile Providers

leaflet() %>%
addProviderTiles(providers$Esri.WorldImagery) %>% # Satellite view
setView(lng = 78, lat = 22, zoom = 4)
10) Visualize iris dataset using 3d graphs such as scatter3d, cloud, xyplot.

Code :

1. Load the iris dataset

data(iris)
head(iris)

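2. 3D scatter plot with plot3d (from the rgl package)

Interactive 3D scatter plot. Note that scatter3d() itself belongs to the car package, which builds on rgl; the code below uses plot3d() from rgl directly.

# install.packages("rgl") # run once
library(rgl)

with(iris, {
plot3d(Sepal.Length, Sepal.Width, Petal.Length,
col = as.numeric(Species), # one color per species
size = 6,
type = "s",
xlab = "Sepal Length",
ylab = "Sepal Width",
zlab = "Petal Length")
})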

3. cloud (from the lattice package)

Static 3D cloud plot grouped by species.

# install.packages("lattice") # usually pre-installed
library(lattice)

# The choice and order of the three variables are assumed (same variables as the plot3d example)
cloud(Petal.Length ~ Sepal.Length * Sepal.Width,
data = iris,
groups = Species,
auto.key = TRUE,
main = "3D Cloud Plot of iris Dataset")

4. xyplot (conditioned lattice scatter)

xyplot is 2D, but conditioning on a factor gives one panel per species, which makes the groups easy to compare.

xyplot(Petal.Length ~ Petal.Width | Species,
data = iris,
groups = Species, # map one color to each species
main = "xyplot: Petal Length vs Petal Width by Species",
xlab = "Petal Width",
ylab = "Petal Length",
pch = 19,
col = c("red", "green", "blue"))
11) Make use of correlogram to visualize data in correlation matrices for iris dataset.

1. Load Dataset

data(iris)
head(iris)

2. Install & Load Required Library

# install.packages("corrplot") # run once


library(corrplot)

3. Prepare Correlation Matrix

We remove the categorical column (Species) because correlation needs numeric data.

iris_numeric <- iris[, 1:4] # only numeric columns


cor_matrix <- cor(iris_numeric)
cor_matrix
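
Selecting columns 1:4 relies on knowing where the numeric columns are. As a sketch of a more general approach, the code below picks the numeric columns programmatically and then reports the most strongly correlated pair; the names is_num, off_diag and strongest are introduced here for illustration.

```r
data(iris)

# TRUE for every numeric column, FALSE for Species
is_num <- sapply(iris, is.numeric)
iris_numeric <- iris[, is_num]

cor_matrix <- cor(iris_numeric)
round(cor_matrix, 2)

# Ignore the diagonal (each variable correlates perfectly with itself)
off_diag <- cor_matrix
diag(off_diag) <- NA

# Position of the largest absolute correlation, then the two variable names
idx <- which(abs(off_diag) == max(abs(off_diag), na.rm = TRUE), arr.ind = TRUE)[1, ]
strongest <- c(rownames(cor_matrix)[idx["row"]], colnames(cor_matrix)[idx["col"]])
strongest
```

The pair found this way should appear as the darkest or largest off-diagonal cell in a correlogram of the same matrix.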

4. Basic Correlogram

corrplot(cor_matrix,
method = "circle",
title = "Correlogram of iris Dataset",
mar = c(0,0,1,0))

5. Correlogram with Color Palette & Numbers

corrplot(cor_matrix,
method = "color",
type = "upper",
addCoef.col = "black", # add correlation values
tl.col = "blue", # label text color
col = colorRampPalette(c("blue", "white", "red"))(200),
title = "Iris Correlation Matrix (Colored)",
mar = c(0,0,2,0))

6. Correlogram Using "pie" Style (Another Option)

corrplot(cor_matrix,
method = "pie",
title = "Correlogram (Pie Method)",
mar = c(0,0,1,0))

12) Install maps library in R and draw different map visualizations.

1. Install and Load Packages

# install.packages("maps") # run once
# install.packages("mapdata") # optional for detailed maps

library(maps)
library(mapdata) # gives higher-resolution maps
MAP VISUALIZATIONS

Map 1: World Map


map("world",
fill = TRUE,
col = "lightblue",
bg = "white")
title(main = "World Map") # map() has no main argument, so the title is added separately

Map 2: Map of India


map("world",
regions = "India",
fill = TRUE,
col = "lightgreen")
title(main = "Map of India")

Map 3: USA Map (States Outlined)


map("state",
fill = FALSE,
col = "blue")
title(main = "USA States Map")

Map 4: Filled USA States Map

map("state",
fill = TRUE,
col = heat.colors(50)) # heat.colors() is an assumed palette choice
title(main = "USA States (Filled Map)")

Map 5: Add Points on a Map

Example: Showing major cities in India.

map("world", regions = "India", fill = TRUE, col = "lightyellow")
title(main = "India Map with Major Cities")

points(c(77.2090, 72.8777, 88.3639), # longitudes (Delhi, Mumbai, Kolkata)


c(28.6139, 19.0760, 22.5726), # latitudes
col = "red", pch = 19, cex = 1.5)

text(77.2090, 28.6139, labels = "Delhi", pos = 4)


text(72.8777, 19.0760, labels = "Mumbai", pos = 4)
text(88.3639, 22.5726, labels = "Kolkata", pos = 4)

Map 6: High-Resolution Map Using mapdata (Optional)

map("worldHires",
regions = "Japan",
fill = TRUE,
col = "lightpink")
title(main = "High-Resolution Map of Japan")
