Visualizing VADeaths and Air Quality Data

Uploaded by

k.abhiram4363
Copyright
© All Rights Reserved

DATA VISUALIZATION LAB OBSERVATION

EXPERIMENT 1(a)

Experiment Title

Load the VADeaths Dataset in R and Visualize Death Rates Using Different Histograms

Aim

To load the built-in VADeaths dataset in R, convert it into a data frame, and visualize death
rates of different population groups using basic histograms.

Objectives

1. To examine the structure and values of the VADeaths dataset.


2. To convert a matrix into a data frame for easier manipulation.
3. To generate a basic histogram for all death-rate values.
4. To create individual histograms for Rural Male, Rural Female, Urban Male, and Urban
Female.
5. To compare and interpret death-rate distributions across demographic categories.

Software / Tools Required

• R / RStudio
• Google Colab
o Switch to the R runtime: Runtime → Change runtime type → select R
o OR, in a Python runtime, run %load_ext rpy2.ipython once and then use the %%R cell magic to run R code

About the Experiment (In Record)

Procedure / Algorithm

Step 1: Start R or RStudio.

Open an R session with a clear workspace.

Step 2: Load the VADeaths dataset.

Use data("VADeaths") to load the built-in dataset.


Step 3: Display the dataset.

Print the dataset using print(VADeaths) to examine age groups and categories.

Step 4: Convert the dataset to a data frame.

Use va_df <- as.data.frame(VADeaths) for easier access to columns.

Step 5: Check the structure of the data frame.

Use str(va_df) to verify column names and data types.

Step 6: Prepare layout for multiple histograms.

Use par(mfrow = c(2, 2)) to divide the screen into 4 panels.

Step 7: Plot individual histograms.

Use hist() to draw separate histograms for:

• Rural Male
• Rural Female
• Urban Male
• Urban Female

Ensure each plot has a title and axis labels.

Step 8: Reset plotting window.

Use par(mfrow = c(1, 1)) to return to normal layout.

Step 9: Interpret the results.

Observe differences between urban vs rural and male vs female groups.
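
The comparison in the final step can be backed with numbers before reading the plots. The following is a minimal sketch, assuming the built-in VADeaths matrix; the names group_means and sex_gap are illustrative and not part of the original record.

```r
# Illustrative check for the interpretation step (variable names are hypothetical)
data("VADeaths")
va_df <- as.data.frame(VADeaths)

# Mean death rate of each group across the five age bands
group_means <- colMeans(va_df)
print(round(group_means, 2))

# Male minus female mean rate, within rural and within urban
sex_gap <- c(Rural = group_means[["Rural Male"]] - group_means[["Rural Female"]],
             Urban = group_means[["Urban Male"]] - group_means[["Urban Female"]])
print(round(sex_gap, 2))
```

A positive gap means the male group has the higher average rate; the histograms then show how those rates are spread across age bands.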

Code Using R

Step 1: Load VADeaths dataset

data("VADeaths")

Step 2: Display the dataset


print("VADeaths Dataset:")
print(VADeaths)

Step 3: Convert matrix to data frame

va_df <- as.data.frame(VADeaths)

Step 4: View structure

str(va_df)

par(mfrow = c(1, 1)) # single panel for the combined histogram

Step 5: Basic histogram (all values)

all_values <- as.numeric(unlist(va_df)) # flatten all 20 rates into one numeric vector

hist(all_values,
     main = "Basic Histogram of All Death Rates",
     xlab = "Death Rate",
     col = "lightblue",
     border = "black")

Step 6: Individual histograms (4 categories)

par(mfrow = c(2, 2)) # 4-panel layout

hist(va_df$`Rural Male`,
     main = "Rural Male Death Rate",
     xlab = "Death Rate",
     col = "skyblue",
     border = "black")

hist(va_df$`Rural Female`,
     main = "Rural Female Death Rate",
     xlab = "Death Rate",
     col = "lightgreen",
     border = "black")

hist(va_df$`Urban Male`,
     main = "Urban Male Death Rate",
     xlab = "Death Rate",
     col = "salmon",
     border = "black")

hist(va_df$`Urban Female`,
     main = "Urban Female Death Rate",
     xlab = "Death Rate",
     col = "orchid",
     border = "black")

Explanation:

Step 7: Reset layout

par(mfrow = c(1, 1))
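
The four hist() calls differ only in the column, title, and colour, so they can also be drawn in a loop. This is a sketch of one possible alternative, not the method prescribed by the record; fill_cols and old_par are illustrative names. It relies on par() returning the previous settings, so the layout can be restored to whatever was active before rather than to a hard-coded c(1, 1).

```r
# Sketch: the same four histograms drawn in a loop (names are illustrative)
data("VADeaths")
va_df <- as.data.frame(VADeaths)

fill_cols <- c("skyblue", "lightgreen", "salmon", "orchid")

old_par <- par(mfrow = c(2, 2))   # par() returns the settings it replaced
for (i in seq_along(va_df)) {
  hist(va_df[[i]],
       main   = paste(names(va_df)[i], "Death Rate"),
       xlab   = "Death Rate",
       col    = fill_cols[i],
       border = "black")
}
par(old_par)                      # restore the layout that was active before
```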

Expected Output (write the Observations)


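When the program is executed in R, it prints the VADeaths matrix and the structure of va_df,
then draws one combined histogram of all death rates followed by four category-wise
histograms arranged in a 2 x 2 grid.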

Result

The VADeaths dataset was successfully loaded and visualized using basic R histograms. A
combined histogram and four category-wise histograms were generated.

b) Load the airquality dataset in R and visualize La Guardia Airport’s daily
maximum temperature using a histogram.

EXPERIMENT 1(b)
Experiment Title
Load the Air Quality Dataset in R and Visualize La Guardia Airport’s Daily Maximum
Temperature Using Histogram

Aim
To load the built-in airquality dataset in R and visualize the daily maximum temperature
recorded at La Guardia Airport using a histogram.

Objectives

1. To load and inspect the structure of the air quality dataset in R.


2. To extract the La Guardia Airport daily maximum temperature data.
3. To generate and analyze the histogram for temperature distribution.

Software / Tools Required


• R / RStudio
• Google Colab

About the Experiment (In Record)

Procedure / Algorithm
Step 1: Start R or RStudio. Open an R session with a clear workspace.
Step 2: Load the air quality dataset using data("airquality").
Step 3: Display the dataset using print(airquality) to inspect values.
Step 4: Check the structure using str(airquality).
Step 5: Plot histogram of the Temp column to visualize daily max temperature.
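
Before plotting, it may help to confirm that the Temp column is numeric and has no missing readings, since other columns of airquality do contain NA values. A minimal sketch using base R only; temp is an illustrative variable name.

```r
# Sketch: quick checks on the Temp column before drawing the histogram
data("airquality")
temp <- airquality$Temp      # daily maximum temperature in degrees Fahrenheit

str(temp)                    # type and first few values
print(sum(is.na(temp)))      # number of missing readings in Temp
print(summary(temp))         # minimum, quartiles, mean, maximum
```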

Code Using R
Step 1: Load airquality dataset

data("airquality")

Step 2: Display the dataset

print("Air Quality Dataset:")


print(airquality)

Step 3: View structure

str(airquality)

Step 4: Plot histogram of La Guardia Airport daily maximum temperature

hist(airquality$Temp,
     main = "Histogram of La Guardia Daily Max Temperature",
     xlab = "Temperature (°F)")
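
hist() also returns the bins it computed, which can be read as a table alongside the plot. The sketch below uses plot = FALSE to get that object without drawing; h and bins are illustrative names.

```r
# Sketch: tabulate the bins that hist() chooses for Temp
data("airquality")
h <- hist(airquality$Temp, plot = FALSE)   # compute bins without drawing

bins <- data.frame(from  = head(h$breaks, -1),   # lower edge of each bin
                   to    = tail(h$breaks, -1),   # upper edge of each bin
                   count = h$counts)             # days falling in each bin
print(bins)
```

The count column sums to the number of days in the dataset, so each row gives the height of one bar in the histogram.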

Expected Output (write the Observations)


When the program is executed in R, the histogram of La Guardia Airport’s daily maximum
temperature shows the following:

Observations

• The temperature values are distributed approximately between 56°F and 97°F.
• The histogram exhibits a slight negative (left) skew, indicating a greater number of
warmer days than cooler days.
• The most frequent temperature range occurs roughly between 70°F and 90°F.
• A clear peak appears around 80–85°F, marking the most commonly recorded daily
maximum temperature zone.
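
The range and skew noted above can be checked numerically. This is a minimal sketch, assuming the built-in airquality data; temp and share_70_90 are illustrative names. A mean below the median is used here only as a rough indicator of left skew.

```r
# Sketch: numeric checks behind the observations
data("airquality")
temp <- airquality$Temp

print(range(temp))                                    # lowest and highest readings
print(c(mean = mean(temp), median = median(temp)))    # mean below median suggests left skew

share_70_90 <- mean(temp >= 70 & temp <= 90)          # fraction of days within 70 to 90 °F
print(round(share_70_90, 2))
```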

Result
The airquality dataset was successfully loaded and the La Guardia Airport daily maximum
temperature was visualized using a histogram in R.
