
R Data Visualization Techniques

The document provides an overview of data visualization techniques in R, including scatter plots, histograms, and box plots. It details how to load datasets, plot data, customize plots, and save graphics output. Additionally, it includes hands-on practice examples and solutions for creating specific visualizations using the ChickWeight dataset.

DATA VISUALIZATION
GRAPHICS IN R
Agenda:
◦Scatter Plot
◦Histogram
◦Box Plot
◦Customizing Plots
Example Data
To view available datasets, type:
>data()

To load a dataset into memory, type data(name of data set):
>data(ChickWeight)

To view the loaded dataset, type its name:
>ChickWeight
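Before plotting, it can help to check the shape and column names of the loaded dataset. The following is a minimal sketch using base R only. Note that the weight column is spelled in lowercase, and R is case-sensitive:

```r
# Load the built-in dataset and inspect its structure
data(ChickWeight)

dim(ChickWeight)      # number of rows and columns
names(ChickWeight)    # column names: weight, Time, Chick, Diet
head(ChickWeight, 3)  # first three observations

# Column names are case-sensitive: $weight exists, $Weight does not
is.null(ChickWeight$Weight)  # TRUE
```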
The Data File
   weight Time Chick Diet
1      42    0     1    1
2      51    2     1    1
3      59    4     1    1
4      64    6     1    1
5      76    8     1    1
6      93   10     1    1
7     106   12     1    1
8     125   14     1    1
9     149   16     1    1
10    171   18     1    1
Summary of the Data File
>summary(ChickWeight)
     weight          Time
 Min.   : 35    Min.   : 0.00
 1st Qu.: 63    1st Qu.: 4.00
 Median :103    Median :10.00
 Mean   :121    Mean   :10.72
 3rd Qu.:163    3rd Qu.:16.00
 Max.   :373    Max.   :21.00
Graphics in R
plot() is the main graphing function
Automatically produces simple plots for vectors, functions or data frames
Many useful customization options…
Plotting a Vector
plot(v) plots the elements of the vector v against their index
# Plot weight for each observation
> plot(ChickWeight$weight)
# Plot the weights in ascending order, so the x axis shows each weight's rank
> plot(sort(ChickWeight$weight))
[Figures: plot(ChickWeight$weight) and plot(sort(ChickWeight$weight))]
Common Parameters for plot()
Specifying labels:
main – provides a title
xlab – label for the x axis
ylab – label for the y axis
Specifying range limits:
ylim – 2-element vector gives range for y axis
xlim – 2-element vector gives range for x axis
A Labeled Plot
> plot(ChickWeight$weight, ylim = c(50,200), ylab = "Weight",
       xlab = "Rank", main = "Distribution of Weights")
Plotting Two Vectors
◦plot() can pair elements from 2 vectors to produce x-y coordinates
◦You can exclude columns of a dataset from a plot using negative indices:
> plot(datasetname[-c(1,2)])
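As an illustration of negative indexing, the sketch below drops the third and fourth columns of ChickWeight (the choice of columns is only an example) and plots what remains. With two numeric columns left, plot() draws a scatter plot of the first against the second:

```r
data(ChickWeight)

# Negative indices drop columns by position; here Chick and Diet are removed
kept <- ChickWeight[-c(3, 4)]
names(kept)  # "weight" "Time"

# Two remaining columns: the first goes on the x axis, the second on the y axis
plot(kept)
```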
Plotting Two Vectors
# Diet is a factor, so plot() draws one box plot per diet
> plot(ChickWeight$Diet, ChickWeight$weight, xlab = "diet", ylab = "Weight",
       main = "Type of Diet Effect on Weight", col = "blue")
Plotting Contents of a Dataset
> plot(ChickWeight)
Plotting Contents of a Dataset
> plot(ChickWeight[-c(1)])
Histograms
◦A diagram consisting of rectangles whose area is proportional to the frequency of a variable
◦Histograms effectively only work with one variable at a time.
◦The parameter breaks is key:
- A single number suggests the number of categories to plot; R may adjust it to get round breakpoints
- A vector specifies the breakpoints for each category
- Either way, it controls the number of bars, cells or bins of the histogram.
◦The xlab, ylab, xlim, ylim options work as expected
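The two ways of using breaks can be compared without drawing anything. This is a minimal sketch: the bin width of 50 is an arbitrary choice for illustration, and plot = FALSE makes hist() return the binning instead of plotting it:

```r
data(ChickWeight)
w <- ChickWeight$weight

# A single number is only a suggestion; R picks "pretty" breakpoints near it
h_suggest <- hist(w, breaks = 8, plot = FALSE)

# A vector gives the exact breakpoints
h_exact <- hist(w, breaks = seq(0, 400, by = 50), plot = FALSE)

h_suggest$breaks        # breakpoints R actually chose
length(h_exact$counts)  # 8 bins, one per 50 g interval
```

When passing a vector, the breakpoints must span the full range of the data, otherwise hist() stops with an error.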
Histograms
> hist(ChickWeight$weight, col = "lightblue", xlab = "Weight",
       main = "Weight Histogram")
Histograms With Breaks
> hist(ChickWeight$weight, col = "lightblue", xlab = "Weight",
       main = "Weight Histogram", breaks = seq(0, 400, by = 10))
Boxplots
Generated by the boxplot() function
Draws a plot summarizing:
◦Median
◦Quartiles (Q1, Q3)
◦Outliers – by default, observations more than 1.5 * (Q3 – Q1) beyond the nearest quartile
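The outlier rule can be checked by hand. The sketch below assumes the default multiplier of 1.5 and uses fivenum(), whose hinges are what boxplot() uses for the box; they can differ slightly from the quartiles returned by quantile():

```r
data(ChickWeight)
w <- ChickWeight$weight

hinges <- fivenum(w)[c(2, 4)]  # lower and upper hinge (approximately Q1 and Q3)
iqr    <- diff(hinges)         # Q3 - Q1
lower  <- hinges[1] - 1.5 * iqr
upper  <- hinges[2] + 1.5 * iqr

# Observations outside the fences are drawn as individual points
manual_outliers <- w[w < lower | w > upper]

# boxplot.stats() does the computation behind boxplot(); both should agree
all.equal(sort(manual_outliers), sort(boxplot.stats(w)$out))
```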
Boxplot
> boxplot(ChickWeight, col = rainbow(6), ylab = "ChickWeight Boxplot")
[Figure: box plots of the ChickWeight columns, with the outliers marked]
Boxplot for Weight
◦rug() can add a tick for each observation to the side of a boxplot() and other plots.
◦The side parameter specifies where tick marks are drawn.
> boxplot(ChickWeight$weight, col = rainbow(6), ylab = "ChickWeight Boxplot")
> rug(ChickWeight$weight, side = 2)
Customizing Plots
R provides a series of functions for adding text, lines and points to a plot
We will illustrate some useful ones, but look at demo(graphics) for more examples
The demo pauses between plots: press <Return> (Enter) to see the next one
Drawing on a plot
To add additional data use
- points(x,y)
- lines(x,y)
For freehand drawing use
- polygon()
- rect()
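The functions listed above all draw on top of an existing plot, so plot() has to be called first. A minimal sketch with made-up coordinates chosen only for illustration:

```r
x <- seq(0, 2 * pi, by = 0.1)

# Start with a line plot of sin(x)
plot(x, sin(x), type = "l", ylim = c(-1, 1), ylab = "y")

# Overlay a second series as points
points(x, cos(x), col = "red", pch = 16)

# Shade a triangle given by the coordinates of its corners
polygon(x = c(1, 2, 1.5), y = c(-0.5, -0.5, 0), col = "lightblue", border = "blue")

# Outline a rectangle: left, bottom, right, top
rect(4, 0.2, 5, 0.8)
```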
Text Drawing
Two commonly used functions:
- text() – writes inside the plot region, could be used to label datapoints
- mtext() – writes on the margins
Plotting Two Data Series
> x <- seq(0,2*pi, by = 0.1)
> y <- sin(x)
> plot(x,y, col = "green", type = "l", lwd = 3)
> y1 <- cos(x)
> lines(x,y1, col = "red", lwd = 3)
> mtext("Sine and Cosine Plot", side = 3, line = 1)
Adding a Label & Rectangle
> rect(0,-1,2,0.5)
> text(1,0.6, "label here")
Multiple Plots on a Page
◦Set the mfrow or mfcol option with par()
◦Each takes a 2-element vector as an argument
- The first value specifies the number of rows
- The second specifies the number of columns
◦The 2 options differ in the order individual plots are drawn: mfrow fills the grid row by row, mfcol column by column
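The difference in fill order is easiest to see with numbered placeholder plots. This sketch uses a 2 x 2 grid and restores the previous layout at the end, since par() settings otherwise stay in effect for later plots:

```r
# par() returns the old values of the settings it changes
op <- par(mfrow = c(2, 2))
for (i in 1:4) plot(1, main = paste("mfrow: plot", i))  # fills row by row

par(mfcol = c(2, 2))
for (i in 1:4) plot(1, main = paste("mfcol: plot", i))  # fills column by column

par(op)  # back to the previous layout
```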
Multiple Plots on a Page
> par(mfcol = c(3,1))
> hist(ChickWeight$weight*1000, breaks = 10, main = "Weight (in mg)", xlab = "Weight")
> hist(ChickWeight$weight, breaks = 10, main = "Weight (in g)", xlab = "Weight")
> hist(ChickWeight$weight/1000, breaks = 10, main = "Weight (in kg)", xlab = "Weight")
Saving R Plots
◦R usually generates output to the screen
◦R can also save its graphics output in a file that you can distribute or include in a document prepared with Word or LaTeX. In the graphics window, use File -> Save As
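Plots can also be saved from code by opening a file-based graphics device, drawing, and closing the device. The sketch below uses pdf(); png() and jpeg() follow the same pattern. The temporary file name and the page size are assumptions for illustration only:

```r
data(ChickWeight)

# tempfile() is used here only so the sketch does not overwrite anything;
# in practice you would pass your own file name
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, width = 6, height = 4)  # open a PDF device (size in inches)
hist(ChickWeight$weight, col = "lightblue", xlab = "Weight",
     main = "Weight Histogram")
dev.off()                             # close the device to finish writing the file

file.exists(out_file)
```

Nothing appears on screen while the file device is active; output returns to the screen once dev.off() has been called.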
Hands On
Plot two graphs as follows:
◦Plot 1: Histogram of the ChickWeight weight vector (in kg instead of grams), with the main label "Weight in Kgs" and 8 breaks.
◦Plot 2: Box plot of the weight vector in kg, in red.
◦Both plots in the same row
Solution
> par(mfrow = c(1,2))
> hist(ChickWeight$weight/1000, breaks = 8, main = "Weight in Kgs", xlab = "Weight")
> boxplot(ChickWeight$weight/1000, col = "red")
Practice
Note:
cex = n draws plotting symbols and text n times the default size
pch selects the plotting symbol
Practice
> f1 <- function() {
    x1 <- rep(1:5, times = 5)
    # 1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5
    y1 <- rep(1:5, each = 5)
    # 1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5
    # bg sets the fill color, which only symbols 21 to 25 use
    plot(x1, y1, pch = 1:25, cex = 3, bg = "red", main = "Data symbols 1:25")
  }
> f1()
