Visualization in R
Base Graphics
ggplot2()
Visualization: Base Graphics
Data Visualization: Base Graphics
plot(): Bivariate relationships and a continuous variable across groups
Univariate analysis: hist(), boxplot()
Making more than one plot using par(mfrow = ...)
Base Graphics: plot()
Plot
The package used for Base Graphics in R is “graphics”
• The plot function can be used for plotting
• numeric variables
• character and factor variables
• Scatter plots
• Entire dataset
Plot : scatter plot
• type :
• the type of plot
• Some of the types are :
• p = points
• l = lines
• b = both points and lines
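A minimal sketch of the three type values listed above. The x and y vectors are made-up illustration data, not part of the original slides.

```r
# Illustrative data (made up for this sketch)
x <- 1:10
y <- x^2

# Same data drawn with each of the three types
plot(x, y, type = "p", main = "type = p: points")
plot(x, y, type = "l", main = "type = l: line")
plot(x, y, type = "b", main = "type = b: points and line")
```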
Plot : pch
• pch :
• Sets the plotting symbol
• Values = 0:25 (integer symbol codes)
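To see what each symbol code looks like, a chart along these lines can help. It is only a sketch: it draws the integer codes 0 through 25 documented in ?points, and the label positions and sizes are arbitrary choices.

```r
# One point per symbol code, each drawn with its own pch value
codes <- 0:25
plot(codes, rep(1, length(codes)), pch = codes, cex = 1.5,
     ylim = c(0.5, 1.5), yaxt = "n",
     xlab = "pch value", ylab = "", main = "Plotting symbols")

# Print the code above each symbol
text(codes, 1.25, labels = codes, cex = 0.7)
```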
Studying Univariates
cex
Box and whiskers plot
• Boxplots are useful for studying the distribution of a variable
• Also useful for detecting outliers
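A short sketch of both uses. The sample x is simulated for illustration (two extreme values are added on purpose), and mtcars is a built-in dataset used as a stand-in.

```r
# Simulated sample: 50 ordinary values plus two deliberately extreme ones
set.seed(1)
x <- c(rnorm(50, mean = 10, sd = 1), 20, 25)

# boxplot() draws the plot and invisibly returns its statistics
b <- boxplot(x, main = "Box and whiskers plot", ylab = "x")
b$out  # points drawn beyond the whiskers (candidate outliers)

# A continuous variable across groups, using the formula interface
g <- boxplot(mpg ~ cyl, data = mtcars,
             xlab = "Cylinders", ylab = "Miles per gallon")
```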
Plot : Factor variables
Plot : Entire Dataframe
Calling plot() on an entire data frame generates pairwise displays (a scatter plot matrix)
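How plot() reacts to different inputs can be sketched with the built-in iris data (a stand-in chosen for this example, not a dataset from the slides):

```r
# Factor variable: plot() draws a bar chart of level counts
plot(iris$Species, main = "Counts per species")

# Factor on x, numeric on y: plot() draws one boxplot per level
plot(iris$Species, iris$Sepal.Length, main = "Sepal length by species")

# Entire data frame of numeric columns: plot() draws pairwise scatter plots
plot(iris[, 1:4], main = "Pairwise displays")
```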
par()
• par() function is useful for setting additional graphics parameters
• Some of the parameters this function has are :
• mfrow
• [Link]
• [Link]
To see the entire list of graphics parameters in par() that can be altered:
?par
par(): multi-plotting
mfrow :
A parameter of par(), not a function. Setting par(mfrow = c(rows, columns)) controls the number of rows and columns, allowing you to put multiple plots on a page
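A minimal sketch of multi-plotting, assuming the built-in mtcars data as a stand-in. Saving the value returned by par() makes it possible to restore the previous layout afterwards.

```r
# Ask for a 1-row, 2-column layout; par() returns the old settings
old_par <- par(mfrow = c(1, 2))

hist(mtcars$mpg, main = "Histogram of mpg", xlab = "mpg")
boxplot(mtcars$mpg, main = "Boxplot of mpg")

layout_now <- par("mfrow")  # layout in effect while the plots were drawn
par(old_par)                # restore the previous layout
```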
Histograms
Histograms give the frequency distribution of a variable.
The hist() function is used
breaks : The suggested number of bins (or a vector of break points)
labels : Labels each bin with its frequency
xlim : Sets the range for the x-axis
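The arguments above can be tried out as follows. This is a sketch on the built-in mtcars data (a stand-in). Note that hist() treats a single number given to breaks as a suggestion, so the resulting number of bins may differ.

```r
x <- mtcars$mpg

# breaks: suggested number of bins; labels = TRUE prints each bin's count;
# xlim: range shown on the x-axis
h <- hist(x, breaks = 10, labels = TRUE, xlim = c(5, 40),
          main = "Histogram of mpg", xlab = "Miles per gallon")

h$breaks  # bin boundaries actually used
h$counts  # frequency in each bin
```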
Histogram : freq
freq :
• freq=TRUE (default) means frequency distribution (counts)
• freq=FALSE means probability density
Histogram : adding density lines
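One common way to add a density line is to draw the histogram on the density scale and overlay a kernel density estimate with lines(). A sketch, again on the built-in mtcars data as a stand-in:

```r
x <- mtcars$mpg

# freq = FALSE puts the y-axis on the density scale (bar areas sum to 1)
h <- hist(x, freq = FALSE, main = "Histogram with density line",
          xlab = "Miles per gallon")

# Overlay a kernel density estimate on the same scale
lines(density(x), lwd = 2)
```

The overlay only lines up because both the bars and the curve are densities; with freq=TRUE the curve would sit near zero.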
Visualization: ggplot2()
ggplot2(): What and Why
ggplot2(): Architecture : Understanding Grammar of Graphics
ggplot2(): Common plots
Base graphics: Good for simple tasks, comparatively difficult syntax
ggplot2: Based on the grammar of graphics, simple syntax, interfaces with ggmap and other packages
Grammar of Graphics
“Grammar of graphics”
A plot is composed of: Aesthetic Mapping, Geoms, Statistical Transformations, Coordinate Systems and Scales
Components                     Description
Aesthetic Mapping              Which component of the data appears on the X axis and Y axis; how the colour, size, fill and position of elements relate to the data
Geoms (Geometrical Objects)    Which geometrical objects appear on the plot: points, lines, polygons, area, boxplot, rectangle, tile etc.
Statistical Transformations    Compute density, counts (Histogram: need to bin and count data)
Scales and Coordinate System   Discrete or continuous scale. Cartesian or polar coordinates.
temp   dewpoint   season
12     30         Autumn
34     28         Spring
…      …          Summer
0      0          Winter
…      …          …
Based on “grammar of graphics”
Components: Aesthetics, Geoms and Statistical
Transformations
Aesthetics:
Axis Mappings: X=temp, Y=dewpoint
Colour: Seasons
Geoms:
Points (Scatter plot)
Bars, Lines, Polygons, Area, Density, Boxplots…
Statistical Transformation
Identity (none)
Geoms:
Density
Aesthetics:
Axis Mappings:
X=temp,
Y= density
Fill: Seasons
Statistical Transformation:
Density computation
Geoms:
?
Aesthetics:
Axis Mappings:
X=?,
Y= ?
Fill: ?
Statistical Transformation:
?
Geoms:
Bar
Aesthetics:
Axis Mappings:
X=temp
Y= Counts
Fill: Pink
Statistical Transformation:
Counts in binned data
Geoms:
?
Aesthetics:
Axis Mappings:
X=?,
Y= ?
Fill: ?
Statistical Transformation:
?
Position:?
Geoms:
Bars
Aesthetics:
Axis Mappings:
X=temp
Y= Counts
Fill: seasons
Statistical Transformation:
Count
Position: Bars arranged side by side (Dodged)
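The last example (bars, binned counts, fill by a grouping variable, dodged position) could be coded roughly as follows. The slides' weather data is not available here, so this sketch assumes the built-in airquality data, with Temp standing in for temp and Month standing in for season.

```r
library(ggplot2)

# Geom: bars; stat: binned counts; fill mapped to a grouping variable;
# position = "dodge" places the groups' bars side by side
p <- ggplot(airquality, aes(x = Temp, fill = factor(Month))) +
  geom_histogram(bins = 10, position = "dodge") +
  labs(fill = "Month")
print(p)
```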
How to code the grammar?
Setting up aesthetic maps:
ggplot(data, aes(x = <variable to be mapped to x axis>, y = <variable to be mapped to y axis>, colour = <variable by which colour should change>))
Adding a layer with a geom:
p + geom_<type of geometry>(stat = "<statistical transformation on data>")
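Put together, the two steps might look like this. It is a sketch on the built-in airquality data (an assumption, chosen because it resembles the temp/season example), with Month standing in for season.

```r
library(ggplot2)

# Step 1: set up the aesthetic maps (no layer yet, so nothing is drawn)
p <- ggplot(airquality, aes(x = Temp, y = Wind, colour = factor(Month)))

# Step 2: add a layer with a geom; "identity" is already the default
# stat for geom_point and is spelled out only for illustration
scatter <- p + geom_point(stat = "identity") + labs(colour = "Month")
print(scatter)
```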
Geom             Default Stat   Default Aesthetics
geom_point       "identity"     colour, fill, shape, size, x, y
geom_histogram   "bin"          colour, fill, linetype, size, weight, x
geom_density     "density"      colour, fill, linetype, size, weight, x, y
geom_polygon     "identity"     colour, fill, linetype, size, x, y
geom_line        "identity"     colour, linetype, size, x, y
geom_tile        "identity"     colour, fill, linetype, size, x, y
geom_boxplot     "boxplot"      colour, fill, lower, middle, size, upper, weight, x, ymax, ymin
*Some aesthetics are required; the others are optional and have default values or are computed by the default stat transform
Creating Common Plots
• Direct Marketing dataset with demographic information such as age, location and income level, and information on the amount of money spent by the individual
• Following are the variables:
• Age
• Gender
• Own Home
• Married
• Location
• Salary
• Children
• History
• Catalogs
• Amount Spent
• Bivariate Relationship: Scatter plot
Geoms:
Point: geom_point()
Aesthetics:
Axis Mappings:
X=X variable
Y=Y variable
Fill, Color, Size…optional
Statistical Transformation
Identity (No change)
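A scatter plot following this recipe can be sketched as below. The Direct Marketing file is not included with these notes, so the built-in mtcars data is used as a stand-in, and colour is one of the optional aesthetics.

```r
library(ggplot2)

# x and y are mapped to two numeric variables; colour is optional
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(colour = "cyl")
print(p)
```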
• Understanding univariates : Histograms
Geoms:
Bars: geom_histogram()
Aesthetics:
Axis Mappings:
X=X variable
Y= Count of data in Bins
Fill, Colour, …optional
Statistical Transformation
bin (Binned count of observations, to be shown on Y axis)
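A minimal histogram sketch, using the built-in mtcars data as a stand-in for the Direct Marketing data. The choice of 10 bins and the colours are arbitrary.

```r
library(ggplot2)

# Only x is mapped; the "bin" stat computes the counts shown on the y-axis
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10, fill = "steelblue", colour = "white")
print(p)
```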
• Understanding univariates : Box and Whiskers
Geoms:
Boxplot: geom_boxplot()
Aesthetics:
Axis Mappings:
X=X variable (A factor variable)
Y=The variable whose distribution we are interested in
Boxplot statistics: lower, middle, upper, ymax, ymin
Colour, fill
Statistical Transformation
Boxplot statistics (To be shown on Y axis)
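The boxplot recipe can be sketched like this, again assuming the built-in mtcars data as a stand-in. layer_data() is used only to peek at the statistics the "boxplot" stat computed.

```r
library(ggplot2)

# x must be a factor (one box per level); y is the variable of interest
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot() +
  labs(x = "cyl", fill = "cyl")
print(p)

# The computed boxplot statistics, one row per box
stats <- layer_data(p)
stats[, c("ymin", "lower", "middle", "upper", "ymax")]
```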
• Understanding univariates : Density plots
Geoms:
geom_density()
Aesthetics:
Axis Mappings:
X=Variable whose density we are interested in.
Y=Density measurements
Colour, fill…..
Statistical Transformation
Density (To be shown on Y axis)
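A density plot sketch on the built-in mtcars data (a stand-in). The alpha value is an arbitrary choice that keeps overlapping groups visible.

```r
library(ggplot2)

# Only x is mapped; the "density" stat computes the y values.
# Mapping fill to a factor draws one density curve per group.
p <- ggplot(mtcars, aes(x = mpg, fill = factor(cyl))) +
  geom_density(alpha = 0.5) +
  labs(fill = "cyl")
print(p)
```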
• Understanding bivariate counts : 2D binned plots, 2D heatmaps
Geoms:
geom_bin2d()
Aesthetics:
Axis Mappings:
X= X variable
Y= Y variable
Colour, fill…..
Statistical Transformation
bin2d (Counts of observations in 2D bins, shown as fill)
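A 2D heatmap sketch using the diamonds data that ships with ggplot2 (a stand-in for the Direct Marketing data). The choice of 30 bins per axis is arbitrary.

```r
library(ggplot2)

# Each rectangle is one 2D bin; its fill shows how many rows fall in it
p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 30)
print(p)
```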