What Is R?
(Recap)
R is a programming language designed for statistics, data analysis, and
visualization.
It's free, open-source, and highly extensible with thousands of packages.
R is often used in:
• Research
• Business analytics
• Bioinformatics
• Machine learning
• Academia
Why Choose R Programming?
R offers a wide range of features for data analysis, which makes it a widely used tool for
professionals in various fields. Here’s why R is preferred:
Free and Open-Source: R is free to download and use, and its source code is open, so
users can inspect, modify and redistribute it under the GNU General Public License.
Designed for Data: R is built for data analysis, offering a comprehensive set of tools
for statistical computing and graphics.
Large Package Repository: The Comprehensive R Archive Network (CRAN) offers
thousands of add-on packages for specialized tasks.
Cross-Platform Compatibility: R can work on Windows, Mac and Linux operating
systems.
Great for Visualization: With packages like ggplot2, R makes it easy to create
informative, publication-quality charts and plots; packages such as plotly add
interactivity.
Key Features of R
Cross-Platform Support: R works on multiple operating systems, making it versatile
for different environments.
Interactive Development: R allows users to interactively experiment with data and see
the results immediately.
Data Wrangling: Tools like dplyr and tidyr help simplify data cleaning and
transformation.
Statistical Modeling: R has built-in support for various statistical models like
regression, time-series analysis and clustering.
Reproducible Research: With R Markdown, users can combine code, output and
narrative in one document, ensuring their analysis is reproducible.
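As a small illustration of the interactive workflow and built-in statistical modeling
listed above, the following sketch fits a linear regression with lm() on cars, a dataset
that ships with R. The choice of dataset and the variable name model are assumptions made
only for this example.

```r
# 'cars' is a small built-in dataset: speed (mph) and stopping distance (ft)
head(cars)

# Fit a simple linear regression: stopping distance explained by speed
model <- lm(dist ~ speed, data = cars)

# Inspect the estimated intercept and slope
coef(model)

# Fuller report: coefficients, residuals and R-squared
summary(model)
```

Each line can be run on its own at the console, and its result appears immediately.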
Applications of R
R is used in a variety of fields, including:
Data Science and Machine Learning: R is widely used for data analysis, statistical
modeling and machine learning tasks.
Finance: Financial analysts use R for quantitative modeling and risk analysis.
Healthcare: In clinical research, R helps analyze medical data and test hypotheses.
Academia: Researchers and statisticians use R for data analysis and publishing
reproducible research.
Advantages of R Programming
Comprehensive Statistical Tools: R includes many statistical functions and models,
making it a strong choice for data analysis.
Customizable Visualizations: R’s visualization tools allow fine-grained customization of
anything from a simple bar chart to a detailed heatmap.
Extensive Community Support: R has a large user base and there are countless
resources, forums and tutorials available.
Highly Extendable: The availability of over 15,000 R packages means we can extend
R's functionality to suit a wide range of projects and needs.
Disadvantages of R Programming
Memory Intensive: R generally loads data entirely into memory (RAM), so very large
datasets can consume a lot of memory and slow processing down.
Limited Support for Error Handling: R’s error handling (built around tryCatch(), stop()
and warning()) is often considered less convenient than the mechanisms in some other
programming languages.
Steeper Learning Curve: Beginners might face challenges with some of R’s complex
features and syntax.
Performance: R’s interpreted code can lag behind compiled languages like C++, and
sometimes Python, when it comes to speed, especially for large-scale operations written
as explicit loops.
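Error handling in base R is typically written with tryCatch(). The sketch below is a
minimal illustration of how it looks in practice; the helper name safe_log is hypothetical
and not part of R.

```r
# safe_log is a hypothetical helper, used only for this illustration
safe_log <- function(x) {
  tryCatch(
    {
      if (!is.numeric(x)) stop("x must be numeric")
      log(x)
    },
    error = function(e) {
      # Runs only if an error was raised inside the block above
      message("Error caught: ", conditionMessage(e))
      NA
    }
  )
}

safe_log(10)     # returns log(10)
safe_log("ten")  # reports the error and returns NA
```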
Creating Variables in R Language
R supports three ways of variable assignment:
Using the equal operator (=): the value on the right is assigned to the variable on the
left.
Using the leftward operator (<-): data is copied from right to left. This is the form most
commonly used in R code.
Using the rightward operator (->): data is copied from left to right.
Syntax
Types of Variable Creation in R:
Using the equal operator
variable_name = value
Using the leftward operator
variable_name <- value
Using the rightward operator
value -> variable_name
1. To output text in R, wrap it in single or double quotes:
Example: "Hello World"
2. To output numbers, just type the number (without quotes):
Example: 5
3. To do simple calculations, type the expression, for instance adding two numbers:
Example: 5 + 5
Example of Creating Variables in R
# using equal to operator
var1 = "hello"
print(var1)
# using leftward operator
var2 <- "hello"
print(var2)
# using rightward operator
"hello" -> var3
print(var3)
R Print Output
Unlike many other programming languages, R can display a value without a print
function. Typing an expression at the console prints its result:
Example: "Hello World"
R does have a print() function available if you want to use it.
Example: print("Hello World")
When working with loops (or other statements written inside curly braces {}), values are
not printed automatically, so print() must be used to display them.
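The difference between automatic printing and print() inside a loop can be sketched as
follows. The loop variable i and the range 1:3 are arbitrary choices for illustration.

```r
# At the top level, a value typed on its own is printed automatically
"Hello World"

# Inside a for loop, a bare value is NOT printed
for (i in 1:3) {
  i
}

# Calling print() makes the loop display each value
for (i in 1:3) {
  print(i)
}
```

Only the second loop produces output, one line per iteration.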
What Are In-Built Functions in R?
In-built functions are predefined functions in R that perform common tasks, such as:
Mathematical operations
Statistical calculations
Data manipulation
Text processing
Plotting
1. Mathematical Functions
Function     Description                Example        Output
abs(x)       Absolute value             abs(-5)        5
sqrt(x)      Square root                sqrt(25)       5
round(x)     Round to nearest integer   round(3.6)     4
floor(x)     Round down                 floor(3.7)     3
ceiling(x)   Round up                   ceiling(3.1)   4
log(x)       Natural log                log(10)        2.302585
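A few additional calls show how these functions behave with negative numbers, decimal
places and other logarithm bases. The variable x and the input values are chosen only for
illustration.

```r
x <- -7.5

abs(x)       # 7.5
sqrt(16)     # 4
floor(x)     # -8 (rounds toward negative infinity)
ceiling(x)   # -7 (rounds toward positive infinity)

# round() accepts a second argument for the number of decimal places
round(3.14159, 2)   # 3.14

# Exact halves are rounded to the even digit
round(2.5)   # 2
round(3.5)   # 4

# log() is the natural logarithm; other bases are available
log(100, base = 10)  # 2
log10(100)           # 2
```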
2. Statistical Functions
Function    Description           Example            Output
mean(x)     Average of values     mean(c(1,2,3))     2
median(x)   Middle value          median(c(1,3,5))   3
sd(x)       Standard deviation    sd(c(1,2,3))       1
var(x)      Variance              var(c(1,2,3))      1
sum(x)      Sum of all elements   sum(c(1,2,3))      6
min(x)      Smallest value        min(c(3,5,2))      2
max(x)      Largest value         max(c(3,5,2))      5
3. Sequence & Repetition
Function   Description          Example             Output
seq()      Creates a sequence   seq(1, 5)           1 2 3 4 5
rep()      Repeats values       rep(3, times = 4)   3 3 3 3
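Both functions take further arguments that control the step size and the repetition
pattern. The sketch below shows the most common ones; the variable names are illustrative.

```r
# Sequence with a step of 2
odds <- seq(1, 10, by = 2)                # 1 3 5 7 9

# Sequence with a fixed number of evenly spaced values
quarters <- seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00

# The colon operator is shorthand for a step of 1
one_to_five <- 1:5                        # 1 2 3 4 5

# times repeats the whole vector; each repeats every element in turn
whole <- rep(c("a", "b"), times = 2)      # "a" "b" "a" "b"
paired <- rep(c("a", "b"), each = 2)      # "a" "a" "b" "b"
```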
4. Text (Character) Functions
Function     Description                        Example                  Output
nchar(x)     Number of characters in a string   nchar("Hello")           5
toupper(x)   Convert to uppercase               toupper("hello")         "HELLO"
tolower(x)   Convert to lowercase               tolower("HELLO")         "hello"
paste()      Join strings                       paste("R", "Language")   "R Language"
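The following sketch shows how the separator used by paste() can be changed, and that the
text functions work on whole vectors at once. The variable names and sample strings are
illustrative.

```r
first <- "R"
second <- "Language"

spaced <- paste(first, second)             # "R Language" (default separator is a space)
dashed <- paste(first, second, sep = "-")  # "R-Language"
joined <- paste0(first, second)            # "RLanguage" (no separator)

# These functions are vectorized: they process every element of a vector
words <- c("apple", "Banana", "CHERRY")
upper <- toupper(words)    # "APPLE" "BANANA" "CHERRY"
lengths <- nchar(words)    # 5 6 6
```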
5. Plotting Functions
Function    Description
plot()      Creates a basic plot (scatter, line)
hist()      Creates a histogram
boxplot()   Creates a boxplot
barplot()   Creates a bar chart
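A minimal sketch of the first three plotting functions is given below. The vectors scores
and hours contain made-up values, assumed only for illustration.

```r
# Illustrative data (assumed values, not from a real dataset)
scores <- c(55, 62, 68, 70, 71, 75, 78, 80, 85, 93)
hours <- c(1, 2, 2, 3, 3, 4, 4, 5, 6, 7)

# Scatter plot of two numeric vectors
plot(hours, scores, main = "Scores vs. Hours", xlab = "Hours", ylab = "Scores")

# Histogram: distribution of a single numeric vector
hist(scores, main = "Histogram of Scores", xlab = "Scores", col = "lightblue")

# Boxplot: median, quartiles and possible outliers
boxplot(scores, main = "Boxplot of Scores", ylab = "Scores")
```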
numbers <- c(10, 20, 30, 40, 50)
# Summary statistics
mean(numbers) # 30
median(numbers) # 30
sd(numbers) # 15.81
sum(numbers) # 150
max(numbers) # 50
min(numbers) # 10
values <- c(5, 7, 3, 8)
categories <- c("A", "B", "C", "D")  # placeholder labels, one per bar
barplot(values, names.arg = categories, col = "purple", main = "Bar Plot Example",
        ylab = "Values")
1. Simple Bar Graph
# Data
categories <- c("Apple", "Banana", "Cherry")
values <- c(10, 15, 7)
# Simple bar plot
barplot(values, names.arg = categories, col = "skyblue", main = "Simple Bar Graph",
        ylab = "Quantity")
2. Multiple Bar Graph (Grouped Bar Plot)
# Data
categories <- c("Q1", "Q2", "Q3", "Q4")
sales_A <- c(10, 12, 15, 20)
sales_B <- c(8, 14, 13, 18)
# Combine data into a matrix
sales <- rbind(sales_A, sales_B)
# Grouped bar plot
barplot(sales, beside = TRUE, col = c("lightgreen", "orange"),
        names.arg = categories, main = "Sales Comparison by Quarter",
        ylab = "Sales", legend.text = c("Product A", "Product B"))
3. Pie Chart
# Data
fruits <- c("Apple", "Banana", "Cherry", "Date")
quantities <- c(30, 25, 20, 15)
# Pie chart
pie(quantities, labels = fruits, col = rainbow(length(fruits)),
    main = "Fruit Distribution")
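Pie charts are often easier to read when each slice shows its share. As a sketch, the
labels can be built from the same data with round() and paste0(); the names percent and
pie_labels are introduced here for illustration.

```r
fruits <- c("Apple", "Banana", "Cherry", "Date")
quantities <- c(30, 25, 20, 15)

# Share of each fruit, rounded to whole percentages
percent <- round(quantities / sum(quantities) * 100)

# Build labels such as "Apple (33%)"
pie_labels <- paste0(fruits, " (", percent, "%)")

pie(quantities, labels = pie_labels, col = rainbow(length(fruits)),
    main = "Fruit Distribution")
```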