Unit-1 (R Programming)

R is a programming language designed for statistical computing and data analysis, developed in the early 1990s, and is open-source with extensive package support. It is widely used across various industries for data science, finance, healthcare, and academia due to its comprehensive statistical tools and visualization capabilities. R has a rich history tied to the S programming language and has evolved significantly, with modern tools like RStudio and the Tidyverse enhancing its usability.

Uploaded by

tannutannu3849

Unit-1

Introduction of R:-
R is a programming language designed for statistical computing, data analysis and visualization.
Developed in the early 1990s by Ross Ihaka and Robert Gentleman, it provides a flexible
environment for working primarily with structured (tabular) data. Handling unstructured data
typically requires additional packages.
 Specifically built for statistical analysis and data modeling
 Open-source and freely available to everyone
 Supported by thousands of packages via the Comprehensive R Archive Network
 Widely used for data analysis and decision-making across industries
Why Choose R Programming:-
R is a unique language that offers a wide range of features for data analysis, making it an
essential tool for professionals in various fields. Here’s why R is preferred:
 Free and Open-Source: R is open to everyone, meaning users can modify, share and
distribute their work freely.
 Designed for Data: R is built for data analysis, offering a comprehensive set of tools for
statistical computing and graphics.
 Large Package Repository: The Comprehensive R Archive Network (CRAN) offers
thousands of add-on packages for specialized tasks.
 Cross-Platform Compatibility: R can work on Windows, Mac and Linux operating
systems.
 Great for Visualization: With packages like ggplot2, R makes it easy to create informative,
publication-quality charts and plots; interactive graphics are available through add-on packages
such as plotly.

Key Features of R:-

 Cross-Platform Support: R works on multiple operating systems, making it versatile for
different environments.
 Interactive Development: R allows users to interactively experiment with data and see the
results immediately.
 Data Wrangling: Tools like dplyr and tidyr help simplify data cleaning and transformation.
 Statistical Modeling: R has built-in support for various statistical models like regression,
time-series analysis and clustering.
 Reproducible Research: With R Markdown, users can combine code, output and narrative
in one document, ensuring their analysis is reproducible.
Example Program in R:-
 We first create a vector data that contains numerical values.
 We use the mean() function to calculate the mean of the dataset.
 The sd() function calculates the standard deviation.

data <- c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50)
mean_data <- mean(data)

print(paste("Mean: ", mean_data))

std_dev <- sd(data)

print(paste("Standard Deviation: ", std_dev))

Applications of R:-
R is used in a variety of fields, including:
 Data Science and Machine Learning: R is widely used for data analysis, statistical
modeling and machine learning tasks.
 Finance: Financial analysts use R for quantitative modeling and risk analysis.
 Healthcare: In clinical research, R helps analyze medical data and test hypotheses.
 Academia: Researchers and statisticians use R for data analysis and publishing reproducible
research.
Advantages of R Programming:-
 Comprehensive Statistical Tools: R includes many statistical functions and models,
making it the ideal choice for data analysis.
 Customizable Visualizations: R’s visualization tools allow for customizations for a simple
bar chart or a detailed heatmap.
 Extensive Community Support: R has a large user base and there are countless resources,
forums and tutorials available.
 Highly Extendable: The availability of over 15,000 R packages means we can extend R's
functionality to suit any project or need.
Limitations of R Programming:-
 Can consume high memory with very large datasets
 Slower execution speed for large-scale computations
 Syntax may be challenging for beginners
 Error handling is less structured compared to some modern languages

History and Evolution of R:-

1. Roots in S (1970s)

The development of R is closely tied to the S programming language, created by John
Chambers and his colleagues at Bell Laboratories in 1976.
S laid the foundation for modern statistical computing and influenced R’s design and syntax.
2. Origins of R (Early 1990s)

R was developed in 1991 by Ross Ihaka and Robert Gentleman at the University of Auckland.
The name “R” reflects both:

 A play on the language S

 The initials of its creators

3. Public Release (1993–1995)

 1993: First announced publicly via StatLib.

 1995: With encouragement from Martin Mächler, R was released under the GNU
General Public License.
This made R free and open-source software, accelerating its adoption.

4. Infrastructure Development (1997)

 Formation of the R Core Team to maintain and develop R.

 Creation of the Comprehensive R Archive Network (CRAN), a central repository for R
packages and distributions.

5. Stable Release (2000)

 R version 1.0.0 was officially released on February 29, 2000, marking R as a mature
and stable platform for statistical computing.

6. Modern Era (2011–Present)

 2011: Launch of RStudio, a powerful integrated development environment that made R
more user-friendly.
 2016: Introduction of the Tidyverse, a set of packages that simplified data manipulation,
visualization, and analysis.

Today, R is widely used in:

 Data science
 Machine learning
 Academic research
 Business analytics

R and RStudio setup:-

1. Install R (first, always)

R is the core engine—you must install it before RStudio.

1. Go to the official site:
👉 [Link]
2. Choose your OS (Windows / macOS / Linux)
3. Download and install the latest version
4. Use default settings unless you have specific needs

💻 2. Install RStudio

RStudio is the interface that makes R easier to use.

1. Go to:
👉 [Link]
2. Download RStudio Desktop (free version)
3. Install it like any normal software

🚀 3. Verify Installation

1. Open RStudio
2. You should see:
o Console (bottom left)
o Script editor (top left)
3. Test with this command:

print("Hello World")

If it runs without error, you’re set 👍

📦 4. Install Useful Packages

Run this in the console:

install.packages("tidyverse")
install.packages("ggplot2")

Load them:

library(tidyverse)

⚙️5. Optional Setup Tips

 Set CRAN mirror (RStudio usually prompts you)

 Change theme:
Tools → Global Options → Appearance
 Enable autosave for scripts
🛠️Common Issues

 R not detected in RStudio → reinstall R first, then RStudio

 Permission errors → run installer as admin
 Package install fails → check internet or mirror

Basic syntax in R:-

x <- 10

name <- "John"
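
A few more pieces of basic syntax can be sketched as follows. This is a minimal illustration; the names `Name` and `total` are made up for the example and are not part of the original notes.

```r
# Comments start with a hash sign and run to the end of the line
x <- 10          # numeric value
name <- "John"   # character string

# R is case-sensitive: `Name` is a different variable from `name`
Name <- "Asha"

# The usual arithmetic precedence applies: multiplication before addition
total <- x * 2 + 5

# cat() prints values without the [1] index prefix that print() adds
cat("Total:", total, "\n")
```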

Running R Programs:-

1. Run Code in Console (Quick Way)

You can directly type commands in the Console:

print("Hello World")

Press Enter → it runs immediately.

2. Run a Script File (Recommended)

Step 1: Create a script

 In RStudio: File → New File → R Script

Step 2: Write code

x <- 5
y <- 10
print(x + y)
Step 3: Run code

You have 3 options:

 Click Run button (top right of script editor)

 Press Ctrl + Enter (runs selected line)
 Select multiple lines → Ctrl + Enter

3. Run Entire Script

To run the whole file:

 Click Source button
OR
 Press Ctrl + Shift + Enter

4. Run an External .R File

If you already have a file:

source("file_name.R")

Example:

source("my_script.R")

5. Set Working Directory

R needs to know where your file is:

getwd() # check current folder

setwd("C:/Users/YourName/Documents")

Or in RStudio:

 Session → Set Working Directory → Choose Directory
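
As an alternative to changing the working directory, a script can be run through its full path. The sketch below assumes a temporary folder (`tempdir()`) as a stand-in for a project folder; the object `answer` is hypothetical.

```r
# Write a small script into a temporary folder (stand-in for a project folder)
script_path <- file.path(tempdir(), "my_script.R")
writeLines("answer <- 6 * 7", script_path)

# Check that the file exists before running it
if (file.exists(script_path)) {
  source(script_path)   # runs the script; `answer` is now defined
}

print(answer)
```

Building paths with file.path() keeps the code portable across Windows, macOS and Linux.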

6. Example Full Program

# Simple program
numbers <- c(1, 2, 3, 4, 5)

result <- sum(numbers)

print(result)

Run it using Source or Ctrl + Enter

Data Types in R:-


Data types in R specify the type of information stored in a variable and determine how it
behaves during calculations and analysis. They define how data is represented in memory and
how functions interpret it. R is a dynamically typed language, so the data type is assigned
automatically when a value is created.
 Choosing the correct data type improves program performance and memory efficiency.
 Proper data types ensure accurate mathematical, logical and statistical computations.
 Selecting the right data type simplifies data processing and improves overall code clarity.
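
Dynamic typing can be seen by reassigning one name several times. This is a small sketch; the variable names are illustrative only.

```r
value <- 42L                 # integer
first_class <- class(value)
print(first_class)           # "integer"

value <- 42.5                # same name, now a double
second_class <- class(value)
print(second_class)          # "numeric"

value <- "forty-two"         # now text
third_class <- class(value)
print(third_class)           # "character"
```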

1. Numeric Data Type:-

In R, numbers with decimal points are called numeric. This is the default data type for numbers
and it is used to store real values for calculations. Numeric values are stored in double-precision
floating-point format, which holds about 15 to 16 significant decimal digits, so most decimal
fractions are stored as very close approximations rather than exact values.

# Numeric variable with decimal

x <- 5.6

print(class(x))

print(typeof(x))

# Integer-like number without decimal

y <- 5

print(class(y))

print(typeof(y))

# Check if y is an integer

print(is.integer(y))
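
Because doubles are binary floating-point values, exact equality tests on decimal fractions can give surprising results. A minimal sketch of the usual workaround with all.equal():

```r
a <- 0.1 + 0.2

print(a == 0.3)                    # FALSE: the two doubles differ in the last bits
print(isTRUE(all.equal(a, 0.3)))   # TRUE: equal within a small tolerance
print(format(a, digits = 17))      # shows the value actually stored
```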

2. Integer Data Type

The integer data type is used for storing numbers without decimal points. Integers can be created
using the as.integer() function or by adding the suffix L to a number. This explicitly tells R to
store the value as an integer rather than the default double type.
 Integer values in R are stored as 32-bit signed integers, with a usable range from -(2^31 - 1)
to 2^31 - 1; the remaining bit pattern is reserved for the integer NA.
 Integers are used when exact whole numbers are required, such as counts, indexes or
categorical numeric codes.
# Creating integer using as.integer()
x <- as.integer(5)
print(class(x))
print(typeof(x))
# Creating integer using L suffix
y <- 5L
print(class(y))
print(typeof(y))
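
The limit of the integer type can be checked directly. The sketch below shows what happens when a calculation exceeds it; the variable names are illustrative.

```r
largest <- .Machine$integer.max      # 2147483647, i.e. 2^31 - 1
print(largest)

# Adding 1L overflows: R returns NA and issues a warning
overflow <- suppressWarnings(largest + 1L)
print(overflow)                      # NA

# Converting to double first avoids the overflow
bigger <- as.numeric(largest) + 1
print(bigger)                        # 2147483648
```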

3. Logical Data Type

Logical data types in R represent Boolean values as TRUE or FALSE. Logical values are often
created using comparisons between variables or by directly assigning Boolean values. This data
type is used in decision-making, conditional statements and filtering data.
# Creating a logical value using comparison
x <- 4
y <- 3
z <- x > y
print(z)
print(class(z))
print(typeof(z))

# Creating a logical value using direct assignment

logi <- FALSE
print(class(logi))
print(typeof(logi))

4. Complex Data Type

Complex data types are used to store numbers with both real and imaginary components. The
imaginary part is denoted using the suffix i. Complex numbers are useful in scientific
computations, signal processing and mathematical modeling where imaginary numbers are
required.

# Creating a complex number

x <- 4 + 3i

print(class(x))

print(typeof(x))
5. Character Data Type in R

Character data types are used to store text, including alphabets, numbers and special symbols.
Character values, also called strings, are enclosed in single (') or double (") quotes. This data type
is commonly used for names, labels, messages and textual information in datasets.

# Creating a character variable

char <- "Geeksforgeeks"

print(class(char))

print(typeof(char))

6. Raw Data Type in R

The raw data type in R is used to store and manipulate data at the byte level. It represents
unprocessed binary values, making it useful for low-level operations such as working with files,
network data or binary protocols. Raw vectors consist of elements in the range 00 to FF
(hexadecimal notation).

# Creating a raw vector

x <- as.raw(c(0x1, 0x2, 0x3, 0x4, 0x5))

print(x)
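
Raw vectors are also how R exposes the bytes behind text. A short sketch using the base functions charToRaw() and rawToChar():

```r
bytes <- charToRaw("R")      # text to bytes
print(bytes)                 # 52 (hexadecimal)
print(as.integer(bytes))     # 82 (the same byte in decimal)

word <- rawToChar(as.raw(c(0x48, 0x69)))   # bytes back to text
print(word)                  # "Hi"
```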

Variables:-
A variable is a name that refers to a value or object in R. The name lets you store data, access it
later and update it, and the same name can hold values of different data types during the
program’s execution.
 Variables store values in memory that can be accessed or updated later.
 R variables are dynamically typed: their type is determined by the assigned value.
 The assignment operator <- is the standard way to assign values, though = can also be used.
How to Create Variables in R
In R a variable is created by assigning a value to a name. R supports three ways to assign values
to variables:
1. Using the Equal Operator (=)

While = can be used for assignment, <- is the preferred and widely used operator in R as it
clearly indicates assignment and avoids confusion with other uses.
# Assign a string using equal operator
var1 = "Hello Geeks"
print(var1)

Output

[1] "Hello Geeks"

2. Using the Leftward Operator (<-)

The leftward operator <- assigns a value from the right side to the variable on the left. It is
widely used in R because it makes the direction of assignment clear and helps distinguish
assignment from comparison.
# Assign a string using leftward operator
var2 <- "Ready to code"
print(var2)

Output

[1] "Ready to code"

3. Using the Rightward Operator (->)

The rightward operator -> assigns a value from the left side to the variable on the right. It works
the same way as <-, but the direction of assignment is reversed.
# Assign a string using rightward operator
"Byte-by-Byte" -> var3
print(var3)

Output

[1] "Byte-by-Byte"
Rules for naming variables:-

1. Start with a letter or a dot (.); a leading dot must not be followed by a digit

✔️x, name, .value

❌ 1name

2. Use only letters, numbers, _ or .

✔️marks_1, [Link]
❌ marks-1, total@score

3. Don’t use spaces

✔️student_name
❌ student name

4. R is case-sensitive

age ≠ Age ≠ AGE

5. Don’t use keywords or TRUE/FALSE

❌ if, for, TRUE, FALSE

Example:- age <- 20

student_name <- "Rahul"

total_marks <- 95
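
One way to check names against these rules is the base function make.names(), which rewrites an invalid name into a valid one. In this sketch a name counts as valid when make.names() leaves it unchanged; `candidates` and `is_valid` are illustrative names.

```r
candidates <- c("student_name", "1name", "marks-1", "student name", "if", ".value")

# make.names() returns a syntactically valid version of each name,
# so a name is already valid when it comes back unchanged
is_valid <- make.names(candidates) == candidates

print(data.frame(candidates, is_valid))
```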

Scope of Variables in R programming:-

Scope of a variable determines where it can be accessed or used in a program. Understanding
variable scope helps prevent errors and manage data effectively. There are mainly two types of
variable scopes:

1. Global Variables

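Global variables are defined outside any function and can be read from anywhere in the
program. A plain assignment (<-) inside a function creates a separate local variable with the same
name, so changing a global from inside a function requires the <<- operator. Global variables
exist for the entire session unless explicitly removed.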
 Remain in memory throughout the program which may increase memory usage.
 Can cause naming conflicts if multiple parts of the program use the same name.
 Defined outside functions and exist until the program ends or the variable is deleted.
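global <- 5

display <- function() {
  # This assignment creates a local variable that shadows the global one
  global <- 20
  print(global)
}

display()       # prints 20 (the local value)
print(global)   # prints 5 (the global is unchanged)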
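
To change a global variable from inside a function, R provides the <<- operator. The sketch below contrasts it with ordinary assignment; the function and variable names are hypothetical.

```r
counter <- 0

increment_local <- function() {
  counter <- counter + 1    # creates a local copy; the global is untouched
  counter
}

increment_global <- function() {
  counter <<- counter + 1   # <<- updates the existing variable outside the function
  counter
}

increment_local()
after_local <- counter
print(after_local)    # 0

increment_global()
print(counter)        # 1
```

Because <<- changes state outside the function, it is usually reserved for cases where returning a value is not practical.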

2. Local Variables

Local variables are created inside a function or a specific block of code and can only be used
within that block. They exist only while the function is running and are removed from memory
once the function finishes.
 Defined inside a function and accessible only within that function.
 Exist only during the function’s execution and are destroyed after the function ends.
 Helps avoid conflicts with variables in other parts of the program.
 Uses memory only when needed and is removed afterward.

my_function <- function() {
  local_var <- 10 # This is a local variable
  print(local_var)
}

my_function()     # prints 10
print(local_var)  # Error: object 'local_var' not found

Important Methods for R Variables

R provides several built-in functions to work with variables. Understanding these functions
makes managing variables easier, especially in large programs.

1. class() Function

This built-in function returns the class of the object passed to it, which for simple values
corresponds to its data type. The variable to be checked is passed as an argument.
Syntax:
class(variable)
Example:
var1 <- "HI Geeks 001"
print(class(var1))

Output

[1] "character"

2. ls() function

This built-in function lists the names of all variables currently in the workspace. This is
generally helpful when dealing with a large number of variables at once and helps prevent
overwriting any of them.
Syntax:
ls()
Example:
var1 <- "hello"
var2 <- 20
var3 <- TRUE

print(ls())

Output

[1] "var1" "var2" "var3"

3. rm() function

rm() is a built-in function used to delete an unwanted variable within your workspace. This
helps clear the memory space allocated to certain variables that are not in use thereby creating
more space for others. The name of the variable to be deleted is passed as an argument to it.
Syntax:
rm(variable)
Example:
var3 <- "hello"
# Remove var3
rm(var3)
print(var3)
Output:
Error: object 'var3' not found
4. exists() Function

The exists() function checks whether a variable exists in the workspace. It returns TRUE if the
variable exists, otherwise FALSE.
Syntax:
exists("variable_name")
Example:
var1 <- 10
print(exists("var1"))
print(exists("varX"))

Output

[1] TRUE
[1] FALSE

R Operators:-

Operators in R are symbols that perform operations on variables and values (operands). They
allow you to carry out mathematical calculations, logical comparisons, assignments and other
operations efficiently.

Arithmetic Operators
Arithmetic operators perform mathematical operations on numeric values or vectors. In R, these
operations are applied element-wise when working with vectors.

1. Addition (+)

The values at the corresponding positions of both operands are added.

a <- c (1, 0.1)
b <- c (2.33, 4)
print (a+b)

Output

[1] 3.33 4.10

2. Subtraction (-)

The second operand values are subtracted from the first.

a <- 6
b <- 8.4
print (a-b)

Output

[1] -2.4

4. Division (/)

The first operand is divided by the second operand with the use of the '/' operator.
a <- 10
b <- 5
print (a/b)

Output

[1] 2

5. Power (^)

The first operand is raised to the power of the second operand.

a <- 4
b <- 5
print(a^b)

Output

[1] 1024

6. Modulo (%%)

It returns the remainder after dividing the first operand by the second operand.
a<- c(2, 22)
b<-c(2,4)
print(a %% b)

Output

[1] 0 2
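
The arithmetic operators above can be tried together on one pair of vectors, along with multiplication (*) and integer division (%/%), which work element-wise in the same way. A minimal sketch:

```r
a <- c(7, 22)
b <- c(2, 4)

print(a * b)     # element-wise product: 14 88
print(a / b)     # division: 3.5 5.5
print(a %/% b)   # integer division: 3 5
print(a %% b)    # remainder: 1 2

# Integer division and remainder together rebuild the original values
rebuilt <- (a %/% b) * b + a %% b
print(rebuilt)   # 7 22
```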

Logical Operators
Logical operators in R combine or negate conditions and evaluate to TRUE or FALSE. The
operators &, | and ! work element-wise on vectors. When numeric or complex operands are used,
zero is treated as FALSE and any non-zero value as TRUE.

1. Element-wise AND (&)

Returns True if both the operands are True.

a <- c(TRUE, 0.1)
b <- c(0,4+3i)
print(a & b)

Output

[1] FALSE TRUE

2. Element-wise OR (|)

Returns True if either of the operands is True.

a <- c(TRUE, 0.1)
b <- c(0,4+3i)
print(a|b)

Output

[1] TRUE TRUE

3. NOT (!)

A unary operator that negates the status of the elements of the operand.
a <- c(0,FALSE)
print(!a)
Output

[1] TRUE TRUE

4. Short-circuit AND (&&)

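Returns TRUE if both operands are TRUE. Each operand must be a single value (since R 4.3.0 a
longer vector is an error), and the second operand is evaluated only if the first is TRUE.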
a <- c(TRUE, 0.1)
b <- c(0,4+3i)
print(a[1] && b[1])

Output

[1] FALSE

5. Short-circuit OR (||)

Returns TRUE if either operand is TRUE. Each operand must be a single value, and the second
operand is evaluated only if the first is FALSE.

a <- c(TRUE, 0.1)
b <- c(0,4+3i)
print(a[1]||b[1])

Output

[1] TRUE
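
The practical difference between the short-circuit and element-wise forms can be sketched as follows. The variable names are illustrative.

```r
# The right-hand side is never evaluated when the left-hand side is FALSE
guarded <- FALSE && stop("this is never reached")
print(guarded)    # FALSE

# Typical use: check that a value exists before testing it
x <- NULL
safe <- length(x) > 0 && x[1] > 5
print(safe)       # FALSE

# The element-wise & evaluates both sides and returns one result per element
v <- c(3, 8, 12)
in_range <- v > 5 & v < 10
print(in_range)   # FALSE TRUE FALSE
```

For this reason && and || are the usual choice inside if conditions, while & and | are used for filtering vectors.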

Relational Operators
The relational operators in R compare the corresponding elements of the operands and return
TRUE where the first operand satisfies the relation with respect to the second, otherwise FALSE.
When logical values are compared with numbers, TRUE is treated as 1 and FALSE as 0. If either
operand is a character vector, both sides are compared as text, as in the examples below.

1. Less than (<)

Returns TRUE if the corresponding element of the first operand is less than that of the second
operand. Else returns FALSE.
a <- c(TRUE, 0.1,"apple")
b <- c(0,0.1,"bat")
print(a<b)

Output

[1] FALSE FALSE TRUE

2. Less than or equal to (<=)

Returns TRUE if the corresponding element of the first operand is less than or equal to that of
the second operand. Else returns FALSE.
a <- c(TRUE, 0.1, "apple")
b <- c(TRUE, 0.1, "bat")

c <- as.character(a)
d <- as.character(b)

print(c <= d)

Output

[1] TRUE TRUE TRUE

3. Greater than (>)

Returns TRUE if the corresponding element of the first operand is greater than that of the second
operand. Else returns FALSE.
a <- c(TRUE, 0.1, "apple")
b <- c(TRUE, 0.1, "bat")
print(a > b)

Output

[1] FALSE FALSE FALSE

4. Greater than or equal to (>=)

Returns TRUE if the corresponding element of the first operand is greater or equal to that of the
second operand. Else returns FALSE.
a <- c(TRUE, 0.1, "apple")
b <- c(TRUE, 0.1, "bat")
print(a >= b)

Output

[1] TRUE TRUE FALSE

5. Not equal to (!=)

Returns TRUE if the corresponding element of the first operand is not equal to the second
operand. Else returns FALSE.
When different data types are combined in a vector, R performs type coercion by converting all
elements to a common type (usually the most flexible type, such as character).
a <- c(TRUE, 0.1,'apple')
b <- c(0,0.1,"bat")
print(a!=b)

Output

[1] TRUE FALSE TRUE
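
The coercion behind these results can be inspected directly. The sketch below shows that a mixed vector becomes character, and that text holding numbers should be converted before it is compared; `mixed` and `values` are illustrative names.

```r
mixed <- c(TRUE, 0.1, "apple")
print(class(mixed))   # "character"
print(mixed)          # "TRUE" "0.1" "apple"

# Text that holds numbers is compared as text unless it is converted first
values <- c("2", "10")
as_numbers <- as.numeric(values)
print(as_numbers[1] < as_numbers[2])   # TRUE: 2 is less than 10
```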

Assignment Operators
Assignment operators in R are used to assign values to various data objects. The objects
may be integers, vectors or functions. The values are then stored under the assigned variable
names.

1. Left Assignment (<- , <<- , =)

Assigns a value to a vector.

vec1 = c("ab", TRUE)
print (vec1)

Output

[1] "ab" "TRUE"

2. Right Assignment (-> , ->>)

Assigns value to a vector.

c("ab", TRUE) ->> vec1
print (vec1)

Output

[1] "ab" "TRUE"

Miscellaneous Operators
Miscellaneous operators in R are special-purpose operators used for tasks such as membership
checking (%in%) and matrix multiplication (%*%).

1. %in% Operator

Checks whether an element belongs to a vector or list and returns TRUE if the value is
present, otherwise FALSE.
val <- 0.1
a <- c(TRUE, 0.1,"apple")
print (val %in% a)

Output

[1] TRUE

2. %*% Operator (Matrix Multiplication)

The %*% operator performs matrix multiplication.

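 The number of columns of the first matrix must equal the number of rows of the second.
 If A is (r × c) and B is (c × k), the result is (r × k); here (2 × 3) times (3 × 2) gives (2 × 2).
mat <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
print(mat)              # 2 x 3 matrix, filled column by column
print(t(mat))           # transpose: 3 x 2
pro <- mat %*% t(mat)
print(pro)              # 2 x 2 result: rows (35, 44) and (44, 56)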
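
The dimension rule also covers results that are not square. A minimal sketch with two made-up matrices `A` and `B`:

```r
A <- matrix(1:6, nrow = 2, ncol = 3)     # 2 x 3
B <- matrix(1:12, nrow = 3, ncol = 4)    # 3 x 4

product <- A %*% B
print(dim(product))                      # 2 4

# Element [1, 1] is the dot product of row 1 of A and column 1 of B
dot <- sum(A[1, ] * B[, 1])
print(product[1, 1] == dot)              # TRUE

# A * A is element-wise multiplication, not matrix multiplication
print(A * A)
```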

Subsetting in R Programming:-

In the R programming language, subsetting allows the user to access elements of an object by
extracting a portion of it based on a position, name or condition. Which method to use depends
on the type of object and on what the user needs. For example, if a data frame has columns such
as states, country and population and the user wants only the states, subsetting is used to
extract that column. The sections below describe four methods of subsetting in R.

Subsetting is the process of extracting or selecting a portion of data from a larger data structure
such as a vector, matrix, list, or data frame. It allows the user to access specific elements, rows,
columns, or components based on position, logical conditions, or names.

🔹 2. Importance of Subsetting

Subsetting is a fundamental concept in R because:

 It helps in data analysis and manipulation

 It allows filtering of data
 It is used in data cleaning and transformation
 It supports efficient handling of large datasets

🔹 3. Data Structures Used in Subsetting

Subsetting can be applied to:

 Vectors (1-dimensional)
 Matrices (2-dimensional)
 Arrays (multi-dimensional)
 Lists (heterogeneous elements)
 Data Frames (tabular data)

🔹 4. Operators Used in Subsetting

📌 (1) Square Brackets [ ]

 Used for general subsetting

 Returns the same type of object

📌 (2) Double Brackets [[ ]]

 Used to extract a single element

 Mainly used for lists

📌 (3) Dollar $

 Used to access elements by name

 Works with lists and data frames
🔹 5. Methods of Subsetting

🔸 (A) Subsetting by Position (Indexing)

📘 Theory

In this method, elements are selected using their index (position).

 Index in R starts from 1

 Can select single or multiple elements

📌 Types:

1. Positive indexing → selects elements

2. Negative indexing → excludes elements

📌 Example
x <- c(10, 20, 30, 40)

x[2] # selects 20
x[c(1,3)] # selects 10 and 30
x[-2] # removes 20

🔸 (B) Subsetting by Logical Conditions

📘 Theory

In this method, elements are selected based on TRUE or FALSE conditions.

 Logical expressions return TRUE/FALSE

 Only TRUE values are selected

📌 Example
x <- c(5, 10, 15, 20)

x > 10 # FALSE FALSE TRUE TRUE

x[x > 10] # returns 15 and 20

👉 This method is widely used in filtering datasets.

🔸 (C) Subsetting by Names

📘 Theory

Elements can be accessed using their assigned names instead of position.

📌 Example
x <- c(a=10, b=20, c=30)

x["b"] # returns 20

👉 Useful when working with labeled data.

🔸 (D) Subsetting Using Functions

📌 1. subset() Function

📘 Theory

Used to extract subsets of data frames based on conditions.

subset(data, condition, select)

📌 Example
df <- data.frame(x=1:5, y=6:10)

subset(df, x > 2)
subset(df, x > 2, select = y)

📌 2. which() Function

📘 Theory

Returns the index positions of TRUE values.

which(x > 10)

📌 3. %in% Operator

📘 Theory

Used to match elements with a given set.

x[x %in% c(10, 30)]

📌 4. is.na() Function

📘 Theory

Used to handle missing values.

x[!is.na(x)]
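
The helper functions above can be tried on one small vector. This sketch assumes a made-up vector `x` that contains a missing value, which also shows how which() and plain logical subsetting differ.

```r
x <- c(12, NA, 10, 30, 7)

positions <- which(x > 10)
print(positions)              # 1 4 (the NA is skipped)

print(x[x %in% c(10, 30)])    # 10 30
print(x[!is.na(x)])           # 12 10 30 7

# Logical subsetting keeps an NA where the condition is NA; which() drops it
print(x[x > 10])              # 12 NA 30
print(x[which(x > 10)])       # 12 30
```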

🔹 6. Subsetting Different Data Structures

🔸 (1) Vector

 Uses single index

x[1]
x[x > 5]

🔸 (2) Matrix

📘 Theory

Matrices use row and column indexing

m[row, column]

📌 Example
m[1,2] # element
m[ ,2] # column
m[2, ] # row
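
A runnable version of the matrix example, assuming a small 2 × 3 matrix `m`. It also shows the effect of drop = FALSE.

```r
m <- matrix(1:6, nrow = 2)     # 2 rows, 3 columns, filled column by column

print(m[1, 2])                 # 3
print(m[, 2])                  # 3 4 (returned as a plain vector)
print(m[2, ])                  # 2 4 6

# drop = FALSE keeps the result as a matrix instead of simplifying it
col2 <- m[, 2, drop = FALSE]
print(dim(col2))               # 2 1
```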

🔸 (3) List

📘 Theory

Lists require special operators:

 [ → returns sublist
 [[ → returns element
 $ → by name

lst[[1]]
lst$name
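
The difference between [ and [[ on a list is easiest to see by checking what each returns. A minimal sketch with a made-up list `lst`:

```r
lst <- list(name = "Rahul", marks = c(90, 85))

sub <- lst["marks"]        # single bracket: a list of length 1
vals <- lst[["marks"]]     # double bracket: the numeric vector itself

print(class(sub))          # "list"
print(class(vals))         # "numeric"
print(lst$name)            # "Rahul"
print(mean(vals))          # 87.5
```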
🔸 (4) Data Frame

📘 Theory

Data frame is a table (rows + columns)

df[row, column]

📌 Example
df[1,2]
df[df$x > 2, ]
df$y

🔹 7. Important Rules

 Indexing starts from 1

 Negative and positive indexing cannot be mixed
 A logical index should have the same length as the object; a shorter one is recycled
 drop = FALSE preserves structure in matrices
 [[ ]] extracts value, [ ] keeps structure

🔹 8. Advantages of Subsetting

 Efficient data handling

 Flexible data selection
 Essential for data analysis
 Supports conditional filtering

Vectorized Operations in R

🔹 1. Definition

Vectorized operations refer to performing operations on an entire vector or array at once, rather
than processing each element individually using loops.

In R, most operators and functions are vectorized by default, meaning they automatically apply
to all elements of a vector.

🔹 2. Key Characteristics
📌 (1) High Performance

Vectorized operations are much faster than loops because:

 Computation is handled internally in optimized C/Fortran code

 Avoids repeated interpretation by R

📌 (2) Conciseness

 Eliminates long for loops

 Makes code shorter and easier to read

👉 Example:

x + y # instead of loop

📌 (3) Recycling Principle

When vectors have different lengths:

 The shorter vector is repeated (recycled) to match the longer one

x <- c(1,2,3,4)
y <- c(10,20)

x + y # 11 22 13 24

⚠️If lengths are not multiples, R gives a warning.
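
The warning case can be reproduced as follows. The sketch catches the warning so that its message can be printed next to the result; the handler code is one possible way to do this, not the only one.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20)

# 5 is not a multiple of 2, so R recycles y and raises a warning
result <- withCallingHandlers(
  x + y,
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(result)   # 11 22 13 24 15
```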

🔹 3. Types of Vectorized Operations

🔸 (A) Arithmetic Operations

Operators work element-wise:

x <- c(1, 2, 3)
y <- c(10, 20, 30)

x + y # 11 22 33
x * y # 10 40 90
🔸 (B) Logical Comparisons

Return a logical vector (TRUE/FALSE):

ages <- c(15, 25, 30, 10)

ages > 18
# FALSE TRUE TRUE FALSE

🔸 (C) Mathematical Functions

Functions operate on each element:

x <- c(1, 4, 9)

sqrt(x) # 1 2 3
log(x)
exp(x)

🔸 (D) Conditional Operations (ifelse())

Vectorized alternative to if-else:

marks <- c(40, 60, 80)

result <- ifelse(marks >= 50, "Pass", "Fail")

👉 Output: "Fail" "Pass" "Pass"

🔹 4. Scalar and Vector Operations

Scalars are automatically applied to all elements:

radii <- c(1, 2, 3)

2 * pi * radii

👉 Here, 2 and pi are recycled across all elements.

🔹 5. Comparison with Loops

❌ Using Loop
result <- numeric(3)
for (i in 1:3) {
  result[i] <- x[i] + y[i]
}
✅ Vectorized Approach
x+y

👉 Vectorized code is:

 Faster
 Simpler
 Less error-prone

🔹 6. Advantages

 Efficient computation
 Cleaner and shorter code
 Better readability
 Core feature of R programming

🔹 7. Important Points

 Operations are element-wise

 Works on vectors, matrices, and data frames
 Recycling rule applies
 Avoids explicit loops

🔹 8. Conclusion

Vectorized operations are a fundamental feature of R that enable efficient and concise data
processing. By operating on entire data structures at once, they improve performance and
simplify coding, making them essential for statistical computing and data analysis.

NA and NULL Values in R

🔹 1. Introduction
In data analysis, datasets often contain incomplete or missing information. R provides special
objects to represent such situations:

 NA (Not Available) → represents missing or undefined data

 NULL → represents absence of any value or object

These two are fundamentally different and are used in different contexts during data handling
and computation.

🔷 2. NA (Not Available)

🔹 2.1 Definition

NA (Not Available) is a special value used to denote missing or unknown data in R.

👉 It indicates that:

A value exists in the dataset, but it is not currently available.

🔹 2.2 Nature of NA

 NA is a logical constant, but it can be coerced into other types

 It occupies one position in a vector
 It propagates through operations (i.e., results remain NA)

🔹 2.3 Types of NA

R provides typed missing values to maintain consistency:

 NA_integer_ → integer missing value

 NA_real_ → numeric missing value
 NA_character_ → character missing value
 NA_complex_ → complex missing value

👉 This ensures type safety in computations.

🔹 2.4 Properties of NA

1. Length: NA has length 1

2. Propagation: Most operations involving NA return NA

3. Comparison:

NA == NA # returns NA, not TRUE

👉 Because the value is unknown
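
Propagation has a few exceptions: when the result does not depend on the missing value, R returns a definite answer. A short sketch:

```r
print(NA > 1)        # NA: the answer depends on the unknown value
print(NA == NA)      # NA
print(NA & FALSE)    # FALSE: the result is FALSE whatever the missing value is
print(NA | TRUE)     # TRUE: the result is TRUE whatever the missing value is
print(is.na(NA))     # TRUE: is.na() is the reliable test for missing values
```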

🔹 2.5 Operations with NA

📌 Arithmetic
x <- c(10, 20, NA)
x+5

👉 Result: 15 25 NA

📌 Logical
x > 15

👉 Result: FALSE TRUE NA

🔹 2.6 Detection of NA

is.na(x)

 Returns TRUE where values are NA

 Essential for identifying missing data

🔹 2.7 Handling NA Values

📌 Removing NA
x[!is.na(x)]

📌 Ignoring NA in functions
mean(x, na.rm = TRUE)
sum(x, na.rm = TRUE)

👉 na.rm = TRUE removes NA before computation.
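
Besides removing missing values, they can be counted or replaced. The sketch below uses mean replacement purely as an illustration, not as a recommended method; `marks` and `filled` are made-up names.

```r
marks <- c(40, NA, 80, 60, NA)

print(anyNA(marks))            # TRUE: at least one value is missing
print(sum(is.na(marks)))       # 2 missing values

# Replace missing values with the mean of the observed ones
filled <- marks
filled[is.na(filled)] <- mean(marks, na.rm = TRUE)
print(filled)                  # 40 60 80 60 60
```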

🔹 2.8 Importance of NA

 Essential for real-world datasets

 Helps maintain data integrity
 Used in statistical modeling and analysis

🔷 3. NULL

🔹 3.1 Definition
NULL represents the absence of any object or value in R.

👉 It means:

No data exists at all.

🔹 3.2 Nature of NULL

 NULL is a special object, not a value

 It has length 0
 Represents an empty structure

🔹 3.3 Properties of NULL

1. Length:

length(NULL) # 0

2. Has its own type ("NULL") and holds no value
3. Not stored in atomic vectors

🔹 3.4 Behavior of NULL

📌 In vectors
c(1, 2, NULL, 3)

👉 Output: 1 2 3
👉 NULL is ignored

📌 In lists
lst <- list(a=1, b=2)
lst$b <- NULL

👉 Removes element b

🔹 3.5 Detection of NULL

is.null(x)

 Returns TRUE if object is NULL

🔹 3.6 Uses of NULL

 Initialize empty objects
 Remove elements from lists
 Represent absence of output in functions

Differences Between NA and NULL

Feature          NA (Not Available)               NULL
Meaning          Missing value                    No value / no object
Existence        Value exists but unknown         Value does not exist
Length           1                                0
Data Structures  Vectors, matrices, data frames   Lists, objects
Behavior         Propagates in operations         Ignored in vectors
Detection        is.na()                          is.null()
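
The rows of this comparison can be checked side by side. A minimal sketch with illustrative variable names:

```r
with_na <- c(1, NA, 3)
with_null <- c(1, NULL, 3)

print(length(with_na))        # 3: NA occupies a position
print(length(with_null))      # 2: NULL is dropped

print(is.na(with_na))         # FALSE TRUE FALSE
print(is.null(NULL))          # TRUE
print(length(is.na(NULL)))    # 0: is.na(NULL) is an empty logical vector

# Assigning NULL removes a list element
lst <- list(a = 1, b = 2)
lst$b <- NULL
print(names(lst))             # "a"
```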

Coding Standards in R

🔹 1. Introduction
Coding standards are a set of rules and guidelines used to write clean, readable, and
maintainable code.
In R, following coding standards ensures that code is:

 Easy to understand
 Easy to debug
 Easy to maintain and reuse

🔹 2. Importance of Coding Standards

Coding standards are important because they:

 Improve code readability

 Enhance consistency
 Reduce errors and bugs
 Help in team collaboration
 Make code easier to maintain and update

🔷 3. General Coding Guidelines in R

🔸 (1) Naming Conventions

📘 Theory

Use meaningful and descriptive names for variables and functions.

📌 Rules

 Use lowercase letters

 Separate words using _ (snake_case)
 Avoid spaces and special characters

📌 Example
student_marks <- 85
total_sum <- sum(x)

🔸 (2) Assignment Operator

📘 Theory

Use <- for assignment instead of = (recommended in R style guides).

📌 Example
x <- 10

🔸 (3) Spacing
📘 Theory

Proper spacing improves readability.

📌 Rules

 Add a space on both sides of operators (+, -, <-, =)

 No unnecessary spaces inside parentheses

📌 Example
x <- a + b

🔸 (4) Indentation
📘 Theory

Indent code blocks properly for clarity.

📌 Example
if (x > 10) {
  print("Greater")
}

🔸 (5) Line Length

📘 Theory

Keep lines short (usually ≤ 80 characters).

👉 Long lines should be broken into multiple lines.
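
One possible way to break a long call is to put each argument on its own line. The data frame students below is hypothetical and used only to illustrate the layout:

```r
# Hypothetical data frame for illustration
students <- data.frame(
  name = c("Asha", "Ravi", "Meena"),
  marks = c(85, 42, 67)
)

# A long call split across lines, one argument per line
passed <- subset(
  students,
  marks > 50,
  select = c(name, marks)
)

passed
```

Each line stays well under 80 characters, and the arguments are easy to scan.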

🔸 (6) Comments
📘 Theory

Use comments to explain code logic.

📌 Example
# Calculate average marks
mean_marks <- mean(x)

👉 Comments should be:

 Clear
 Short
 Meaningful

🔸 (7) Function Writing Style

📘 Theory

Functions should be well-structured and readable.

📌 Example
calculate_mean <- function(x) {
  mean(x, na.rm = TRUE)
}
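
As a usage sketch, the same function can be written with a descriptive comment and called on a hypothetical vector that contains a missing value:

```r
# Compute the mean of a numeric vector, ignoring missing values
calculate_mean <- function(x) {
  mean(x, na.rm = TRUE)
}

calculate_mean(c(10, NA, 20))  # 15
```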

🔸 (8) Avoid Hard Coding

📘 Theory

Avoid writing fixed values directly in code; store them in named variables instead so they can be changed in one place.

📌 Example
threshold <- 50
if (marks > threshold) {
  print("Pass")
}

🔸 (9) Consistent Style

📘 Theory

Maintain the same style throughout the program.

 Same naming pattern

 Same indentation
 Same spacing

🔸 (10) Use of Built-in Functions

📘 Theory

Prefer built-in vectorized functions instead of loops for efficiency.
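
A minimal sketch of the idea, comparing a loop with a vectorized call. The task (a sum of squares) and the variable names are chosen only for illustration:

```r
numbers <- 1:10

# Loop version: adds the squares one element at a time
total_loop <- 0
for (n in numbers) {
  total_loop <- total_loop + n^2
}

# Vectorized version: square the whole vector, then sum it
total_vec <- sum(numbers^2)

total_loop == total_vec  # TRUE, both equal 385
```

The vectorized form is shorter and usually faster on large vectors, because the work is done inside built-in functions rather than in R-level loop iterations.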

🔷 4. Popular R Style Guides

Some widely followed standards:

 Google R Style Guide

 Tidyverse Style Guide

👉 These guides define best practices for writing clean R code.

🔷 5. Example of Good vs Bad Code

❌ Bad Code
x=1:10
y=mean(x)
print(y)

✅ Good Code
numbers <- 1:10
average <- mean(numbers)

print(average)

🔷 6. Advantages of Following Coding Standards

 Improves readability
 Makes debugging easier
 Enhances collaboration
 Produces professional-quality code
