R Software Basics: RStudio Guide

The document provides an introductory lecture on R and RStudio, covering key features, installation, and usage of the software for statistical computing and graphics. It includes instructions on creating scripts, running code, installing packages, importing data, and basic plotting techniques. Additionally, it outlines data types in R and offers applied examples from various fields, concluding with homework tasks for practical application.

Midlands State University

Paul Kundai Ziwakaya

R Software Lecture 1 Notes - HSTA405 Level 1


By: Paul Kundai Ziwakaya
Midlands State University

April 25, 2025


1 Introduction to R and RStudio


R is a programming language and environment designed for statistical computing and graphics.
RStudio is an integrated development environment (IDE) for R.
Key Features of R:

• Open-source and widely used for statistical analysis.

• Extensive packages available for diverse analyses.

• Strong visualization capabilities.

Key Features of RStudio:

• User-friendly interface with script editor, console, and environment viewer.

• Support for version control and project management.

• Integrated help system and package manager.

2 Getting Started
2.1 Installation
• Download R from [Link]

• Download RStudio from [Link]

3 How to Use RStudio


Now that we have successfully installed RStudio, let's open it, explore its main parts, and
try out various operations in it.


3.1 RStudio Interface


Opening RStudio automatically starts an R session. The interface is divided into four panes,
described below.

3.1.1 Components of the RStudio Interface


• Source Pane: This is where you can write scripts and save your work.

• Console Pane: This pane displays output from R commands and allows you to run
commands interactively.

• Environment/History Pane: Here, you can view your existing variables and the
commands you have executed.

• Files/Plots/Packages/Help Pane: This area allows you to manage files, view plots,
install packages, and access help documentation.

3.2 Creating a New Script


To create a new script, click File – New File – R Script. Scripts allow us to save our code
for later use or sharing. Remember to name your script meaningfully and save it regularly
(Ctrl + S on Windows/Linux, Cmd + S on Mac).


3.3 Running Code from a Script


To run a line of code, place the cursor on it and click the Run icon or press Ctrl + Enter
on Windows/Linux (Cmd + Enter on Mac). To run multiple lines, select them first. To run
all lines, select everything (Ctrl + A or Cmd + A) and then click the Run icon or press
Ctrl + Enter (Cmd + Enter on Mac).

3.4 Commenting Code


In your scripts, use # for comments to explain your code. At the script’s start, include
context such as the author, date created, last updated, and scope. Additionally, load any
required R packages early in the script.
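As a rough illustration of this advice, a script might open along the following lines. The header fields are placeholders to fill in, and the stats package (which ships with R) merely stands in for whatever packages a real script needs; none of this is prescribed by the notes.

```r
# ------------------------------------------------------------
# Author:       <your name>            (placeholder)
# Created:      <date created>         (placeholder)
# Last updated: <date last updated>    (placeholder)
# Scope:        <what this script does> (placeholder)
# ------------------------------------------------------------

# Load required packages early in the script.
# stats is used here only as a stand-in because it ships with R.
library(stats)

# Mean CO2 uptake from the built-in CO2 dataset
uptake_mean <- mean(CO2$uptake)
print(uptake_mean)
```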

3.5 Installing R Packages


To install a package in R, use the syntax:

install.packages("package_name")

Replace package_name with the name of the package you wish to install. This function
downloads packages from CRAN or other sources. After installation, load the package with
the library() function in each session that uses it.
Example:

install.packages("tidyverse")

This installs the tidyverse package, which includes tools for data manipulation and
visualization.
In RStudio: Install packages from the console rather than from a script, as they only need
to be installed once. You can also install packages through the RStudio interface by
opening the Packages tab, clicking "Install", and entering the desired CRAN package names,
separated by spaces or commas.
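One common way to avoid reinstalling a package every time is to install it only when it is missing. The sketch below assumes a hypothetical helper named ensure_package (not part of R or of these notes) and demonstrates it on the stats package, which ships with R, so nothing is downloaded when it runs.

```r
# Hypothetical helper: install a package only if it is not yet installed.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured repository (e.g. CRAN)
  }
  # Report (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# stats ships with R, so this call installs nothing
available <- ensure_package("stats")

# Installing is done once; loading is done in every session
library(stats)
```

To use it for the example above, call ensure_package("tidyverse") followed by library(tidyverse).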

3.6 Checking Loaded R Packages


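To list all loaded (attached) packages, run:

(.packages())

The outer parentheses are needed because .packages() returns its result invisibly.
Alternatively, run:

search()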

In RStudio: Open the Packages tab, search for a specific package, and check if the box
to the left of its name is ticked.
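The same check can be done in code. The sketch below uses the stats package only because it ships with R; the variable name attached is an illustrative choice.

```r
# Attach a package (stats is normally attached already at startup)
library(stats)

# Names of the currently attached packages
attached <- (.packages())
print(attached)

# Check for one specific package
"stats" %in% attached

# search() lists the same packages with a "package:" prefix,
# alongside other entries such as ".GlobalEnv"
"package:stats" %in% search()
```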


3.7 Getting Help on R Packages or Built-in Objects


To get help on an installed package, a function, or any built-in R object, use:

help(package_or_function_name) or help("package_or_function_name")

The help() function displays documentation and information about a package or function
in R. By passing the package or function name as an argument, it provides relevant details.
Example: To learn more about the ggplot2 package, use the following code:

help(ggplot2)
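By default help() searches only the attached packages, so a call such as help(ggplot2) generally needs library(ggplot2) to have been run first. The sketch below uses the stats package, which ships with R, to show a topic lookup and a package index; the variable names are illustrative, and the pages are only displayed in an interactive session.

```r
# Look up the help page for a function (equivalent to ?median)
topic <- help("median")

# Request the index of a whole package: its description and list of help topics
pkg_index <- help(package = "stats")

# In an interactive session (for example RStudio), printing these objects
# opens the corresponding pages in the Help tab
if (interactive()) {
  print(topic)
  print(pkg_index)
}
```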

4 Importing data
To read a CSV file into R and store it as a data frame, use the read.csv() function. For
instance, to read a file named world_population.csv, you can assign it to a variable as
follows:

world_population <- read.csv("world_population.csv")

To run the above piece of code, download the publicly available World Population Dataset
from Kaggle and unzip it into the same folder where you store your R script. You can access
the dataset using the following link:
[Link]
In RStudio, you can import the dataset using one of the following methods:
1. Navigate to File – Import Dataset.
2. Alternatively, click Import Dataset on the Environment tab.
Then, select From Text (base)..., navigate to the right folder, and select the file to
import. Fill in or check the fields Name, Heading, Separator, and Decimal in the pop-up
window. Preview the dataset structure, and click Import.
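The Heading, Separator, and Decimal fields of the import dialog correspond to the header, sep, and dec arguments of read.csv(). The sketch below shows them on a tiny made-up file written to a temporary location; the file contents and column names are toy values for illustration, not the Kaggle dataset.

```r
# Write a small toy CSV file to a temporary location (made-up values)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,value", "a,1.5", "b,2.5", "c,4.0"), csv_path)

# Read it back, spelling out the arguments that mirror the dialog fields
toy <- read.csv(csv_path,
                header = TRUE,  # first line holds the column names
                sep = ",",      # field separator
                dec = ".")      # decimal mark

str(toy)   # structure: one text column, one numeric column
head(toy)  # first rows

# Remove the temporary file
unlink(csv_path)
```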

4.1 Accessing built-in R datasets


To see the full list of available sample datasets preloaded in R, including their names and
short descriptions, run the following piece of code in the console:

data()

The data() function is built into R. Called without arguments, it lists the datasets
available in the currently loaded packages, which by default include the datasets package
that ships with R. This is useful for exploring and experimenting with data without having
to download or import files from external sources.
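The listing can also be captured as an object and searched in code. The sketch below restricts it to the datasets package; the variable names listing and catalog are illustrative.

```r
# Capture the dataset listing for the datasets package
listing <- data(package = "datasets")

# The results component is a character matrix; keep the name and description
catalog <- listing$results[, c("Item", "Title")]

head(catalog)   # first few datasets with their short descriptions
nrow(catalog)   # number of listed datasets

# Check whether a particular dataset is listed
"CO2" %in% catalog[, "Item"]
```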
As in any other R IDE, RStudio lets us access, manipulate, transform, analyze, and
model data in R. Below are some examples of standard operations performed on the
built-in CO2 dataset:


head(CO2)
tail(CO2)
colnames(CO2)
dim(CO2)
str(CO2)
summary(CO2)
summary(CO2$uptake)
median(CO2$uptake)
class(CO2$uptake)
unique(CO2$Treatment)
subset(CO2, conc == min(CO2$conc))

• head(CO2) displays the first few rows of the dataset.

• tail(CO2) displays the last few rows of the dataset.

• colnames(CO2) displays the column names of the dataset.

• dim(CO2) displays the dimensions of the dataset (number of rows and columns).

• str(CO2) displays the structure of the dataset, including the data type of each column.

• summary(CO2) provides summary statistics for each column in the dataset.

• summary(CO2$uptake) provides summary statistics for the "uptake" column in the
dataset.

• median(CO2$uptake) calculates the median value of the "uptake" column.

• class(CO2$uptake) displays the class of the "uptake" column (in this case, it is a
numeric vector).

• unique(CO2$Treatment) displays the unique values in the "Treatment" column.

• subset(CO2, conc == min(CO2$conc)) creates a subset of the dataset where the
"conc" column is equal to the minimum value of the "conc" column.

Try running these commands one by one in RStudio and observe the output.
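The subset() call can also be written with logical indexing, which makes the row selection explicit. The sketch below compares the two forms; the variable names are illustrative.

```r
# Rows where conc equals its minimum, using subset()
low_conc_subset <- subset(CO2, conc == min(conc))

# The same selection with logical indexing: keep rows where the condition is TRUE
low_conc_bracket <- CO2[CO2$conc == min(CO2$conc), ]

# Both approaches select the same number of rows
nrow(low_conc_subset) == nrow(low_conc_bracket)

# Count the selected rows per treatment group
table(low_conc_subset$Treatment)
```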

4.2 Plotting data in RStudio


We can plot the data. Below are some examples of creating simple plots for the built-in CO2
and Orange datasets. In both cases, the resulting plot appears on the Plots tab and can be
exported using the Export button of that tab:


4.2.1 Creating a Histogram


To create a histogram, use the following code:

hist(CO2$uptake)


The hist() function draws a histogram of the values in the uptake column of the CO2 data
frame; the CO2$uptake syntax selects that column. The resulting histogram shows the
distribution of the uptake values.

4.2.2 Creating a Scatter Plot

To create a scatter plot, use the following code:

plot(Orange$age, Orange$circumference)

The plot() function is used to create a scatter plot of the age and circumference variables
from the Orange dataset. The Orange dataset is a built-in dataset in R that contains
measurements of the circumference of orange trees at different ages.
We can enhance the aesthetics of the previous scatter plot by tuning a few parameters
available in the basic plot() function:

plot(Orange$age, Orange$circumference,
xlab = "Age",
ylab = "Circumference",
main = "Circumference vs. Age",
col = "blue",
pch = 16)
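Besides the Export button, a plot can be written to a file from code. The sketch below sends the same scatter plot to a PDF in a temporary folder using the base pdf() device; the file name and page size are arbitrary illustrative choices.

```r
# Path for the output file (temporary folder, illustrative file name)
pdf_path <- file.path(tempdir(), "orange_scatter.pdf")

# Open a PDF graphics device; width and height are in inches
pdf(file = pdf_path, width = 7, height = 5)

plot(Orange$age, Orange$circumference,
     xlab = "Age",                     # x-axis label
     ylab = "Circumference",           # y-axis label
     main = "Circumference vs. Age",   # plot title
     col = "blue",                     # point colour
     pch = 16)                         # plotting symbol 16: filled circle

# Close the device so the file is written to disk
dev.off()

file.exists(pdf_path)
```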

4.3 Creating data from scratch in R


To create a vector:

oceans <- c("Arctic", "Atlantic", "Indian", "Pacific", "Southern")


avg_depth <- c(1.2, 3.65, 3.74, 3.97, 3.27)

To create a dataframe:

oceans_depth <- data.frame(oceans, avg_depth)

Printing out the result:

print(oceans_depth)
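Once the data frame exists, its columns can be accessed by name and used to sort or filter rows. The sketch below repeats the definitions above so that it runs on its own; the variable name deepest is illustrative.

```r
oceans <- c("Arctic", "Atlantic", "Indian", "Pacific", "Southern")
avg_depth <- c(1.2, 3.65, 3.74, 3.97, 3.27)
oceans_depth <- data.frame(oceans, avg_depth)

# Structure: column names are taken from the vector names
str(oceans_depth)

# Access a single column with $
oceans_depth$avg_depth

# Name of the ocean with the largest average depth
deepest <- as.character(oceans_depth$oceans[which.max(oceans_depth$avg_depth)])
print(deepest)

# Rows sorted from deepest to shallowest
oceans_depth[order(oceans_depth$avg_depth, decreasing = TRUE), ]
```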


3. Basic Arithmetic and Variable Assignment


2 + 3   # Addition
5 - 1   # Subtraction
4 * 2   # Multiplication
8 / 2   # Division
2^3     # Exponentiation

# Variable assignment
x <- 10
y = 5
z <- x + y
print(z)

4. Data Types in R
Vectors
ages <- c(23, 45, 34, 28)
names <- c("Alice", "Bob", "Carol", "Dave")

Matrices
m <- matrix(1:9, nrow = 3, ncol = 3)

Lists
lst <- list(name = "Paul", age = 30, scores = c(89, 76, 92))

Data Frames
df <- data.frame(Name = c("Tom", "Sue"), Age = c(25, 28), Score = c(88, 91))
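Each of these structures has its own way of accessing elements. The sketch below repeats the definitions above so that it runs on its own and then shows one or two typical access patterns per type.

```r
ages <- c(23, 45, 34, 28)
m <- matrix(1:9, nrow = 3, ncol = 3)
lst <- list(name = "Paul", age = 30, scores = c(89, 76, 92))
df <- data.frame(Name = c("Tom", "Sue"), Age = c(25, 28), Score = c(88, 91))

# Vector: index by position (R counts from 1)
ages[2]

# Matrix: index by [row, column]; matrix() fills column by column by default
m[2, 3]

# List: access elements by name with $ or [[ ]]
lst$scores
lst[["age"]]

# Data frame: a column with $, or rows that satisfy a condition
df$Age
high_scores <- df[df$Score > 90, ]
print(high_scores)
```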


5. Simple Plotting in R
Line Plot
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)
plot(x, y, type = "b", col = "blue", main = "Line Plot", xlab = "X",
     ylab = "Y")

Histogram
grades <- c(56, 67, 78, 78, 85, 90, 91, 70, 88)
hist(grades, main = "Histogram of Grades", col = "green")

Bar Plot
subjects <- c("Math", "Finance", "Biomath")
scores <- c(80, 90, 85)
barplot(scores, names.arg = subjects, col = "orange", main = "Average Scores")

6. Applied Examples from Different Fields


Financial Mathematics
interest_rates <- c(3.2, 3.4, 3.3, 3.5)
barplot(interest_rates, main = "Quarterly Interest Rates", col = "purple")

Survival Analysis
surv_times <- c(5.2, 8.1, 3.9, 10.5, 6.3)
hist(surv_times, main = "Distribution of Survival Times", col = "red")


Actuarial Science
ages <- c(25, 30, 35, 40, 45)
premiums <- c(100, 120, 150, 180, 220)
plot(ages, premiums, type = "b", main = "Insurance Premiums by Age",
     col = "blue")

7. Homework / Lab Tasks


• Install R and RStudio

• Create an R script with 5 arithmetic expressions

• Create one of each: vector, matrix, list, and data frame

• Generate one plot using plot(), hist(), or barplot()

5 Conclusion
R and RStudio offer powerful tools for data analysis, visualization, and reporting.
Understanding the basics of syntax, data structures, and core functions will enable effective
data manipulation and analysis.
