dplyr Functions and Data Manipulation in R

The document contains a series of multiple-choice questions and answers related to the dplyr package in R, focusing on data manipulation functions such as filter(), select(), mutate(), and various types of joins. It highlights the purpose of these functions and their usage in data analysis. Additionally, it covers concepts related to handling missing values and outliers in R.

Uploaded by

Jeya preetha
Copyright
© All Rights Reserved

1. What is the primary purpose of the dplyr package in R?

A Data visualization B Data manipulation C Statistical modeling
D File input/output Answer: B

2. Which operator is used in dplyr to chain multiple functions together?


A -> B => C %>% D :: Answer: C

3. What does the filter() function do in dplyr?


A Reorders rows B Selects specific columns
C Filters rows based on condition D Combines two data frames Answer: C

4. What is the output of select(data, Name, Salary)?


A A list of all columns B Only the rows with Name and Salary greater than average
C Only the Name and Salary columns D Sorted rows by Salary Answer: C

5. Which function is used to add a new column to a data frame?


A select() B filter() C mutate() D arrange() Answer: C

6. What does summarise() (or summarize()) do in dplyr?


A Combines data sets B Sorts rows C Calculates summary statistics
D Adds new columns Answer: C

7. What is the purpose of group_by() in dplyr?


A To merge multiple data frames B To convert columns to rows
C To group data before applying summary functions D To visualize grouped data
Answer: C

8. How would you arrange rows in descending order of Salary?


A arrange(df, Salary) B arrange(df, -Salary)
C arrange(df, desc(Salary)) D filter(df, Salary) Answer: C (B also sorts a numeric Salary in descending order, but desc() works for any column type)

9. What type of join keeps only matching rows from both data frames?
A left_join() B full_join() C inner_join()
D right_join() Answer: C
10. Which dplyr function is best used for selecting specific columns from a data frame?
A mutate() B summarise() C select() D filter() Answer: C

11. What is the result of using %>% with multiple functions?


A Creates a list B Chains commands in sequence
C Filters only numeric columns D Returns NULL Answer: B

12. In dplyr, which family of functions would you use to combine two data frames by a common column?
A bind() B merge() C join functions such as inner_join() D group_by() Answer: C

13. Which function is typically used immediately after group_by()?


A select() B summarise() C mutate() D arrange() Answer: B

14. Which of the following is NOT a join function in dplyr?


A inner_join() B left_join() C outer_join() D full_join()
Answer: C

15. Which collection of packages is dplyr a part of?


A ggplot2 B tidyverse C shiny D Rcpp Answer: B
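
The grouping and summarising questions above can be illustrated with one short pipeline. This is a minimal sketch, assuming dplyr is installed; the `employees` data frame, its columns (Name, Dept, Salary), and the `Avg_Salary` name are hypothetical and made up for illustration.

```r
library(dplyr)

# Hypothetical example data; names and values are invented for illustration.
employees <- data.frame(
  Name   = c("Asha", "Ben", "Chen", "Dana"),
  Dept   = c("HR", "IT", "IT", "HR"),
  Salary = c(3000, 4500, 5000, 3500)
)

# group_by() defines the groups, summarise() collapses each group to one row,
# and arrange(desc()) sorts the result from highest to lowest average.
dept_summary <- employees %>%
  group_by(Dept) %>%
  summarise(Avg_Salary = mean(Salary)) %>%
  arrange(desc(Avg_Salary))

print(dept_summary)
```

Each `%>%` passes the result on its left as the first argument of the function on its right, so the steps read top to bottom in the order they run.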
1. Which function in dplyr is used to choose specific columns?
A filter() B mutate() C select() D arrange() Answer: C

2. What does the mutate() function do in R?


A Selects rows B Adds or modifies columns C Sorts data
D Removes duplicates Answer: B
3. What is the purpose of filter() in dplyr?
A Add columns B Reorder columns C Subset rows based on condition
D Combine datasets Answer: C

4. Which function is used to reorder rows based on column values?


A filter() B summarise() C select() D arrange() Answer: D

5. How do you calculate summary statistics in dplyr?


A group_by() B summarise() C select() D mutate() Answer: B

6. What does group_by() do?


A Sorts values B Converts data types
C Groups data for summary operations D Deletes duplicate rows Answer: C

7. Which function would you use to calculate the average salary?


A mutate(Salary) B arrange(Salary) C summarise(mean(Salary))
D select(Salary) Answer: C

8. What does the `%>%` operator do in R?


A Negation B Exponentiation C Chains functions together
D Defines new variables Answer: C

9. What is a benefit of using the pipe operator `%>%`?


A Makes nested functions less readable B Makes code longer
C Improves code readability D Used only in base R Answer: C

10. What is the output of `select(data, Name, Salary)`?


A Filters by name and salary B Adds new columns
C Only columns Name and Salary D Sorts by Salary Answer: C
11. What function is typically used after group_by()?
A arrange() B select() C mutate() D summarise() Answer: D

12. What function is used to add a new column called "Annual_Salary" as `Salary * 12`?
A summarise() B select() C filter() D mutate() Answer: D

13. What is the purpose of arrange(desc(Salary))?


A Filters rows by Salary B Sorts Salary in ascending order
C Sorts Salary in descending order D Removes missing values Answer: C

14. Which function filters rows where Age > 30?


A arrange(data, Age > 30) B mutate(data, Age > 30)
C select(data, Age > 30) D filter(data, Age > 30) Answer: D

15. What function is used to select only Name and Salary from a data frame?
A select(data, Name, Salary) B mutate(Name, Salary)
C filter(Name, Salary) D arrange(Name, Salary) Answer: A
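
The individual calls quoted in the questions above (filter(data, Age > 30), mutate() with Salary * 12, select(data, Name, Salary), arrange(desc(Salary))) can be tried on a small table. This is a sketch under stated assumptions: dplyr is installed, and the `data` frame with its values is hypothetical.

```r
library(dplyr)

# Hypothetical example data; the quiz names the columns but gives no values.
data <- data.frame(
  Name   = c("Asha", "Ben", "Chen", "Dana"),
  Age    = c(28, 35, 42, 31),
  Salary = c(3000, 4500, 5000, 3500)
)

result <- data %>%
  filter(Age > 30) %>%                      # keep rows where Age > 30
  mutate(Annual_Salary = Salary * 12) %>%   # add a new column
  select(Name, Salary, Annual_Salary) %>%   # keep only these columns
  arrange(desc(Salary))                     # highest Salary first

# summarise() reduces the whole table to a single summary row.
avg <- data %>% summarise(Avg_Salary = mean(Salary))

print(result)
print(avg)
```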
1. What is the purpose of data blending and joining in R?
A To visualize data B To clean outliers
C To combine datasets for analysis D To export data Answer: C

2. Which R package provides join functions like inner_join and left_join?


A ggplot2 B dplyr C tidytext D tibble Answer: B

3. What does an inner join return?


A Only unmatched rows B All rows from both tables
C Only matching rows from both tables D Rows with NULL values only
Answer: C

4. Which join includes all rows from the left table and matching rows from the right?
A inner_join() B left_join() C right_join() D full_join()
Answer: B

5. What does right_join() return?


A Only right table rows B Matching rows from left table only
C All rows from right table and matched rows from left
D Combined sorted table Answer: C

6. What is the result of full_join() in dplyr?


A Intersection of both tables B Only left table rows
C All rows from both tables D Only right table rows Answer: C

7. Which join returns rows from the left table that match rows in the right table, but excludes right table columns?
A anti_join() B semi_join() C full_join() D inner_join()
Answer: B

8. What does anti_join() return?


A Matching rows from both tables B All right table rows
C Unmatched rows from the left table D Full combination of rows Answer: C
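
The six join types described in the questions above differ only in which rows and columns they keep. The following is a minimal sketch, assuming dplyr is installed; the `staff` and `pay` tables and the shared `ID` key are hypothetical.

```r
library(dplyr)

# Hypothetical tables sharing an ID key; ID 1 is only in staff, ID 4 only in pay.
staff <- data.frame(ID = c(1, 2, 3), Name = c("Asha", "Ben", "Chen"))
pay   <- data.frame(ID = c(2, 3, 4), Salary = c(4500, 5000, 3500))

inner <- inner_join(staff, pay, by = "ID")  # only IDs present in both tables
left  <- left_join(staff, pay, by = "ID")   # all staff rows; Salary is NA if unmatched
right <- right_join(staff, pay, by = "ID")  # all pay rows; Name is NA if unmatched
full  <- full_join(staff, pay, by = "ID")   # all rows from both tables
semi  <- semi_join(staff, pay, by = "ID")   # matching staff rows, staff columns only
anti  <- anti_join(staff, pay, by = "ID")   # staff rows with no match in pay

print(full)
```

semi_join() and anti_join() are filtering joins: they never add columns from the right table, they only decide which left-table rows survive.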
9. Which statistical methods can be used to detect outliers?
A Z-scores and IQR B Mean and mode C Median filter
D Chi-square test Answer: A

10. What graphical method is commonly used to detect outliers?


A Pie chart B Histogram C Boxplot
D Bar graph Answer: C

11. How can you treat outliers using transformation?


A Drop them B Replace with NA C Apply log or sqrt
D Sort them Answer: C

12. What function is used to check for missing values in R?


A [Link]() B is.na() C [Link]() D [Link]() Answer: B

13. Which function removes rows with missing values?


A [Link]() B [Link]() C na.omit() D [Link]() Answer: C

14. What is mean imputation for missing data?


A Remove all NAs B Replace with 0
C Replace with mean of column D Use a random number Answer: C

15. Which R package can be used for multiple imputation of missing data?
A ggplot2 B mice C tidyverse D stringr Answer: B
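
The missing-value and outlier techniques from the questions above can be sketched in base R. This example assumes a hypothetical `salary` vector with one NA and one extreme value; it uses the base functions is.na() and na.omit(), mean imputation, the common 1.5 * IQR rule, and a log transform. Multiple imputation with the mice package is not shown.

```r
# Hypothetical data: one missing value and one extreme value.
salary <- c(3000, 3200, NA, 3100, 2900, 15000)
df <- data.frame(ID = 1:6, Salary = salary)

# is.na() flags missing values; sum() counts them.
n_missing <- sum(is.na(salary))

# na.omit() drops every row that contains at least one NA.
complete <- na.omit(df)

# Mean imputation: replace each NA with the mean of the observed values.
imputed <- ifelse(is.na(salary), mean(salary, na.rm = TRUE), salary)

# IQR rule: flag values more than 1.5 * IQR outside the quartiles.
q   <- quantile(salary, c(0.25, 0.75), na.rm = TRUE)
iqr <- q[2] - q[1]
is_outlier <- !is.na(salary) &
  (salary < q[1] - 1.5 * iqr | salary > q[2] + 1.5 * iqr)
outliers <- salary[is_outlier]

# A log transform compresses large values instead of dropping them.
log_salary <- log(salary)

print(outliers)
```

A boxplot(salary) call draws the same 1.5 * IQR fences graphically, which is why boxplots are a common visual check for outliers.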
