0% found this document useful (0 votes)
18 views21 pages

DataFrame Operations in Python

Uploaded by

ruhi61047
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views21 pages

DataFrame Operations in Python

Uploaded by

ruhi61047
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

1.

Consider a set of information for an Exam conducted for students with following
details.
Names Marks Trails Passed
0 Saniya 95 2 Yes
1 Kris 70 3 No
2 Rishav 96.5 1 Yes
3 Deepak 75 2 No
4 Kirti 92 1 yes
Code to create the above DataFrame :
import pandas as pd
dict={"Names":["Saniya" , "Kris" , "Rishav" , "Deepak" , "Kirti"],
"Marks":[95,70,96.5,75,92],
"Trails":[2,3,1,2,1],
"Passed":["Yes","No","Yes","No","Yes"]}
exam=[Link](dict)
print(exam)

For the above DataFrame exam, write single line statements for each of the following
parts :

(a) To display the 'Names' and 'Marks' columns from the DataFrame.
Ans : print(exam[["Names","Marks"]])

(b) To change the 'Marks' in the 4th row (i.e. for index 3) to 91.5.
Ans : [Link][3, "Marks"]=91.5

(c) To display the rows where number of 'Trials' in the examination is less than 2.
Ans : print(exam[exam["Trails"]<2])

(d) To display the maximum value of the column Marks.


Ans : print(exam["Marks"].max())

(e) Rename the column name 'Trails' to 'Attempt'.


Ans : exam=[Link](columns={"Trails":"Attempt"})
print(exam)
2. Consider the following DataFrame : climate to record the data about climatic
conditions about four years.
Year Max Temp Min Temp Rainfall
0 2017 32 20 123
1 2018 33 22 140
2 2019 35 21 135
3 2020 34 23 160

Code to create the above DataFrame :


import pandas as pd
dict={"Year" : [2017,2018,2019,2020],
"MaxTemp" : [32,33,35,34],
"MinTemp" : [20,22,21,23],
"Rainfall" : [123,140,135,160]}
climate=[Link](dict)
print(climate)

(i) What will be the output of the following?

(a) print([Link][1:3,1:2])

MaxTemp
1 33
2 35

(b) print([Link](2))

Year MaxTemp MinTemp Rainfall


0 2017 32 20 123
1 2018 33 22 140

(ii) Write the Python code to display the temperature difference between
MaxTemp and MinTemp for all the rows in the DataFrame Climate.
Ans : print(climate["MaxTemp"] - climate["MinTemp"])

(iii) List 1st, 2nd and 3rd rows.


Ans : print([Link][0:2]) OR print([Link][0:3]) OR print(climate[0:3])

(iv) The exact number of values in each column of the DataFrame?


Ans : print([Link]())
3. Consider the DataFrame df shown below :

Year Month Passengers


0 2010 Jan 25
1 2010 Mar 50
2 2012 Jan 35
3 2010 Dec 55
4 2012 Dec 65
Code to create the above DataFrame :

import pandas as pd
dt={"Year":[2010,2010,2012,2010,2012],
"Month":["Jan","Mar","Jan" ,"Dec" ,"Dec"],
"Passengers":[25,50,35,55,65]}
df=[Link](dt)
print(df)
(i) What will be the output of the following statements?
(a) [Link]() Ans : (5 , 3)

(b) [Link]=["AirIndia", "Indigo", "Spicejet", "Jet", "Emirates"]


Ans : Year Month Passengers
AirIndia 2010 Jan 25
Indigo 2010 Mar 50
Spicejet 2012 Jan 35
Jet 2010 Dec 55
Emirates 2012 Dec 65

Note : The [Link] property can change the row index or (row labels)
(ii) Predict the output of print (df[1: 3]).
Year Month Passengers
1 2010 Mar 50
2 2012 Jan 35
(iii) Rename the column name 'Passengers' to 'Visitors'.
Ans : df=[Link](columns={"Passengers" : "Visitors"})
print(df)
(iv) Add a column 'Flights' to the DataFrame df and add common values as
"Flight Names"
Ans : df["Flights"]="Flight Names"
print(df)
4. Consider the given DataFrame "stock" :

Name Price
0 Item1 150
1 Item2 180
2 Item3 225
3 Item4 500

Code to create the above DataFrame :

import pandas as pd
dict={"Name":["Item1","Item2","Item3" ,"Item4"],
"Price":[150,180,225,500]}
df=[Link](dict)
print(df)

(i) Add a column called Special_Price with the following data: [135, 150, 200, 440).
Ans : df["Special Price"]=[135, 150, 200, 440]
print(df)

(ii) Add a new book named 'The Secret' having price 800 and special price 750,
Ans : [Link][4]=["The Secret" , 800,750]
print(df)

(iii) Remove the column Special_Price.


Ans : del df["Special Price"] OR df=[Link]("Special Price" , axis=1)
print(df) print(df)

(iv) Update the price of 'Item2' to 190.


Ans : [Link][1 , "Price"]=190
print(df)

(v) Filter the DataFrame to show only items and price with a price greater than 200.
Ans : print(df[df["Price"]>200] [["Name","Price"]])
Name Price
2 Item3 225
3 Item4 500
4 The Secret 800
5. Consider the DataFrame df shown below:

Code to create the above DataFrame :

import pandas as pd
dict={"ProductName":["Laptop","Smartphone","Chair" ,"Desk" , "Blender"],
"Category" : ["Electronics","Electronics","Furniture","Furniture","Appliances"],
"Price":[65000,31000,750,3500,2800]}
df=[Link](dict)
print(df)

Write Python statements for the DataFrame df to:

I. Print the first four rows of the DataFrame df.


Ans : print([Link](4))

II. Display the product names of all the products.


Ans : print(df["ProductName"])

III. Remove the column Category.


Ans : del df["Category"] OR df=[Link]("Category" , axis=1)
print(df) print(df)

IV. Display the data of the 'Price' column for index 2 and 3.
Ans : print([Link][[2,3],["Price"]]) OR print(df["Price"][2:4])

V. Rename the column name 'ProductName' to 'ItemName'.


Ans : df=[Link](columns={"ProductName" : "ItemName"})
print(df)
6. Ms. Jennifer Christiano wants to create a DataFrame "Football" and perform some
operations with it.

import pandas as pd
dict={"Player":["Ronaldo","Pele","Maradona"],
"Club" : ["AC Milan","Real Madrid","Royal Argentina"],
"Charges":[9.6,10.5,19.5]}
football=[Link](dict)
print(football)

(a) Write the output for the following:


print(football["Charges"]/2) print([Link])
Output : 0 4.80 Output : 9
1 5.25
2 9.75
(b) Help her to write the code for the following: To set the index as 'R' , 'P' and 'M',
respectively for the three rows.
Ans : football=[Link]({0:'R' , 1:'P' , 2:'M'})
print(football)
OR
[Link]=["R" , "P" , "M"]
print(football)

(c) To make all the charges to 11.9


Ans : football["Charges"]=11.9
print(football)

(d) Remove the column Charges.


Ans : del football["Charges"] OR football = [Link]("Charges" , axis=1)
7. Jacqueline has created the following DataFrame consisting of data of houses of a
school and the number of boys in the houses, shown as follows. She wants to
perform certain operation on the data set. Help her with the correct actions.

import pandas as pd
dict={"HouseName":["Ganga","Yamuna","Saraswati"],
"Location" : ["East","West","North"],
"Charges":[20,30,56]}
house=[Link](dict)
print(house)

(a) What will be the output of the following code?

(i) print([Link]) (ii) [Link]("Location")

Output : (3 , 3) print(House)

Output :

pop() can also be used to delete one column.

(b) To display total number of elements in the dataframe.


Ans : print([Link])

(c) She wants to see the number of columns.


Write the statement which will display the number of columns.
Ans : print(len([Link])) Output : 3
8. Write a Python program to create the following DataFrame using a list of
dictionaries.

Ans : import pandas as pd


d1 = {'Product': 'Laptop', 'Price': 60000}
d2 = {'Product': 'Desktop', 'Price': 45000}
d3 = {'Product': 'Monitor', 'Price': 15000}
d4 = {'Product': 'Tablet', 'Price': 30000}
data = [d1, d2, d3, d4]
df = [Link](data)
print(df)

9. Write a Python Program to create a Pandas Series as shown below using a


dictionary. Note that the left column indicates the indices and the right column
displays the data.

Russia Moscow
Hungary Budapest
Switzerland Bern

Ans : import pandas as pd


data = {'Russia':'Moscow' , 'Hungary':'Budapest' , 'Switzerland':'Bern'}
s = [Link](data)
print(s)

10. Consider the DataFrame df shown below.

Write Python statements for the DataFrame df to:


I Print the first two rows of the DataFrame df.
II Display titles of all the movies.
III Remove the column rating.
IV Display the data of the 'Title' column from indexes 2 to 4 (both included)
V Rename the column name 'Title' to 'Name'.
Ans :
I. print([Link](2))
II. print(df['Title'])
III. df = [Link](„Rating‟, axis=1)
IV. print([Link][2:4,'Title'])
V. df = [Link](columns={'Title':'Name'})
print(df)
OR
[Link](columns={'Title':'Name'}, inplace=True)

11. Create a DataFrame in Python from the given list:


[[„Divya‟,‟HR‟,95000],[„Mamta‟,‟Marketing‟,97000],[„Payal‟,‟IT‟,980000],
[„Deepak‟,‟Sales‟,79000]]
Also give appropriate column headings as shown below:

Ans : import pandas as pd

LIST1 =[["Divya","HR",95000],
["Mamta","Marketing",97000],
["Payal","IT",980000],
["Deepak","Sales",79000]]
df=[Link](LIST1,columns=["Name","Department", "Salary"])
print(df)

12. Consider the given DataFrame „Genre‟:


Type Code
0 Fiction F
1 Non Fiction NF
2 Drama D
3 Poetry P
Write suitable Python statements for the following:
i. Add a column called Num_Copies with the following data: [300,290,450,760].
Ans : Genre["Num_Copies"] = [300,290,450,760]

ii. Add a new genre of type „Folk Tale' having code as “FT” and 600 number of copies.
Ans : [Link][4] = ["Folk Tale","FT",600]

iii. Rename the column „Code‟ to „Book_Code‟.


Ans : Genre = [Link]({"Code":"Book_Code"}, axis=1)
OR
Genre = [Link](columns={"Code":"Book_Code"})
OR
[Link](columns={"Code":"Book_Code"} , inplace = True)

13. Ekam, a Data Analyst with a multinational brand has designed the DataFrame df
that contains the four quarter‟s sales data of different stores as shown below:
Store Qtr1 Qtr2 Qtr3 Qtr4
0 Store1 300 240 450 230
1 Store2 350 340 403 210
2 Store3 250 180 145 160
Answer the following questions:
i. Predict the output of the following python statement:
a. print([Link]) b. print(df[1:3])
Ans : 15 Ans :
Store Qtr1 Qtr2 Qtr3 Qtr4
1 Store2 350 340 403 210
2 Store3 250 180 145 160

ii. Delete the last row from the DataFrame.


Ans : df=[Link](2) OR [Link](2,axis=0)

iii. Write Python statement to add a new column Total_Sales which is the addition of
all the 4 quarter sales.
Ans : df["total"] = df["Qtr1"] + df["Qtr2"] + df["Qtr3"] + df["Qtr4"]

iv. Write Python statement to export the DataFrame to a CSV file named [Link]
stored at D: drive.
Ans : df.to_csv(“D:\[Link]”)
14. Consider the given DataFrame „Items‟:
Name Price Quantity
0 CPU 7750 15
1 Watch 475 50
2 Key Board 225 25
3 Mouse 150 20

Write suitable Python statements for the following:


i) Add a column called Sale_Price which is 10% decreased value of Price
Ans : Items['Sale_Price']=0.90 * Items[„Price‟]

ii) Add a new item named “Printer” having price 8000 and Quantity as 10.
Ans : [Link]['4']=[“Printer”, 8000, 10]

iii) Remove the column Quantity


Ans : Items=[Link]('Quantity', axis=1) OR del Items["Quantity"]

15. Riya, a Data Analyst working for a film studio, has created a DataFrame named
'movie_revenue' to store the revenue data of movies released by Viacom 18
in the year 2022. The DataFrame looks like this:
Movie Revenue (in crores)
A GangubaiKathiawadi 125
B The Kashmir Files 340
C Bhediya 75
D JugJuggJiyo 250
E Bachchhan Pandey 120

(i) Predict the output of the following python statement:


a. print(movie_revenue[„Revenue (in crores)‟].dtypes)
Ans : int64
b. print(movie_revenue.iloc[2,0])
Ans : Bhediya

(ii) Delete the Second last row from the DataFrame.


Ans : movie_revenue = movie_revenue.drop("D")

(iii) Write Python statement to export the DataFrame to a CSV file named
“hit_movies_2022.csv” in the directory named „PMDB‟ which is inside D Drive.
Ans : movie_revenue.to_csv("D:\PMDB\hit_movies_2022.csv")
16. Carefully observe the following code:
import pandas as pd
dic= {'pid': [101, 102, 103, 104, 105],
'pname': ['Shyam', 'Roushan', 'Archit', 'Medha', 'Lalit'],
'sports': ['Cricket', 'Tennis', 'Football', 'Cricket', 'Cricket'],
'points': [45000,20000, 15000, 53000, 60000]}
player=[Link] (dic)
print (player)

Write Python statements for the following:


i. In the dataframe player created above, set the row labels as :
'Player1', 'Player2', 'Player3', 'Player4', 'Player5'.
Ans : player=[Link](dic,index=['Player1','Player2','Player3','Player4','Player5'])
OR
I=['Player1','Player2','Player3','Player4','Player5']
player=[Link] (dic,index=I)
OR
player=[Link]({0:"Player1",1:"Player2", 2:"Player3",3:"Player4" , 4:"Player5"})
print(player)

ii. Rename the column 'points' to 'netpoint' in the DataFrame player.

Ans : player=[Link](columns={'points':'netpoints'})
OR
player=[Link]({'points':'netpoints'},axis=1)
OR
[Link]({'points':'netpoints'},axis=1,inplace=True)
OR
[Link](columns={'points':'netpoints'},inplace=True)
17. Kavyanjali, a chemical analyst, needs to arrange data of few elements in the form
of two series containing symbols and their atomic numbers respectively.
Thereafter, the data of these two series has to be arranged and displayed in the
form of Data Frame as shown below :

Help her in writing suitable Python code to complete the task.

Ans : import pandas as pd


I=['Hydrogen','Helium','Lithium','Beryllium']
D1=['H','He','Li','Be']
D2=[1,2,3,4]
S1=[Link](D1,index=I)
print("First Series: ")
print(S1)
S2=[Link](D2,index=I)
print("Second Series: ")
print(S2)
D = {'Symbol': S1,'Atomic Number': S2}
DF=[Link](D)
print("DataFrame:")
print(DF)

18. Consider the given DataFrame 'health'.

i. Remove the row containing details of disease named Tuberculosis.


Ans : [Link](3))

ii. Add a new disease named 'Malaria' caused by 'Protozoa'


Ans : [Link][4]=['Malaria' , 'Protozoa'] OR [Link][4 , :] = ['Malaria' , 'Protozoa']

iii. Display the last 2 rows.


Ans : print([Link](2)) OR print([Link][3:4]) OR print([Link][3:5])
19. Consider the following DataFrame : mdf

Code to create the above DataFrame :

import pandas as pd
dict= [[1, "Aditya", 23, 20, 28] , [2, "Balwant", 18, 1, 25] ,
[3, "Chirag", 27, 23, 30] , [4, "Deepak" ,11, 3, 7] ,
[5, "Eva", 17, 21, 24]]
mdf=[Link](dict , columns=["Rollno","Name","English","Hindi","Maths"])
print(mdf)

Write Python statements for the DataFrame 'mdf‟ :

i. To display the records of the students having roll numbers 2 and 3.

Ans : print(mdf[1:3]) OR print([Link][1:2]) OR print([Link][1:3])

ii. To increase the marks of subject Math by 4, for all students.

Ans : mdf['Maths']=mdf['Maths'] + 4
print(mdf)

iii. Write Python statement to display the Rollno and Name of all students who
secured less than 10 marks in Maths.

Ans : print(mdf[mdf['Maths'] <10][['Rollno','Name']])

iv. Write Python statement to display the total marks i.e., sum of marks secured in
English, Hindi and Maths for all students.

Ans : print(mdf[['English','Hindi','Maths']].sum())
OR
print(mdf[['English','Hindi','Maths']].sum(axis=0))
20. Kabir, a data analyst, has stored the voter's name and age in a dictionary. Now,
Kabir wants to create a list of dictionaries to store data of multiple voters. He also
wants to create a DataFrame from the same list of dictionaries having appropriate
row labels as shown below :

Help Kabir in writing a Python program to complete the task.

Ans :
import pandas as pd
voters = [{"Voter_Name":"Arjun", "Voter_Age":35},
{"Voter_Name":"Bala", "Voter_Age":23},
{"Voter_Name":"Govind", "Voter_Age":25},
{"Voter_Name":"Dhruv", "Voter_Age":19},
{"Voter_Name":"Navya", "Voter_Age":18}]
dfVoters=[Link](voters,index=["Ar1001","Ba3002","Go4002","Dh4007","Na6005"])
print(dfVoters)

21. Consider the given DataFrame 'password':

Write suitable Python statements for the following :

i. To add a new row with following values :


CodeName - 'abc123'
Category - alphanumeric
Frequency - 12.8

Ans : [Link][7] = ['abc123','alphanumeric',12.8]


OR
[Link][7] = ['abc123','alphanumeric',12.8]
ii. To delete the row with the row label 2.

Ans : password=[Link](2) OR password=[Link](2 , axis = 0)


OR [Link](2 , inplace=True)

iii. To delete the column having column label as Frequency.

Ans : password = [Link]("Frequency", axis=1)


OR
[Link]("Frequency", axis=1, inplace=True)
OR
del password["Frequency"]
print(password)

22. Ms. Ritika conducted an online assessment and stored the details in a DataFrame
result as given below :

i. Predict the output of the following Python statement:


print([Link] [:,'Attempts'] > 1)

Ans : a False
b True
c True
d False

ii. Write the Python statement to display the last three records.

Ans : print([Link](3))
OR
print([Link][1:4])
OR
print([Link][["b","c","d"]])

iii. Write Python statement to display records of 'a ' and 'd' row labels.

Ans : print([Link][["a","d"]]) OR print([Link][[0,3]])

iv. Write suitable Python statement to retrieve the data stored in the file,
'[Link]' into a DataFrame, 'regis'.

Ans : regis=pd.read_csv("[Link]")
23. Shobit needs to create the following two series named „Eng‟ and „Math‟. Help
him to create a DataFrame „mydata‟ from the given series „Eng‟ and „Math‟.

Ans : import pandas as pd


eng_data = {'Aditi': 25, 'bhavuk': 21, 'chirag': 23, 'deepak': 24, 'Gaurav': 27}
Eng = [Link](eng_data)
math_data = {'Aditi': 9, 'bhavuk': 29, 'chirag': 15, 'deepak': 14, 'Gaurav': 20}
Math = [Link](math_data)
mydata = [Link]({'English': Eng, 'Mathematics': Math})
print(mydata)

24. Carefully observe the following code :


import pandas as pd
product= {'prodid':[Link]([1,2,3,4,5]),
'pname':[Link](['pen','pencil','eraser','color','sharpener']),
'qty':[Link]([2,10,10,30,10]),
'price':[Link]([300,20,50,40,15])}
stock=[Link](product)
print(stock)

Write Python statements for the following :


(i) Display the names of products.
Ans : (i) print(stock['pname'])

(ii) Rename the column „price‟ to „newprice‟ in the DataFrame stock.


Ans : stock=[Link](columns={"price" : "new price"})
print(stock)
OR
[Link](columns={"price" : "new price"} , inplace=True)
print(stock)
25. Consider the given DataFrame 'result‟:
Name Percentile
0 Rohit 95
1 Mohit 76
2 Raman 98
3 Aditya 47
Write the suitable Python statements for the following :
i. Add a new column „Grade‟ to the dataframe having values A,B,A,C
Ans : result['Grade'] = ['A','B','A','C']

ii. Add a new row where Name is „Arti‟ with Percentile as 92 and Grade A.
Ans : [Link][4] = ['Arti',92,'A']

iii. Display the top 3 rows.


Ans : print([Link](3)) OR print([Link][0:2])

26. Consider the following DataFrame „employee‟:

i. To remove the column „Salary‟.


Ans : employee=[Link]('Salary', axis=1) OR del employee["Salary"]

ii. To remove the row having index 4.


Ans : employee=[Link](4) OR [Link](4 , inplace=True)

iii. Write Python statement to save the DataFrame „employee‟ to a CSV file
[Link] stored in D: drive of the computer
Ans : employee.to_csv("d:\\[Link]") OR employee.to_csv("d:/[Link]")

iv. Write a Python statement to display the new salary i.e., salary increased by
5000 for all employees.
Ans : employee['new_salary'] = employee['salary'] + 5000
print(employee['new_salary']) OR print(employee['salary']+5000)
27. (i ) Create two series – one to store various product names and the other to
store the corresponding price. Each series should have appropriate row label
as given below :

Ans : import pandas as pd


D1=['Butterscotch','Vanilla','Mango Zap','Magnum','Cassatta']
D2=[130,100,150,190,200]
I=['B1001','V3002','M4002','M4007','C6005']
S1=[Link](D1,index=I)
S2=[Link](D2,index=I)
print(S1)
print(S2)

ii. Create a dictionary containing „Product_name‟ and „Product_price‟ as keys. Add the
series created in part (i) as their corresponding values.
Ans : D={'Product_name':S1 , 'Product_price':S2}

iii. Create a DataFrame from the above created dictionary of series.


Help her in writing the Python program to accomplish the above mentioned tasks.
Ans : DF=[Link](D)

print(DF)

28. Consider the following DataFrame Cricket :


Write suitable Python statements to perform the following tasks :
i. Add a new column Rating to the DataFrame having the following values :
3, 1, 2, 4, 5
Ans : Cricket['Rating'] = [3,1,2,4,5]

ii. Change the row labels from A, B, C, D, E to Team A, Team B, Team C, Team D
and Team E.

Ans : Cricket = [Link]({"A":"Team A" , "B":"Team B" , "C":"Team C" ,


"D":"Team D", "E":"Team E"})
OR
[Link]({"A":"Team A", "B":"Team B", "C":"Team C", "D":"Team D",
"E":"Team E"}, inplace=True)

iii. Change the column label of first column from 'Won' to 'Matches won'.
Ans : Cricket = [Link](columns={"Won":"Matches won"})
OR
[Link](columns={"Won":"Matches won"}, inplace=True)

29. Ms. Shambhavi, a data analyst working on a college admission project, has
created the following DataFrame Sub_Details to store subjectwise details :

i. Write suitable Python command to display the row having index value 3.
Ans : print(Sub_Details.loc[3])

ii. Predict the output of the following Python statement :


print(Sub_Details.loc[2:3,'Total Students'])
Ans : 2 45
3 40

iii. Write suitable Python statement to display the list of various subjects along with
their corresponding seat availability.
Ans : print(Sub_Details[['Subject','Seat Availability']])
iv. Ms. Shambhavi has just created a folder named Project in the E: drive of her
computer to store necessary files related to the project. Write suitable Python
statement to export the given DataFrame into the file [Link], created inside
project folder in E: drive.
Ans : Sub_Details.to_csv("E:\project\[Link]")
OR
Sub_Details.to_csv("E:/project/[Link]")

30. Consider the following DataFrame SPORTS :


ID NAME GENDER
SD1 1 ANN F
SD2 2 RAM M
SD3 3 SITA F
SD4 4 RAJ M

Write commands to :

i. Add a new column ITEM to the Dataframe :


Ans : SPORTS[‘ITEM’] = [‘Swimming’, ‘Dancing ’, ‘Cricket’, ‘Singing’]

ii. Add a new row with values (5, SAM, M)


Ans : [Link][‘SD5’] = [5, ‘RAM’, ‘M’]

iii. Write python code to delete column gender


Ans : del SPORTS[‘GENDER’]
OR
SPORTS = [Link]("GENDER" , axis = 1)

iv. Write python statement to delete the row with index SD3.
Ans : SPORTS = [Link]("SD3"))

You might also like