0% found this document useful (0 votes)

11 views13 pages

Davppractical Solution

The document outlines various tasks to be performed using Python with libraries such as NumPy and Pandas. It includes computations like mean, variance, and standard deviation for arrays, operations on Pandas Series, DataFrame manipulations, and data visualizations using the Iris dataset. Additionally, it covers tasks related to the Titanic dataset, such as analyzing passenger demographics and fare statistics.

Uploaded by

rmelaap

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views13 pages

Davppractical Solution

Uploaded by

rmelaap

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

1. Write programs in Python using NumPy library to do the following: a.

Compute the mean, standard

deviation, and variance of a two dimensional random integer array along the second axis.

b. Create a 2-dimensional array of size m x n integer elements, also print the shape, type and data type of
the array and then reshape it into an n x m array, where n and m are user inputs given at the run time.

c. Test whether the elements of a given 1D array are zero, non-zero and NaN. Record the indices of these
elements in three separate arrays.

d. Create three random arrays of the same size: Array1, Array2 and Array3. Subtract Array 2 from Array3
and store in Array4. Create another array Array5 having two times the values in Array1. Find Co-variance and
Correlation of Array1 with Array4 and Array5 respectively.

e. Create two random arrays of the same size 10: Array1, and Array2. Find the sum of the first half of both
the arrays and product of the second half of both the arrays.

2. Do the following using PANDAS Series:

a. Create a series with 5 elements. Display the series sorted on index and also sorted on values
seperately
b. Create a series with N elements with some duplicate values. Find the minimum and maximum
ranks assigned to the values using ‘first’ and ‘max’ methods
c. Display the index value of the minimum and maximum element of a Series

import pandas as pd

# a. Create a series with 5 elements

s1 = [Link]([45, 12, 78, 23, 56], index=['e', 'a', 'd', 'b', 'c'])

print("Original Series:")

print(s1)

# Sorted by index

print("\nSeries Sorted by Index:")

print(s1.sort_index())
# Sorted by values

print("\nSeries Sorted by Values:")

print(s1.sort_values())

# b. Create a series with N elements containing duplicate values

s2 = [Link]([50, 20, 50, 40, 20, 60, 40])

print("\nSeries with Duplicate Values:")

print(s2)

# Ranking using 'first' method

print("\nRank using 'first' method:")

print([Link](method='first'))

# Ranking using 'max' method

print("\nRank using 'max' method:")

print([Link](method='max'))

# c. Display index of minimum and maximum element

s3 = [Link]([34, 67, 12, 89, 45], index=['A', 'B', 'C', 'D', 'E'])

print("\nSeries:")

print(s3)

print("\nIndex of Minimum Value:", [Link]())

print("Index of Maximum Value:", [Link]())

3. Create a data frame having at least 3 columns and 50 rows to store numeric data generated using
a random function. Replace 10% of the values by null values whose index positions are generated using
random function. Do the following: a. Identify and count missing values in a data frame.
b. Drop the column having more than 5 null values.
c. Identify the row label having maximum of the sum of all values in a row and drop that row.
d. Sort the data frame on the basis of the first column.
e. Remove all duplicates from the first column.
f. Find the correlation between first and second column and covariance between second and third
column.
g. Discretize the second column and create 5 bins.

import pandas as pd

import numpy as np

# Create DataFrame with 50 rows and 3 columns using random numbers

[Link](1)

df = [Link]([Link](1,100,size=(50,3)),

columns=['Col1','Col2','Col3'])

print("Original DataFrame:")

print(df)

# Replace 10% values with NaN at random positions

for i in range(int([Link] * 0.1)): # 10% of total values

r = [Link](0,50)

c = [Link](0,3)

[Link][r,c] = [Link]
print("\nDataFrame after inserting NaN values:")

print(df)

# a. Identify and count missing values

print("\nMissing values in each column:")

print([Link]().sum())

# b. Drop column having more than 5 null values

df = [Link][:, [Link]().sum() <= 5]

print("\nDataFrame after dropping columns with more than 5 null values:")

print(df)

# c. Find row with maximum row sum and drop it

row_label = [Link](axis=1).idxmax()

print("\nRow with maximum sum:", row_label)

df = [Link](row_label)

print("\nDataFrame after dropping that row:")

print(df)

# d. Sort the DataFrame by first column

df = df.sort_values(by=[Link][0])

print("\nSorted DataFrame by first column:")

print(df)

# e. Remove duplicates from first column

df = df.drop_duplicates(subset=[Link][0])

print("\nAfter removing duplicates from first column:")

print(df)
# f. Correlation between first and second column

if len([Link]) >= 2:

print("\nCorrelation between first and second column:")

print(df[[Link][0]].corr(df[[Link][1]]))

# Covariance between second and third column

if len([Link]) >= 3:

print("\nCovariance between second and third column:")

print(df[[Link][1]].cov(df[[Link][2]]))

# g. Discretize second column into 5 bins

if len([Link]) >= 2:

bins = [Link](df[[Link][1]], 5)

print("\nDiscretized second column into 5 bins:")

print(bins)

4. Consider two excel files having attendance of two workshos. Each file has three fields ‘Name’,
‘Date, duration (in minutes) where names are unique within a file. Note that duration may take one of three
values (30, 40, 50) only. Import the data into two data frames and do the following: a. Perform merging of
the two data frames to find the names of students who had attended both workshops.
b. Find names of all students who have attended a single workshop only.
c. Merge two data frames row-wise and find the total number of records in the data frame.
d. Merge two data frames row-wise and use two columns viz. names and dates as multi-row indexes.
Generate descriptive statistics for this hierarchical data frame.

import pandas as pd

# Import Excel files into DataFrames

df1 = pd.read_excel("[Link]")

df2 = pd.read_excel("[Link]")
print("Workshop 1 Data:")

print(df1)

print("\nWorkshop 2 Data:")

print(df2)

# a. Students who attended both workshops

both = [Link](df1, df2, on="Name")

print("\nStudents who attended both workshops:")

print(both["Name"])

# b. Students who attended only one workshop

only_df1 = df1[~df1["Name"].isin(df2["Name"])]

only_df2 = df2[~df2["Name"].isin(df1["Name"])]

single = [Link]([only_df1, only_df2])

print("\nStudents who attended only one workshop:")

print(single["Name"])

# c. Merge row-wise and count total records

combined = [Link]([df1, df2], axis=0)

print("\nTotal number of records after row-wise merge:")

print(len(combined))

# d. Row-wise merge and create multi-row index (Name, Date)

hier_df = combined.set_index (["Name", "Date"])

print("\nHierarchical DataFrame:")
print(hier_df)

print("\nDescriptive Statistics:")

print(hier_df.describe())

5. Using Iris data, plot the following with proper legend and axis labels: (Download IRIS data from:
[Link] or import it from sklearn datasets)
a. Plot bar chart to show the frequency of each class label in the data.
b. Draw a scatter plot for Petal width vs sepal width and fit a regression line
c. Plot density distribution for feature petal length.
d. Use a pair plot to show pairwise bivariate distribution in the Iris Dataset.
e. Draw heatmap for the four numeric attributes
f. Compute mean, mode, median, standard deviation, confidence interval and standard error for
each feature
g. Compute correlation coefficients between each pair of features and plot heatmap

import pandas as pd

import seaborn as sns

import [Link] as plt

from [Link] import load_iris

from scipy import stats

import numpy as np

# Load Iris dataset

iris = load_iris(as_frame=True)

df = [Link]

df['species'] = df['target'].map(dict(enumerate(iris.target_names)))

print("First 5 rows of dataset:")

print([Link]())

# a. Bar chart for frequency of each class label

[Link]()

df['species'].value_counts().plot(kind='bar')

[Link]("Frequency of each Iris Class")

[Link]("Species")

[Link]("Frequency")

[Link](["Count"])

[Link]()

# b. Scatter plot: Petal width vs Sepal width with regression line

[Link]()

[Link](x='sepal width (cm)', y='petal width (cm)', data=df)

[Link]("Petal Width vs Sepal Width")

[Link]("Sepal Width")

[Link]("Petal Width")

[Link]()

# c. Density distribution for Petal Length

[Link]()

[Link](df['petal length (cm)'], fill=True)

[Link]("Density Distribution of Petal Length")

[Link]("Petal Length")

[Link]()

# d. Pair plot

[Link](df, hue="species")

[Link]()
# e. Heatmap for numeric attributes

[Link]()

numeric_df = [Link][:,0:4]

[Link](numeric_df.corr(), annot=True)

[Link]("Heatmap of Numeric Attributes")

[Link]()

# f. Statistical measures

print("\nStatistical Measures:")

for col in numeric_df.columns:

mean = numeric_df[col].mean()

median = numeric_df[col].median()

mode = numeric_df[col].mode()[0]

std = numeric_df[col].std()

sem = [Link](numeric_df[col])

ci = [Link](0.95, len(numeric_df[col])-1, loc=mean, scale=sem)

print(f"\nFeature: {col}")

print("Mean:", mean)

print("Median:", median)

print("Mode:", mode)

print("Standard Deviation:", std)

print("Standard Error:", sem)

print("95% Confidence Interval:", ci)

# g. Correlation coefficients and heatmap

corr = numeric_df.corr()

print("\nCorrelation Matrix:")
print(corr)

[Link]()

[Link](corr, annot=True, cmap="coolwarm")

[Link]("Correlation Heatmap of Iris Features")

[Link]()

6. Consider the following data frame containing a family name, gender of the family member and
her/his monthly income in each record.
Write a program in Python using Pandas to perform the following:
a. Calculate and display familywise gross monthly income.
b. Calculate and display the member with the highest monthly income.
c. Calculate and display monthly income of all members with income greater than Rs. 60000.00.
d. Calculate and display the average monthly income of the female members
a. Find total number of passengers with age less than 30
b. Find total fare paid by passengers of first class
c. Compare number of survivors of each passenger class
d. Compute descriptive statistics for any numeric attribute gender wise

Name Gender MonthlyIncome

Shah Male 114000.00
Vats Male 65000.00
Vats Female 43150.00
Kumar Female 69500.00
Vats Female 155000.00
Kumar Male 103000.00
Shah Male 55000.00
Shah Female 112400.00
Kumar Female 81030.00
Vats Male 71900.00

import pandas as pd

# Creating the DataFrame

data = {
"Name": ["Shah", "Vats", "Vats", "Kumar", "Vats", "Kumar", "Shah", "Shah", "Kumar", "Vats"],

"Gender": ["Male", "Male", "Female", "Female", "Female", "Male", "Male", "Female", "Female", "Male"],

"MonthlyIncome": [114000.00, 65000.00, 43150.00, 69500.00, 155000.00, 103000.00, 55000.00, 112400.00,

81030.00, 71900.00]

df = [Link](data)

# a. Familywise gross monthly income

family_income = [Link]("Name")["MonthlyIncome"].sum()

print("Familywise Gross Monthly Income:")

print(family_income)

# b. Member with the highest monthly income

highest_income = [Link][df["MonthlyIncome"].idxmax()]

print("\nMember with Highest Monthly Income:")

print(highest_income)

# c. Members with income greater than Rs. 60000

income_above_60k = df[df["MonthlyIncome"] > 60000]

print("\nMembers with Monthly Income greater than Rs. 60000:")

print(income_above_60k)

# d. Average monthly income of female members

avg_female_income = df[df["Gender"] == "Female"]["MonthlyIncome"].mean()

print("\nAverage Monthly Income of Female Members:")

print(avg_female_income)

7. Load Titanic dataset from sklearn liabrary and do the following:

a. Find total number of passengers with age less than 30
b. Find total fare paid by passengers of first class
c. Compare number of survivors of each passenger class
d. Compute descriptive statistics for any numeric attribute genderwise
import pandas as pd
from [Link] import fetch_openml

# Load Titanic dataset

titanic = fetch_openml(name='titanic', version=1, as_frame=True)
df = [Link]

print("First 5 records of dataset:")

print([Link]())

# a. Total number of passengers with age < 30

age_less_30 = df[df['age'] < 30]
print("\nTotal passengers with age less than 30:", len(age_less_30))

# b. Total fare paid by passengers of first class

first_class_fare = df[df['pclass'] == 1]['fare'].sum()
print("\nTotal fare paid by passengers of first class:", first_class_fare)

# c. Compare number of survivors of each passenger class

survivors_by_class = df[df['survived'] == 1].groupby('pclass')['survived'].count()
print("\nNumber of survivors in each passenger class:")
print(survivors_by_class)

# d. Descriptive statistics for a numeric attribute (Fare) gender-wise

gender_stats = [Link]('sex')['fare'].describe()
print("\nDescriptive statistics of Fare gender-wise:")
print(gender_stats)

Data Visualization Practical by Vanshika Goyal
No ratings yet
Data Visualization Practical by Vanshika Goyal
31 pages
Data Visualization Practicals in Python
No ratings yet
Data Visualization Practicals in Python
31 pages
Variance and Data Analysis with NumPy
No ratings yet
Variance and Data Analysis with NumPy
28 pages
Data Manipulation and Visualization with Pandas
No ratings yet
Data Manipulation and Visualization with Pandas
46 pages
Python NumPy and Pandas Practical Tasks
No ratings yet
Python NumPy and Pandas Practical Tasks
33 pages
Python Data Science Comprehensive Cheat Sheet
No ratings yet
Python Data Science Comprehensive Cheat Sheet
19 pages
Data Science Practical Assignments
No ratings yet
Data Science Practical Assignments
21 pages
Python for Data Analysis and Visualization
No ratings yet
Python for Data Analysis and Visualization
10 pages
Python Data Analysis and Visualization Tasks
No ratings yet
Python Data Analysis and Visualization Tasks
12 pages
NumPy, Pandas, and Matplotlib Basics
No ratings yet
NumPy, Pandas, and Matplotlib Basics
30 pages
Data Science with Python: Key Techniques
No ratings yet
Data Science with Python: Key Techniques
21 pages
Machine Learning Labs: Python Code Examples
No ratings yet
Machine Learning Labs: Python Code Examples
43 pages
Machine Learning Practical File
No ratings yet
Machine Learning Practical File
31 pages
Python Data Analysis and Visualization
No ratings yet
Python Data Analysis and Visualization
14 pages
Data Analysis and Visualization Course
No ratings yet
Data Analysis and Visualization Course
4 pages
Data Analysis & Visualization Practical
No ratings yet
Data Analysis & Visualization Practical
26 pages
Data Analysis with Python Libraries
No ratings yet
Data Analysis with Python Libraries
18 pages
Guntur Menu: Python Data Operations
No ratings yet
Guntur Menu: Python Data Operations
7 pages
Data Analysis with Pandas and NumPy
No ratings yet
Data Analysis with Pandas and NumPy
10 pages
Excel and Data Science Practical Guide
No ratings yet
Excel and Data Science Practical Guide
47 pages
NumPy and Pandas Data Science Codes
No ratings yet
NumPy and Pandas Data Science Codes
18 pages
Generate and Analyze Synthetic Sales Data
No ratings yet
Generate and Analyze Synthetic Sales Data
11 pages
Python Data Analysis with NumPy & Pandas
No ratings yet
Python Data Analysis with NumPy & Pandas
24 pages
Data Analytics Lab Manual 2024-25
No ratings yet
Data Analytics Lab Manual 2024-25
26 pages
NumPy and Pandas Data Manipulation Guide
No ratings yet
NumPy and Pandas Data Manipulation Guide
11 pages
Data Analysis and Visualization Techniques
No ratings yet
Data Analysis and Visualization Techniques
14 pages
Python Data Analysis with Libraries
No ratings yet
Python Data Analysis with Libraries
70 pages
NumPy and Pandas Data Science Codes
No ratings yet
NumPy and Pandas Data Science Codes
2 pages
Data Manipulation with Pandas & NumPy
No ratings yet
Data Manipulation with Pandas & NumPy
9 pages
Python NumPy and Data Analysis Techniques
No ratings yet
Python NumPy and Data Analysis Techniques
26 pages
B. Sc. Programme / B. Sc. Mathematical Science SEC-1 Data Analysis Using Python Programming III 2019 Onwards
No ratings yet
B. Sc. Programme / B. Sc. Mathematical Science SEC-1 Data Analysis Using Python Programming III 2019 Onwards
4 pages
DS Output
No ratings yet
DS Output
15 pages
AI Lab Practical: BFS, DFS, EDA in Python
No ratings yet
AI Lab Practical: BFS, DFS, EDA in Python
23 pages
Excel Data Analysis Techniques Guide
No ratings yet
Excel Data Analysis Techniques Guide
27 pages
AD3301 - Dev Lab Manual
No ratings yet
AD3301 - Dev Lab Manual
27 pages
Python DataFrame Operations and Visualizations
No ratings yet
Python DataFrame Operations and Visualizations
17 pages
Data Analysis & Visualization with Python
No ratings yet
Data Analysis & Visualization with Python
24 pages
All Programs With Output
No ratings yet
All Programs With Output
9 pages
Da Pfile (Cuhp23cbb17) 111
No ratings yet
Da Pfile (Cuhp23cbb17) 111
33 pages
Final Project Class12 FormattedOutputs
No ratings yet
Final Project Class12 FormattedOutputs
55 pages
Kedar Dsbda Codes
No ratings yet
Kedar Dsbda Codes
18 pages
Keras MLP Regression with Krai 550
No ratings yet
Keras MLP Regression with Krai 550
77 pages
Data Statistical Functions in Pandas
No ratings yet
Data Statistical Functions in Pandas
12 pages
Python Data Analysis and Visualization Tasks
No ratings yet
Python Data Analysis and Visualization Tasks
28 pages
Python Data Analysis with NumPy and Pandas
No ratings yet
Python Data Analysis with NumPy and Pandas
14 pages
Data Science - BalaSunil
No ratings yet
Data Science - BalaSunil
14 pages
Data Analytics and Machine Learning Experiments
No ratings yet
Data Analytics and Machine Learning Experiments
27 pages
Python Statistical Data Analysis Tasks
No ratings yet
Python Statistical Data Analysis Tasks
6 pages
Data Cleaning and Visualization in Pandas
No ratings yet
Data Cleaning and Visualization in Pandas
5 pages
Machine Learning Course Manual
No ratings yet
Machine Learning Course Manual
36 pages
Data Analysis and Machine Learning Tasks
No ratings yet
Data Analysis and Machine Learning Tasks
1 page
ML Record PRG
No ratings yet
ML Record PRG
59 pages
Data Analysis & Visualization with Python
No ratings yet
Data Analysis & Visualization with Python
27 pages
SAI-Programs On Numpy and Pandas As On 05-01-2026
No ratings yet
SAI-Programs On Numpy and Pandas As On 05-01-2026
16 pages
P-Value Calculation in Python
No ratings yet
P-Value Calculation in Python
5 pages
Check Your Digital EC Status Online
No ratings yet
Check Your Digital EC Status Online
3 pages
IP Spoofing in Network Attacks
No ratings yet
IP Spoofing in Network Attacks
30 pages
TSB Business Current Account Terms
No ratings yet
TSB Business Current Account Terms
46 pages
Variational Autoencoder Architecture Explained
No ratings yet
Variational Autoencoder Architecture Explained
55 pages
RDSO Height Gauge Drawing Specifications
No ratings yet
RDSO Height Gauge Drawing Specifications
1 page
Nano Banana Photo Editing Guide
No ratings yet
Nano Banana Photo Editing Guide
1 page
Computer Science Grade XII Exam Questions
No ratings yet
Computer Science Grade XII Exam Questions
2 pages
Power Electronics Lecture Notes
No ratings yet
Power Electronics Lecture Notes
197 pages
Huge Page Caculator Tuning Red Hat Enterprise Linux For Oracle and Oracle RAC Performance - Red Hat Customer Portal
100% (1)
Huge Page Caculator Tuning Red Hat Enterprise Linux For Oracle and Oracle RAC Performance - Red Hat Customer Portal
22 pages
Overview of ASIC Design and VLSI in India
No ratings yet
Overview of ASIC Design and VLSI in India
463 pages
Millenium USB Charger Overview
No ratings yet
Millenium USB Charger Overview
13 pages
Overview of Mobile GIS Technology
No ratings yet
Overview of Mobile GIS Technology
15 pages
Daikin VRV Commissioning Guidelines
No ratings yet
Daikin VRV Commissioning Guidelines
2 pages
Quick Call Guide: Distress
No ratings yet
Quick Call Guide: Distress
5 pages
IC Auto Repair Receipt 11290 PDF
No ratings yet
IC Auto Repair Receipt 11290 PDF
1 page
Drawing Rectangles with Python Turtle
No ratings yet
Drawing Rectangles with Python Turtle
6 pages
Class IX Mathematics Exam Paper
No ratings yet
Class IX Mathematics Exam Paper
4 pages
Microsoft Word: Inserting Images & Screenshots
No ratings yet
Microsoft Word: Inserting Images & Screenshots
4 pages
Counting Techniques in Passwords
No ratings yet
Counting Techniques in Passwords
19 pages
Landscan
No ratings yet
Landscan
4 pages
SA37/T DRS71S4 Gearmotor Specifications
No ratings yet
SA37/T DRS71S4 Gearmotor Specifications
2 pages
Foam Injection Inspection Plan
No ratings yet
Foam Injection Inspection Plan
1 page
Object-Oriented Metrics Overview
No ratings yet
Object-Oriented Metrics Overview
28 pages
User Manual: in Case You Needed One
No ratings yet
User Manual: in Case You Needed One
178 pages
Feature-Constrained Stereo Image Matching
No ratings yet
Feature-Constrained Stereo Image Matching
5 pages
Hardware Acceleration of FIR Filter Implementation On ZYNQ SoC
No ratings yet
Hardware Acceleration of FIR Filter Implementation On ZYNQ SoC
6 pages
Maharashtra Labour Index Report 2024
No ratings yet
Maharashtra Labour Index Report 2024
3 pages
PRINCE2 Process and Product Quiz
No ratings yet
PRINCE2 Process and Product Quiz
13 pages
2021 Ford F-150 Keyless Entry Install Guide
No ratings yet
2021 Ford F-150 Keyless Entry Install Guide
12 pages
Grounding and Bonding for Flammable Tanks
No ratings yet
Grounding and Bonding for Flammable Tanks
4 pages