Data Analysis Lab Manual for Python

This lab manual outlines the course 'Data Analysis Using Python' for Bachelor of Technology students in Computer Science and Engineering. It includes a series of experiments covering Python installation, programming basics, data structures, and data manipulation using libraries like NumPy and Pandas. The manual also features mini projects and exercises to reinforce learning through practical application.

Uploaded by

chaitanyagarg50

LAB MANUAL
For
Bachelor of Technology (AIML)
Department of Computer Science and Engineering (CSE)
Faculty of Engineering and Technology

Course Name: Data Analysis Using Python
Course Code: 130202120
Semester: 2nd

Prepared by: Chaitanya Garg (Reg. No.: 241302053)
Approved by: Ms. Shabda, AIML Trainer

S. No.  Name of Experiment                                          Date  Sign
1       Python Installation
2       Python Programming Basics
3       Python Data Structures
4       NumPy, Array & Vectorized Computation
5       Data Manipulation with Pandas
6       Data Visualization in Python using Matplotlib and Seaborn
7       Mini Projects
        • Project-1_Chess Board
        • Project-2_Quiz on Animal Name
        • Project-3_Tic Tac Toe Game

Python Installation
Installing Anaconda on Windows

Experiment No. 1
• Download the Anaconda installer from the following link
[Link]

Double click the installer to launch.


Note: If you encounter issues during installation, temporarily disable your
antivirus software during the install, then re-enable it after the installation
concludes. If you installed for all users, uninstall Anaconda, re-install it for
your user only, and try again.
Click Next.
Read the licensing terms and click “I Agree”.
Select an install for “Just Me” unless you’re installing for all users (which requires
Windows Administrator privileges) and click Next.
Select a destination folder to install Anaconda and click the Next button.

• Choose whether to add Anaconda to your PATH environment variable. We
recommend not adding Anaconda to the PATH environment variable, since
this can interfere with other software. Instead, use Anaconda software by
opening Anaconda Navigator or the Anaconda Prompt from the Start
Menu.
• Choose whether to register Anaconda as your default Python. Unless you
plan on installing and running multiple versions of Anaconda or multiple
versions of Python, accept the default and leave this box checked.
• Click the Install button. If you want to watch the packages Anaconda is
installing, click Show Details.
• Click the Next button.
• Optional: To install PyCharm for Anaconda, click on the link to
[Link]
• Or to install Anaconda without PyCharm, click the Next button.
• After a successful installation you will see the “Thanks for installing
Anaconda” dialog box:
• If you wish to read more about [Link] and how to get started with
Anaconda, check the boxes “Anaconda Individual Edition Tutorial” and
“Learn more about Anaconda”.
• Click the Finish button.
Anaconda Navigator
Create New Notebook Document on Jupyter
Spyder
Python Programming Basics

Experiment No. 2
Q. Add Two Numbers with “+” Operator:
Here num1 and num2 are variables and we are going to add both variables
with the + operator in Python.

# Python 3 program to add two numbers
num1 = 15
num2 = 12

# Add the two numbers
sum = num1 + num2

# Print the values
print("Sum of", num1, "and", num2, "is", sum)

Output:
Sum of 15 and 12 is 27

Experiment No. 3
Q. Add Two Numbers with User Input:
In the program below, the user is first asked to enter two numbers. Each
input is read with the Python input() function and stored in the variables
number1 and number2.
The variables number1 and number2 are then converted to float and added
using the arithmetic operator +, and the result is stored in the variable sum.

# Python 3 program to add two numbers
number1 = input("First number: ")
number2 = input("\n Second number: ")

# Add the two numbers; float() lets the user enter decimal values
sum = float(number1) + float(number2)

# Display the sum (printed as a float)
print("The sum of {0} and {1} is {2}".format(number1, number2, sum))

Output:
First number: 13.5 Second number: 1.54
The sum of 13.5 and 1.54 is 15.04

Experiment No. 4
Q. Add Two Numbers Using operator.add() Method:
Initialize two variables, num1 and num2. Find the sum with operator.add() by
passing num1 and num2 as arguments, and assign the result to sum. Display
num1, num2 and sum.

# Python 3 program to add two numbers
import operator

num1 = 15
num2 = 12

# Add the two numbers
sum = operator.add(num1, num2)

# Print the values
print("Sum of {0} and {1} is {2}".format(num1, num2, sum))

Output
Sum of 15 and 12 is 27
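
One reason to prefer operator.add() over the + operator is that it is an ordinary function, so it can be passed to other functions. The short sketch below is an illustration only; the list values and the names numbers, total and pairs are assumptions and are not part of the experiment.

```python
import operator
from functools import reduce

numbers = [15, 12, 3]  # hypothetical sample values

# reduce() applies operator.add cumulatively: (15 + 12) + 3
total = reduce(operator.add, numbers)
print(total)  # 30

# map() applies operator.add to matching elements of two lists
pairs = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
print(pairs)  # [11, 22, 33]
```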
Python Data Structures

Experiment No. 5
Q. Add Two Numbers in Python Using Function
This program shows how to add two numbers in Python using a function. We
define a function that accepts two integers and returns their sum.

# Define a function that takes two integers
# and returns the sum of those two numbers
def add(a, b):
    return a + b

# Initialize the variables
num1 = 10
num2 = 5

# Call the function and store the result in sum_of_twonumbers
sum_of_twonumbers = add(num1, num2)

# Print the result
print("Sum of {0} and {1} is {2};".format(num1, num2, sum_of_twonumbers))

Output
Sum of 10 and 5 is 15;

Experiment No. 6
Q. Write a program in Python to make a simple calculator.

# Python program for a simple calculator

# Function to add two numbers
def add(num1, num2):
    return num1 + num2

# Function to subtract two numbers
def subtract(num1, num2):
    return num1 - num2

# Function to multiply two numbers
def multiply(num1, num2):
    return num1 * num2

# Function to divide two numbers
def divide(num1, num2):
    return num1 / num2

# Show the menu of available operations
print("Please select operation -\n"
      "1. Add\n"
      "2. Subtract\n"
      "3. Multiply\n"
      "4. Divide\n")

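The listing above stops after printing the menu. One possible way to turn a menu choice into a result is sketched below. It is only an assumption about how the program might continue: the names OPERATIONS and calculate are hypothetical, and the standard operator module stands in for the four helper functions so that the sketch is self-contained.

```python
import operator

# Map each menu choice to (symbol, function). The operator functions
# behave like the add/subtract/multiply/divide helpers in the listing.
OPERATIONS = {
    "1": ("+", operator.add),
    "2": ("-", operator.sub),
    "3": ("*", operator.mul),
    "4": ("/", operator.truediv),
}

def calculate(choice, num1, num2):
    """Return the result for a menu choice, or None if it cannot be computed."""
    if choice not in OPERATIONS:
        return None  # unknown menu entry
    symbol, func = OPERATIONS[choice]
    if symbol == "/" and num2 == 0:
        return None  # avoid ZeroDivisionError
    return func(num1, num2)

print(calculate("1", 8, 2))  # 10
print(calculate("4", 8, 2))  # 4.0
print(calculate("4", 8, 0))  # None
```

In an interactive version, the choice and the two numbers would come from input(), with the numbers converted by float() as in Experiment No. 3.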
Experiment No. 7

Q. Create a Python program to find sum of elements in list (using for loop)

total = 0

# Create a list
list1 = [11, 5, 17, 18, 23]

# Iterate over each index of the list
# and add the element to the variable total
for ele in range(0, len(list1)):
    total = total + list1[ele]

# Print the total
print("Sum of all elements in given list: ", total)

Output
Sum of all elements in given list: 74
Experiment No. 8

Q. Write a Python program to find sum of elements in list (using while loop)

total = 0
ele = 0

# Create a list
list1 = [11, 5, 17, 18, 23]

# Iterate over each index of the list
# and add the element to the variable total
while ele < len(list1):
    total = total + list1[ele]
    ele += 1

# Print the total
print("Sum of all elements in given list: ", total)

Output
Sum of all elements in given list: 74
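
Both loops above walk the list by index. As a comparison, the sketch below iterates over the elements directly and checks the result against Python's built-in sum() function. It reuses the same list; the variable name value is an assumption made for this illustration.

```python
list1 = [11, 5, 17, 18, 23]

# Iterate over the elements themselves instead of over their indexes
total = 0
for value in list1:
    total += value

# The built-in sum() gives the same result in a single call
assert total == sum(list1)
print("Sum of all elements in given list:", total)
```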

Experiment No. 9
Q. Write a Python Program to count the number of vowels in a string.
# input() replaces the Python 2 raw_input() function
string = input("Enter string:")
vowels = 0
for i in string:
    if (i == 'a' or i == 'e' or i == 'i' or i == 'o' or i == 'u' or
            i == 'A' or i == 'E' or i == 'I' or i == 'O' or i == 'U'):
        vowels = vowels + 1
print("Number of vowels are:")
print(vowels)
Output:
Case 1:
Enter string:Hello world
Number of vowels are:
3

Case 2:
Enter string:WELCOME
Number of vowels are:
3
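
The long chain of comparisons can also be written as a membership test. The following is a minimal sketch of that idea, not the manual's own listing; the names VOWELS and count_vowels are hypothetical.

```python
VOWELS = set("aeiouAEIOU")

def count_vowels(text):
    """Count the characters of `text` that are vowels (either case)."""
    return sum(1 for ch in text if ch in VOWELS)

print(count_vowels("Hello world"))  # 3
print(count_vowels("WELCOME"))      # 3
```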

Experiment No. 10
Q. Create a Python program and check how many dimensions the
arrays have
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

# ndim holds the number of dimensions of each array
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

Output:
0
1
2
3
Experiment No. 11
Q. Create an array with 5 dimensions using ndmin, from a vector with
values 1, 2, 3, 4, and verify that the last dimension has size 4
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('shape of array :', arr.shape)
Output:
[[[[[1 2 3 4]]]]]
shape of array : (1, 1, 1, 1, 4)

Experiment No. 12
Q. Convert the following 1-D array with 12 elements into a 2-D array.
The outermost dimension will have 4 arrays, each with 3 elements.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(4, 3)

print(newarr)

Output:
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
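
A reshape only works when the new shape holds exactly as many elements as the original array (here 4 × 3 = 12). The sketch below illustrates this under the same data; the variable name auto is an assumption. Passing -1 for one dimension asks NumPy to work out that size itself.

```python
import numpy as np

arr = np.arange(1, 13)       # the values 1..12

# -1 lets NumPy infer the second dimension (12 / 4 = 3)
auto = arr.reshape(4, -1)
print(auto.shape)

# A shape with the wrong number of slots raises ValueError
try:
    arr.reshape(5, 3)        # 15 slots for 12 elements
except ValueError as err:
    print("reshape failed:", err)
```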
Experiment No. 13
Q. Write Python code to demonstrate trigonometric functions.

import numpy as np

# Create an array of angles in degrees
angles = np.array([0, 30, 45, 60, 90, 180])

# Convert degrees to radians using the deg2rad function
radians = np.deg2rad(angles)

# Sine of the angles
print('Sine of angles in the array:')
sine_value = np.sin(radians)
print(sine_value)

# Inverse sine of the sine values, converted back to degrees
print('\n\n Inverse Sine of sine values:')
print(np.rad2deg(np.arcsin(sine_value)))

# Hyperbolic sine of the angles
print('\n\n Sine hyperbolic of angles in the array: ')
sineh_value = np.sinh(radians)
print(sineh_value)

# Inverse hyperbolic sine; this recovers the angles in radians
print('\n\n Inverse Sine hyperbolic:')
print(np.arcsinh(sineh_value))

# hypot function demonstration
base = 4
height = 3
print('\n\n hypotenuse of right triangle is:')
print(np.hypot(base, height))

Output:
Sine of angles in the array:
[0.00000000e+00 5.00000000e-01 7.07106781e-01 8.66025404e-01
 1.00000000e+00 1.22464680e-16]

Inverse Sine of sine values:
[0.0000000e+00 3.0000000e+01 4.5000000e+01 6.0000000e+01 9.0000000e+01
 7.0167093e-15]

Sine hyperbolic of angles in the array:
[ 0.          0.54785347  0.86867096  1.24936705  2.3012989  11.54873936]

Inverse Sine hyperbolic:
[0.         0.52359878 0.78539816 1.04719755 1.57079633 3.14159265]

hypotenuse of right triangle is:
5.0
Experiment No. 14
Q. Write Python code to demonstrate statistical functions.

import numpy as np

# Construct a weight array
weight = np.array([50.7, 52.5, 50, 58, 55.63, 73.25, 49.5, 45])

# Minimum and maximum
print('Minimum and maximum weight of the students: ')
print(np.min(weight), np.max(weight))

# Range of weight, i.e. max weight - min weight
print('\n Range of the weight of the students: ')
print(np.ptp(weight))

# Percentile
print('\n Weight below which 70 % student fall: ')
print(np.percentile(weight, 70))

# Mean
print('\n Mean weight of the students: ')
print(np.mean(weight))

# Median
print('\n Median weight of the students: ')
print(np.median(weight))

# Standard deviation
print('\n Standard deviation of weight of the students: ')
print(np.std(weight))

# Variance
print('\n Variance of weight of the students: ')
print(np.var(weight))

# Average
print('\n Average weight of the students: ')
print(np.average(weight))

Output:
Minimum and maximum weight of the students:
45.0 73.25

Range of the weight of the students:
28.25

Weight below which 70 % student fall:
55.317

Mean weight of the students:
54.3225

Median weight of the students:
51.6

Standard deviation of weight of the students:
8.052773978574091

Variance of weight of the students:
64.84716875

Average weight of the students:
54.3225
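
The standard deviation and variance printed above are the population values: by default NumPy divides by the number of elements n. If the eight weights are treated as a sample from a larger group, the ddof=1 argument makes NumPy divide by n - 1 instead. The sketch below illustrates the difference on the same data; the names pop_var and sample_var are assumptions.

```python
import numpy as np

weight = np.array([50.7, 52.5, 50, 58, 55.63, 73.25, 49.5, 45])
n = weight.size

pop_var = np.var(weight)             # divides by n (ddof=0, the default)
sample_var = np.var(weight, ddof=1)  # divides by n - 1

print(pop_var, sample_var)
print(np.std(weight), np.std(weight, ddof=1))
```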

Experiment No. 15
Q. (a) Creating a Series from an ndarray
import numpy as np
import pandas as pd

data = np.array(['a', 'b', 'c', 'd'])
print(data)
s_arr = pd.Series(data)
print(s_arr)

Output:
['a' 'b' 'c' 'd']
0 a
1 b
2 c
3 d
dtype: object

(b) Creating a Series from a dict
data = {'a': 0., 'bat': 1., 10: 2.3}
s_dict = pd.Series(data)
print(s_dict)

Output:
a 0.0
bat 1.0
10 2.3
dtype: float64

(c) Changing the index
# Labels missing from the dict (here 2) get the value NaN
s_change = pd.Series(data, ['bat', 2, 10])
print(s_change)

Output:
bat 1.0
2 NaN
10 2.3
dtype: float64

Experiment No. 16
Q. Create a DataFrame in Python using Pandas:
import pandas as pd

data = {'state': ['Ohio', 'Delhi', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
        'year': [2000, 2003, 2001, 2000, 2001, 2002],
        'pop': [1.5, 1.6, 5, 3.9, 4, 6.7]}

# Build a DataFrame with the default integer index
frame = pd.DataFrame(data)
print(frame)

# Build it again with a custom index and column order
frame = pd.DataFrame(data, index=[10, 11, 12, 13, 14, 15],
                     columns=['year', 'state', 'pop'])
frame
Output:
    state  year  pop
0    Ohio  2000  1.5
1   Delhi  2003  1.6
2    Ohio  2001  5.0
3  Nevada  2000  3.9
4  Nevada  2001  4.0
5  Nevada  2002  6.7

    year   state  pop
10  2000    Ohio  1.5
11  2003   Delhi  1.6
12  2001    Ohio  5.0
13  2000  Nevada  3.9
14  2001  Nevada  4.0
15  2002  Nevada  6.7

Experiment No. 17

Q. (a) Reorder the existing data to match a new set of labels.
(b) Insert missing value (NA) markers in label locations where
no data for the label existed.

import pandas as pd

# Create the DataFrame
info = pd.DataFrame({"P": [4, 7, 1, 8, 9],
                     "Q": [6, 8, 10, 15, 11],
                     "R": [17, 13, 12, 16, 14],
                     "S": [15, 19, 7, 21, 9]},
                    index=["Parker", "William", "Smith", "Terry", "Phill"])
# Display the DataFrame
info
Output:

# Reindex the columns (axis=1) with new labels;
# labels with no existing data (B, D, E) become NaN columns
info1 = info.reindex(["P", "B", "R", "D", "E"], axis=1)
info1
Output:

# Reindex the rows and fill the missing values with 100;
# none of these row labels exist, so every cell is 100
info.reindex(["A", "B", "C", "D", "E", 'F'], fill_value=100)
Output:
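
The output screenshots for this experiment are not reproduced here. As a small stand-in, the sketch below shows both behaviours of reindex on a two-row frame: reordering existing labels and marking labels that have no data. The frame df and the labels "Smith" and "Z" are hypothetical and smaller than the manual's data.

```python
import pandas as pd

# Small hypothetical frame (not the manual's full data)
df = pd.DataFrame({"P": [4, 7], "Q": [6, 8]}, index=["Parker", "William"])

# Rows: existing labels are reordered, the unknown label "Smith" gets NaN
rows = df.reindex(["William", "Parker", "Smith"])
print(rows)

# Columns: the unknown column "Z" is filled with 100 instead of NaN
cols = df.reindex(["Q", "P", "Z"], axis=1, fill_value=100)
print(cols)
```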
Experiment No. 18

import pandas as pd
import numpy as np

N = 5
df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=(N)).tolist()
})
df

Output:
Experiment No. 19

Matplotlib Inline
The output of plotting commands is displayed inline within frontends like the Jupyter
notebook, directly below the code cell that produced it. The resulting plots will then also
be stored in the notebook document. %matplotlib inline should be the first
command, placed before the import statements.

#%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

data = np.arange(10)
print(data)
plt.plot(data)
plt.show()

Output:
[0 1 2 3 4 5 6 7 8 9]

x = np.arange(0.0, 6.0, 0.01)
plt.plot(x, [v**2 for v in x]);
#plt.show()

Multiline Plot

Plots are reset after each cell is evaluated, so for more complex plots you must put all of
the plotting commands in a single notebook cell.

x = np.arange(0.0, 6.0, 0.01)

plt.plot(x, [v**2 for v in x])
plt.plot(x, [v**2.25 for v in x])
plt.plot(x, [v**2.5 for v in x]);
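
The listings above build each curve with a list comprehension. Because x is a NumPy array, the same values can be computed in one vectorized expression, in line with Experiment 4 of the contents (Vectorized Computation). The sketch below only checks that the two approaches agree; the names y_vec and y_loop are assumptions.

```python
import numpy as np

x = np.arange(0.0, 6.0, 0.01)

# Vectorized: the power is applied to every element at once
y_vec = x ** 2

# Equivalent list comprehension, as used in the listings above
y_loop = [v ** 2 for v in x]

print(np.allclose(y_vec, y_loop))
```

Either result can be passed as the second argument of the plot call.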
Experiment No. 20
Seaborn Session
import numpy as np
from matplotlib import pyplot as plt

def sinplot(flip=0.5):
    x = np.linspace(0, 14, 100)
    for i in range(1, 5):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)

# Default Matplotlib style
sinplot()
plt.show()

# Seaborn "dark" style
import seaborn as sns
sns.set_style("dark")
sinplot()
plt.show()

# Seaborn "white" style
sns.set_style("white")
sinplot()
[Link]()
plt.show()
Qualitative (categorical) palettes are best suited to plotting categorical data.

current_palette = sns.color_palette()
sns.palplot(current_palette)
plt.show()

Appending the character ‘s’ to the color name passed to color_palette() (for example,
"Reds") plots a sequential palette.
current_palette = sns.color_palette()
sns.palplot(sns.color_palette("Reds"))
plt.show()

Diverging palettes use two different colors. Each color represents variation in the value
ranging from a common point in either direction.

Assume we are plotting data ranging from -1 to 1. The values from -1 to 0 take one color
and the values from 0 to +1 take another color.

By default, the values are centered at zero. You can control this with the parameter center
by passing a value.

current_palette = sns.color_palette()
sns.palplot(sns.color_palette("BrBG", 7))
plt.show()
Project-1
Create a Chessboard with Python
To create a chessboard with the Python programming language, we will use two
Python libraries: Matplotlib for visualization, and NumPy for building the algorithm
that creates the chessboard. Let’s see how we can write the code
to create and visualize a chessboard:

Source Code
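
The project's source code is not reproduced in this copy. The following is a minimal sketch of one way to build and draw the board with NumPy and Matplotlib, not the original listing; the function names make_board and show_board, and the choice of 1 for light squares, are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

def make_board(n=8):
    """Return an n x n array in a checkerboard pattern (1 = light, 0 = dark)."""
    rows, cols = np.indices((n, n))
    # Squares whose row + column is even are light, the others dark
    return 1 - (rows + cols) % 2

def show_board(board):
    """Draw the board; the gray colormap maps 0 to black and 1 to white."""
    fig, ax = plt.subplots()
    ax.imshow(board, cmap="gray")
    ax.set_xticks([])
    ax.set_yticks([])
    plt.show()

board = make_board()
print(board[:2])  # first two rows of the pattern
```

Calling show_board(board) in a notebook cell or script then displays the 8×8 board.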
Project-2
Create a Quiz Game with Python

We will create a quiz game with Python; here it is an animal quiz. Even
though the questions are about animals, this quiz can easily be changed to cover
any other topic.
The quiz game asks the player questions about animals. The player has three chances
to answer each question, so that the quiz is not too difficult. Each correct
answer scores a point. At the end of the game, the program reveals the
player’s final score.
This quiz game uses a function: a block of code with a name that performs a
specific task. A function allows you to use the same code several times without
having to type everything each time. Python has a lot of built-in functions, but it
also allows you to create your own functions.
The program should keep checking whether there are any questions left to ask and
whether the player has exhausted all of their chances. The score is stored in a
variable during the game. Once all the questions have been answered, the game ends.
Source Code

Let’s Create the Quiz Game with Python
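
The project's source code is not reproduced in this copy. The sketch below shows one way the description above could be implemented: three chances per question, one point per correct answer, and a final score. It is an illustration under assumptions, not the original program; the function names, the sample questions, and the ask parameter (which defaults to input and is replaced by scripted answers here so the example runs without typing) are all hypothetical.

```python
def check_guess(guess, answer):
    """Compare the player's guess with the answer, ignoring case and spaces."""
    return guess.strip().lower() == answer.lower()

def ask_question(question, answer, ask=input, chances=3):
    """Ask one question up to `chances` times; return 1 if answered, else 0."""
    for attempt in range(chances):
        if check_guess(ask(question + " "), answer):
            print("Correct answer")
            return 1
        if attempt < chances - 1:
            print("Sorry, wrong answer. Try again.")
    print("The correct answer is", answer)
    return 0

def run_quiz(questions, ask=input):
    """Ask every question in turn and return the final score."""
    score = 0
    for question, answer in questions:
        score += ask_question(question, answer, ask)
    print("Your score is", score, "out of", len(questions))
    return score

QUESTIONS = [  # hypothetical sample questions
    ("Which bear lives at the North Pole?", "polar bear"),
    ("Which is the fastest land animal?", "cheetah"),
]

# Scripted answers stand in for keyboard input in this demonstration
scripted = iter(["penguin", "Polar Bear", "lion", "tiger", "horse"])
score = run_quiz(QUESTIONS, ask=lambda prompt: next(scripted))
```

To play interactively, call run_quiz(QUESTIONS) without the ask argument.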


Project-3
Tic Tac Toe GUI with Python

Here we introduce an advanced Python project: a Tic Tac Toe GUI with
Python. This game is very popular and quite simple. It is a two-player
game played on a board with 3×3 squares.
Each player chooses one of two symbols, usually “X” and “O”. If the first
player chooses “X”, then the second player must play with “O”, and vice versa.
A player marks one of the 3×3 squares with their symbol (“X” or “O”) and
aims to create a straight line horizontally, vertically, or diagonally, with two
intentions:
1. Create a straight line before the opponent to win the game.
2. Prevent the opponent from creating a straight line first.
If no one can create a straight line with their symbol, the game ends in a tie.
So there are only three possible outcomes: one player wins, the opponent (human
or computer) wins, or there is a tie.

There are two basic modes of play in this game: one where both players are human,
and one where one player is a computer.
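
The core rule described above, a straight line of three identical symbols, can be checked without any GUI. The sketch below is a simplified illustration, not the project's source code: the board is assumed to be a list of nine cells holding "X", "O" or a space, the names WIN_LINES and winner are hypothetical, and a tie is only reported once the board is full.

```python
# Index triples that form a straight line on a 3x3 board stored row by row
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def winner(board):
    """Return "X" or "O" if that symbol fills a line, "tie" if full, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    if " " not in board:
        return "tie"
    return None  # game still in progress

board = ["X", "O", "X",
         "O", "X", "O",
         " ", " ", "X"]
print(winner(board))  # X wins on the main diagonal
```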
Output
