Week 5: Numerical Python
Vectorized Array Operations
NumPy vectorization involves performing mathematical operations on entire arrays,
eliminating the need to loop through individual elements.
Example:
import numpy as np
array1 = np.array([1, 2, 3, 4, 5])
number = 10
# number sums up with each array element
result = array1 + number
print(result)
Output
[11 12 13 14 15]
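To make concrete what vectorization replaces, the sketch below computes the same result twice: once with an explicit Python loop and once with a single array expression. The variable names (values, looped, vectorized) are illustrative only.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])
number = 10

# Explicit loop: one Python-level addition per element
looped = np.empty_like(values)
for i in range(values.size):
    looped[i] = values[i] + number

# Vectorized: one expression applied to the whole array
vectorized = values + number

print(looped)                               # [11 12 13 14 15]
print(vectorized)                           # [11 12 13 14 15]
print(np.array_equal(looped, vectorized))   # True
```

Both versions give the same array; the vectorized form is shorter and pushes the loop into NumPy's compiled code.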
Example: Numpy Vectorization to Add Two Arrays Together
import numpy as np
# define two 2D arrays
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[0, 1, 2], [0, 1, 2]])
# add two arrays (vectorization)
array_sum = array1 + array2
print("Sum between two arrays:\n", array_sum)
Output
Sum between two arrays:
[[1 3 5]
[4 6 8]]
Example:
• x = np.array([1, 2, 3]); y = np.array([4, 5, 6])
• print(x+y)
• print(x*y)
• print([Link](x))
• print([Link](x))
Linear Algebra Operations
The Linear Algebra module of NumPy offers various methods to apply linear algebra on any
numpy array. One can find:
rank, determinant, trace, etc. of an array
eigenvalues of matrices
matrix and vector products (dot, inner, outer, etc.), matrix exponentiation
solutions of linear or tensor equations, and much more
Example:
import numpy as np
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
# Trace of matrix A
print("Trace of A:", np.trace(A))
# Determinant of a matrix
print("Determinant of A:", np.linalg.det(A))
# Inverse of matrix A
print("Inverse of A:\n", np.linalg.inv(A))
# Matrix A raised to the power 3
print("Matrix A raised to power 3:\n", np.linalg.matrix_power(A, 3))
Output
Rank of A: 3
Trace of A: 11
Determinant of A: -306.0
Inverse of A:
[[ 0.17647059 -0.00326797 -0.02287582]
[ 0.05882353 -0.13071895 0.08496732]
[-0.11764706 0.1503268 0.05228758]]
Matrix A raised to power 3:
[[336 162 228]
[406 162 469]
[698 702 905]]
Matrix and Vector Products
Using dot() function: It returns the dot product of two arrays. It works with both 1D (vectors)
and 2D (matrices). When used on 2D arrays, it performs matrix multiplication.
For N dimensions it is a sum product over the last axis of a and the second-to-last of b:
dot(a, b)[i, j, k, m] = sum(a[i, j, :] * b[k, :, m])
import numpy as np
# Real scalars
prod = np.dot(5, 4)
print("Dot Product of scalar values:", prod)
# Complex scalars
a = 2 + 3j
b = 4 + 5j
res = np.dot(a, b)
print("Dot Product:", res)
Output
Dot Product of scalar values: 20
Dot Product: (-7+22j)
Explanation:
a = 2 + 3j
b = 4 + 5j
Now, dot product
= 2(4 + 5j) + 3j(4 + 5j)
= 8 + 10j + 12j - 15
= -7 + 22j
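The example above only uses scalars. As a minimal sketch of the 1D and 2D behaviour described earlier, the following applies np.dot() to two small vectors and two small matrices (the arrays u, v, M and N are made up for illustration).

```python
import numpy as np

# 1D arrays: sum of element-wise products
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
vector_dot = np.dot(u, v)
print(vector_dot)          # 1*4 + 2*5 + 3*6 = 32

# 2D arrays: ordinary matrix multiplication
M = np.array([[1, 2], [3, 4]])
N = np.array([[5, 6], [7, 8]])
matrix_dot = np.dot(M, N)
print(matrix_dot)
# [[19 22]
#  [43 50]]
```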
Using vdot() function: This function also returns the dot product, but with one key difference:
it takes the complex conjugate of the first argument before multiplying. This makes it useful
for complex-valued arrays.
import numpy as np
# Complex scalars
a = 2 + 3j
b = 4 + 5j
res = np.vdot(a, b)
print("Dot Product:", res)
Output
Dot Product: (23-2j)
Explanation:
a = 2 + 3j
b = 4 + 5j
As per the method, take the conjugate of a, i.e. 2 - 3j
Now, dot product = 2(4 + 5j) - 3j(4 + 5j)
= 8 + 10j - 12j + 15
= 23 - 2j
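The same difference shows up on complex vectors. The sketch below, using two made-up arrays, compares np.dot() with np.vdot() and checks that vdot() matches conjugating the first argument by hand.

```python
import numpy as np

a = np.array([2 + 3j, 1 - 1j])
b = np.array([4 + 5j, 2 + 2j])

plain = np.dot(a, b)              # no conjugation
conj_first = np.vdot(a, b)        # conjugates a before multiplying
manual = np.dot(np.conj(a), b)    # the same conjugation done explicitly

print(plain)        # (-3+22j)
print(conj_first)   # (23+2j)
print(np.isclose(conj_first, manual))   # True
```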
Common Matrix and Vector Product Functions
The following table lists commonly used NumPy functions for performing various types of
matrix and vector multiplications:
FUNCTION               DESCRIPTION
matmul()               Matrix product of two arrays.
inner()                Inner product of two arrays.
outer()                Compute the outer product of two vectors.
linalg.multi_dot()     Compute the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order.
tensordot()            Compute tensor dot product along specified axes for arrays >= 1-D.
einsum()               Evaluates the Einstein summation convention on the operands.
einsum_path()          Evaluates the lowest cost contraction order for an einsum expression by considering the creation of intermediate arrays.
linalg.matrix_power()  Raise a square matrix to the (integer) power n.
kron()                 Kronecker product of two arrays.
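A few of the functions in the table can be tried on small inputs. This is only an illustrative sketch; the arrays u, v and M are invented for the example.

```python
import numpy as np

u = np.array([1, 2])
v = np.array([3, 4])
M = np.array([[1, 2], [3, 4]])

product = np.matmul(M, M)               # same as M @ M
inner = np.inner(u, v)                  # 1*3 + 2*4
outer = np.outer(u, v)                  # all pairwise products u[i] * v[j]
kron = np.kron(u, v)                    # each element of u times the whole of v
squared = np.linalg.matrix_power(M, 2)  # M multiplied by itself

print(product)   # [[ 7 10]
                 #  [15 22]]
print(inner)     # 11
print(outer)     # [[3 4]
                 #  [6 8]]
print(kron)      # [3 4 6 8]
print(np.array_equal(product, squared))   # True
```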
Solving Equations and Inverting Matrices
Using np.linalg.solve() function: It is used to find the exact solution of a linear system of equations
of the form Ax = b, where:
A is the coefficient matrix
b is the constant matrix
x is the unknown we want to solve for
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([8, 18])
print("Solution of linear equations:", np.linalg.solve(a, b))
Output
Solution of linear equations: [2. 3.]
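One way to check a solution is to substitute it back into Ax = b. The sketch below does that for the same system and also compares it with the result of multiplying by the explicit inverse, which gives the same answer here but is generally the slower and less accurate route.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([8, 18])

x = np.linalg.solve(a, b)

# Substitute the solution back into the system
print(np.allclose(a @ x, b))      # True

# Same answer via the explicit inverse
x_inv = np.linalg.inv(a) @ b
print(np.allclose(x, x_inv))      # True
```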
Using np.linalg.lstsq() function: When a system doesn’t have an exact solution (for example, when it’s overdetermined),
lstsq() finds the best possible fit by minimizing the difference between
predicted and actual values.
import numpy as np
import matplotlib.pyplot as plt
# x co-ordinates
x = np.arange(0, 9)
A = np.array([x, np.ones(9)])
# observed values that follow a roughly linear trend
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]
# obtaining the parameters of regression line
w = np.linalg.lstsq(A.T, y, rcond=None)[0]
# plotting the line
line = w[0]*x + w[1] # regression line
plt.plot(x, line, 'r-')
plt.plot(x, y, 'o')
plt.show()
Output
(Figure: the data points drawn as circles, with the fitted regression line in red.)
Common Functions for Solving and Inverting Matrices
Below are some of the most useful functions to solve, invert or compute pseudoinverses of
matrices:
FUNCTION               DESCRIPTION
linalg.tensorsolve()   Solve the tensor equation a x = b for x.
linalg.inv()           Compute the (multiplicative) inverse of a matrix.
linalg.pinv()          Compute the (Moore-Penrose) pseudo-inverse of a matrix.
linalg.tensorinv()     Compute the ‘inverse’ of an N-dimensional array.
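As a rough illustration of the pseudo-inverse, the sketch below reuses the regression data from the least-squares example and checks that np.linalg.pinv() gives the same line parameters as np.linalg.lstsq(). Building the design matrix with np.column_stack() is a choice made for this sketch.

```python
import numpy as np

x = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Design matrix: 9 equations, 2 unknowns (slope and intercept)
A = np.column_stack([x, np.ones(9)])

w_pinv = np.linalg.pinv(A) @ y                     # pseudo-inverse route
w_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]     # least-squares route

print(np.allclose(w_pinv, w_lstsq))   # True
```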
Special Functions in NumPy Linear Algebra
1. Finding the Determinant - np.linalg.det(): The determinant is a number that can be
calculated from a square matrix. It helps determine whether a matrix is invertible and is often
used in solving systems of linear equations.
import numpy as np
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
print("Determinant of A:", np.linalg.det(A))
Output
Determinant of A: -306.0
2. Finding the Trace - np.trace(): The trace of a matrix is the sum of its diagonal
elements. It’s often used in linear algebra and statistics.
import numpy as np
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
print("Trace of A:", np.trace(A))
Output
Trace of A: 11
Common Special Functions
Here’s a list of other specialized linear algebra functions available in NumPy:
FUNCTION               DESCRIPTION
linalg.norm()          Matrix or vector norm.
linalg.cond()          Compute the condition number of a matrix.
linalg.matrix_rank()   Return matrix rank of array using SVD method.
linalg.cholesky()      Cholesky decomposition.
linalg.qr()            Compute the QR factorization of a matrix.
linalg.svd()           Singular Value Decomposition.
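The decompositions in the table can be checked by multiplying their factors back together. The sketch below does this for the matrix A used earlier; applying Cholesky to A.T @ A is an assumption of this example, chosen because that product is symmetric positive-definite when A has full rank.

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]], dtype=float)

# Euclidean norm of a vector
length = np.linalg.norm(np.array([3.0, 4.0]))
print(length)                                   # 5.0

# QR factorization: A = Q R
Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))                    # True

# Singular Value Decomposition: A = U diag(s) Vt
U, s, Vt = np.linalg.svd(A)
print(np.allclose(U @ np.diag(s) @ Vt, A))      # True

# The default condition number is the ratio of largest to smallest singular value
print(np.isclose(np.linalg.cond(A), s[0] / s[-1]))   # True

# Cholesky factorization of a symmetric positive-definite matrix: S = L L^T
S = A.T @ A
L = np.linalg.cholesky(S)
print(np.allclose(L @ L.T, S))                  # True
```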
Matrix Eigenvalues Functions in NumPy
NumPy provides functions in its linalg (Linear Algebra) module to calculate eigenvalues and
eigenvectors of matrices.
Using np.linalg.eigh() function: It is used for complex Hermitian (conjugate-symmetric) or real symmetric
matrices. It returns two values:
An array of eigenvalues
A matrix of eigenvectors (each column corresponds to one eigenvalue)
import numpy as np
from numpy import linalg as geek
# Creating an array using array function
a = np.array([[1, -2j], [2j, 5]])
print("Array is:\n", a)
# Calculating eigenvalues and eigenvectors using eigh() function
c, d = geek.eigh(a)
print("Eigenvalues are:", c)
print("Eigenvectors are:\n", d)
Output
Array is:
[[ 1.+0.j -0.-2.j]
[ 0.+2.j 5.+0.j]]
Eigenvalues are: [0.17157288 5.82842712]
Eigenvectors are:
[[-0.92387953+0.j -0.38268343+0.j ]
[ 0. +0.38268343j 0. -0.92387953j]]
Using np.linalg.eig() function: It is used for general square matrices (not necessarily symmetric).
It also returns both eigenvalues and eigenvectors.
import numpy as np
from numpy import linalg as geek
# Creating an array using diag function
a = np.diag((1, 2, 3))
print("Array is:\n", a)
# Calculating eigenvalues and eigenvectors using eig() function
c, d = geek.eig(a)
print("Eigenvalues are:", c)
print("Eigenvectors are:\n", d)
Output
Array is:
[[1 0 0]
[0 2 0]
[0 0 3]]
Eigenvalues are: [1. 2. 3.]
Eigenvectors are:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
Common Eigenvalue Functions
This table summarizes various functions related to eigenvalue and eigenvector computations:
FUNCTION                     DESCRIPTION
linalg.eigvals()             Compute the eigenvalues of a general matrix.
linalg.eigvalsh(a[, UPLO])   Compute the eigenvalues of a complex Hermitian or real symmetric matrix.
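An eigenpair can be verified from its definition: for each eigenvalue λ with eigenvector v, the product A v must equal λ v. The sketch below checks this for the Hermitian matrix used with eigh() and confirms that eigvalsh() returns the same eigenvalues without the vectors.

```python
import numpy as np

a = np.array([[1, -2j], [2j, 5]])

values, vectors = np.linalg.eigh(a)

# Each column of `vectors` is the eigenvector for the matching eigenvalue
checks = [np.allclose(a @ v, lam * v) for lam, v in zip(values, vectors.T)]
print(checks)                                   # [True, True]

# eigvalsh() computes only the eigenvalues
only_values = np.linalg.eigvalsh(a)
print(np.allclose(only_values, values))         # True
```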
Pseudo Random number generation in python
Many computer applications need random numbers to be generated. However, none of them
generate an actual random number. Python, like any other programming language, uses a
pseudo-random generator. Python's random generation is based on the Mersenne Twister
algorithm that produces 53-bit precision floats. The technique is fast and thread-safe but not
suitable for cryptographic purposes. Python's standard library contains the random module
which defines various functions for handling randomization.
The following functions manage the generator and produce random integers.
random.seed() − This function initializes the random number generator. When the
random module is imported, the generator is initialized with the help of system time.
To reseed the generator, use any int, str, bytes or bytearray object.
random.getstate() − This function, along with the setstate() function, helps in reproducing
the same random data again and again. The getstate() function returns the internal state
of the random number generator.
random.randrange() − This function generates a random integer within a given range.
It can take three parameters: start, stop and step.
Using 'random.seed()'
The random.seed() method initializes the pseudo-random number generator. The generator
derives every subsequent number from the seed value, so the same seed always produces
the same sequence of numbers.
Example
We can provide an int, str, bytes, or bytearray to reseed the generator, ensuring reproducible
results.
import random
# Seed with a specific value
random.seed(10)
# Always produces the same output when seeded with 10
print(random.random())
random.seed(12)
print(random.random())
Following is the output of the above code −
0.5714025946899135
0.4745706786885481
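Reproducibility can be checked directly by seeding twice with the same value and comparing the sequences. A minimal sketch (the list names first and second are illustrative):

```python
import random

random.seed(10)
first = [random.random() for _ in range(3)]

random.seed(10)   # reseed with the same value
second = [random.random() for _ in range(3)]

print(first == second)   # True: same seed, same sequence
```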
Using 'random.getstate()'
The random.getstate() method is used to retrieve an object capturing the current internal state of
the random number generator. This object can later be passed to the setstate() method to restore
the generator to this state.
Example
In the below example, getstate() returns the current internal state of the generator, which can
be saved to reproduce the same sequence of random numbers later. And the setstate(state)
function restores the saved state.
import random
state = random.getstate()
# Generate some random numbers
print(random.randint(1, 100))
print(random.randint(1, 100))
# Restore the saved state
random.setstate(state)
# Generates the same random numbers again
print(random.randint(1, 100))
print(random.randint(1, 100))
Output
Each run of the above program produces different random numbers, but the last two values
always repeat the first two because the saved state was restored.
70
60
70
60
Using 'random.randrange()'
The random.randrange() method returns a randomly selected element from the given range.
It accepts the parameters start and stop, plus an optional step, and generates a random integer
between the start value and the stop value of the range.
Example
In the example below, random.randrange(start, stop=None) generates a random integer within
a specified range. The start is inclusive, while stop is exclusive. When only one argument is
given, it is treated as the stop value and start defaults to 0.
import random
# Random integer from 0 to 9
print(random.randrange(10))
# Random integer from 10 to 19
print(random.randrange(10, 20))
Output
Each run of the above program produces different random numbers based on the
given start and stop values (range).
15
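The optional third parameter, step, restricts the result to every step-th value of the range. The sketch below draws even numbers below 20; the fixed seed is used only so that repeated runs give the same draws, and the names evens and valid are illustrative.

```python
import random

random.seed(0)   # fixed seed so that repeated runs give the same draws

# Candidates are 0, 2, 4, ..., 18 (start inclusive, stop exclusive, step 2)
evens = [random.randrange(0, 20, 2) for _ in range(5)]
valid = all(0 <= n < 20 and n % 2 == 0 for n in evens)

print(evens)
print(valid)   # True
```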
Random Walks
A random walk is a process in which each step is chosen at random. It describes a path made up
of a sequence of random moves and is often used to model unpredictable movement.
A simple random walk can be one-dimensional, where a particle moves either left or right
without any bias. When the concept is applied in higher dimensions such as 2D, 3D and 4D,
each move is made in a random direction within that space.
Each additional dimension makes the walk harder to analyse and offers more
information on random processes and space searching. The sections below give Python code for
random walks in 1D, 2D and 3D and show how they can be simulated and plotted.
Installation of Required Libraries
1. NumPy
NumPy is a library for numerical computations in Python, useful for handling arrays and
performing mathematical operations.
Syntax
pip install numpy
2. Matplotlib
Matplotlib is a plotting library for creating static, animated, and interactive visualizations
in Python.
Syntax
pip install matplotlib
Implementation of 1D Random Walk
The following code is for 1D random walk implementation in Python −
import numpy as np
import matplotlib.pyplot as plt
def random_walk_1d(steps):
    """Generate a 1D random walk."""
    walk = np.zeros(steps)
    for i in range(1, steps):
        step = np.random.choice([-1, 1])
        walk[i] = walk[i - 1] + step
    return walk
# Number of steps
steps = 1000
walk = random_walk_1d(steps)
# Plot the random walk
plt.figure(figsize=(10, 6))
plt.plot(walk, label='1D Random Walk')
plt.xlabel('Steps')
plt.ylabel('Position')
plt.title('1D Random Walk')
plt.legend()
plt.show()
Output
(Figure: line plot titled "1D Random Walk", with Steps on the x-axis and Position on the y-axis.)
Code Explanation
Imports − numpy for numerical operations. matplotlib.pyplot for plotting the walk.
random_walk_1d Function − Creates a random walk in one dimension.
For each step, randomly decides to move left (-1) or right (+1).
Accumulates these steps to compute the position at each step.
Plotting − Plots the position against the number of steps in order to visualise the random
walk.
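Because each position is simply the running total of the steps, the loop can also be written in vectorized form. The following is a sketch of that alternative, not the method used above; the function name random_walk_1d_vectorized, the seed parameter and the use of np.random.default_rng() are choices made for this example.

```python
import numpy as np

def random_walk_1d_vectorized(steps, seed=None):
    """Generate a 1D random walk without an explicit Python loop."""
    rng = np.random.default_rng(seed)
    # Draw all moves at once: each is -1 (left) or +1 (right)
    moves = rng.choice([-1, 1], size=steps - 1)
    # Start at position 0, then accumulate the moves
    return np.concatenate(([0], np.cumsum(moves)))

walk = random_walk_1d_vectorized(1000, seed=42)
print(walk.shape)   # (1000,)
print(walk[0])      # 0
# Every consecutive pair of positions differs by exactly one unit
print(np.all(np.abs(np.diff(walk)) == 1))   # True
```

The resulting array can be passed to the same plotting code as before.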
Implementation of 2D Random Walk
The following code is for 2D random walk implementation in Python −
import numpy as np
import matplotlib.pyplot as plt
def random_walk_2d(steps):
    """Generate a 2D random walk."""
    positions = np.zeros((steps, 2))
    for i in range(1, steps):
        step = np.random.choice(['up', 'down', 'left', 'right'])
        if step == 'up':
            positions[i] = positions[i - 1] + [0, 1]
        elif step == 'down':
            positions[i] = positions[i - 1] + [0, -1]
        elif step == 'left':
            positions[i] = positions[i - 1] + [-1, 0]
        elif step == 'right':
            positions[i] = positions[i - 1] + [1, 0]
    return positions
# Number of steps
steps = 1000
positions = random_walk_2d(steps)
# Plot the random walk
plt.figure(figsize=(10, 10))
plt.plot(positions[:, 0], positions[:, 1], label='2D Random Walk')
plt.xlabel('X Position')
plt.ylabel('Y Position')
plt.title('2D Random Walk')
plt.legend()
plt.grid(True)
plt.show()
Output
(Figure: path plot titled "2D Random Walk", with X Position and Y Position axes.)
Code Explanation
Moves in one of the four ways − upwards, downwards, leftwards or rightwards. Updates the
position accordingly and records each step.
Implementation of 3D Random Walk
The following code is for 3D random walk implementation in Python −
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def random_walk_3d(steps):
    """Generate a 3D random walk."""
    positions = np.zeros((steps, 3))
    for i in range(1, steps):
        step = np.random.choice(['x+', 'x-', 'y+', 'y-', 'z+', 'z-'])
        if step == 'x+':
            positions[i] = positions[i - 1] + [1, 0, 0]
        elif step == 'x-':
            positions[i] = positions[i - 1] + [-1, 0, 0]
        elif step == 'y+':
            positions[i] = positions[i - 1] + [0, 1, 0]
        elif step == 'y-':
            positions[i] = positions[i - 1] + [0, -1, 0]
        elif step == 'z+':
            positions[i] = positions[i - 1] + [0, 0, 1]
        elif step == 'z-':
            positions[i] = positions[i - 1] + [0, 0, -1]
    return positions
# Number of steps
steps = 1000
positions = random_walk_3d(steps)
# Plot the random walk
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(111, projection='3d')
ax.plot(positions[:, 0], positions[:, 1], positions[:, 2], label='3D Random Walk')
ax.set_xlabel('X Position')
ax.set_ylabel('Y Position')
ax.set_zlabel('Z Position')
ax.set_title('3D Random Walk')
ax.legend()
plt.show()
Output
(Figure: 3D path plot titled "3D Random Walk", with X Position, Y Position and Z Position axes.)
Code Explanation
Moves in one of six directions − x+, x-, y+, y-, z+, or z-. Updates the position in three
dimensions.
Plotting − 2D random walk is plotted with matplotlib. 3D random walk is plotted using
mpl_toolkits.mplot3d for three-dimensional visualization.
Real-World Applications
1. In computer networks, random walks can model the number of transmission packets
buffered at a server.
2. In population genetics, a random walk describes the statistical properties of genetic
drift.
3. In image segmentation, random walks are used to determine the labels (i.e., "object" or
"background") to associate with each pixel.
4. In brain research, random walks and reinforced random walks are used to model
cascades of neuron firing in the brain.
5. Random walks have also been used to sample massive online graphs such as online
social networks.
Pandas
Pandas, the Python Data Analysis Library, is another important tool in data science and
provides utilities that help in data analysis.
pandas is a fast, powerful, flexible and easy to use open source data analysis and
manipulation tool, built on top of the Python programming language.
Pandas is built on top of NumPy. It also provides a huge number of functions and can be
accessed by installing the pandas library.
Python Pandas Data Structures
Data structures in Pandas are designed to handle data efficiently. They allow for the
organization, storage, and modification of data in a way that optimizes memory usage and
computational performance. Python Pandas library provides two primary data structures for
handling and analyzing data −
Series
DataFrame
In general programming, the term "data structure" refers to the method of collecting,
organizing, and storing data to enable efficient access and modification. Data structures are
collections of data types that provide the best way of organizing items (values) in terms of
memory usage.
Pandas is built on top of NumPy and integrates well within a scientific computing environment
with many other third-party libraries. This tutorial will provide a detailed introduction to these
data structures.
Dimension and Description of Pandas Data Structures
Data Structure   Dimensions   Description
Series           1            A one-dimensional labeled homogeneous array, size-immutable.
DataFrame        2            A two-dimensional labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Working with two or more dimensional arrays can be complex and time-consuming, as users
need to carefully consider the data's orientation when writing functions. However, Pandas
simplifies this process by reducing the mental effort required. For example, when dealing with
tabular data (DataFrame), it is easier to think in terms of rows and columns instead of axis
0 and axis 1.
Mutability of Pandas Data Structures
All Pandas data structures are value mutable, meaning their contents can be changed. However,
their size mutability varies −
Series − Size immutable.
DataFrame − Size mutable.
Series
A Series is a one-dimensional labeled array that can hold any data type. It can store integers,
strings, floating-point numbers, etc. Each value in a Series is associated with a label (index),
which can be an integer or a string.
Name Steve
Age 35
Gender Male
Rating 3.5
Example
Consider the following Series, in which each value is stored as a string and is associated with a label −
import pandas as pd
data = ['Steve', '35', 'Male', '3.5']
series = pd.Series(data, index=['Name', 'Age', 'Gender', 'Rating'])
print(series)
On executing the above program, you will get the following output −
Name Steve
Age 35
Gender Male
Rating 3.5
dtype: object
Key Points
Following are the key points related to the Pandas Series.
Homogeneous data
Size Immutable
Values of Data Mutable
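The key points can be illustrated with a short sketch that reads values by label and by position and then changes one value in place. It rebuilds the same Series as above; the replacement value '36' is invented for the example.

```python
import pandas as pd

series = pd.Series(['Steve', '35', 'Male', '3.5'],
                   index=['Name', 'Age', 'Gender', 'Rating'])

print(series['Name'])    # Steve   (access by label)
print(series.iloc[1])    # 35      (access by position)

# Values are mutable: an existing entry can be overwritten in place
series['Age'] = '36'
print(series['Age'])     # 36
print(len(series))       # 4       (the number of entries is unchanged)
```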
DataFrame
A DataFrame is a two-dimensional labeled data structure with columns that can hold different
data types. It is similar to a table in a database or a spreadsheet. Consider the following data
representing the performance rating of a sales team −
Name Age Gender Rating
Steve 32 Male 3.45
Lia 28 Female 4.6
Vin 45 Male 3.9
Katie 38 Female 2.78
Example
The above tabular data can be represented in a DataFrame as follows −
import pandas as pd
# Data represented as a dictionary
data = {
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78]
}
# Creating the DataFrame
df = pd.DataFrame(data)
# Display the DataFrame
print(df)
Output
On executing the above code you will get the following output −
Name Age Gender Rating
0 Steve 32 Male 3.45
1 Lia 28 Female 4.60
2 Vin 45 Male 3.90
3 Katie 38 Female 2.78
Key Points
Following are the key points related to the Pandas DataFrame −
Heterogeneous data
Size Mutable
Data Mutable
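As a small sketch of these points, the example below adds a column to a DataFrame (size mutable) and overwrites a single cell (data mutable). The 'Senior' column and the age threshold of 35 are invented purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
    'Rating': [3.45, 4.6, 3.9, 2.78],
})
print(df.shape)                  # (4, 3)

# Size mutable: add a new column derived from an existing one
df['Senior'] = df['Age'] > 35
print(df.shape)                  # (4, 4)

# Data mutable: overwrite a single cell by row label and column name
df.loc[0, 'Rating'] = 3.5
print(df.loc[0, 'Rating'])       # 3.5

# Heterogeneous: each column keeps its own data type
print(df.dtypes)
```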
Purpose of Using More Than One Data Structure
Pandas data structures are flexible containers for lower-dimensional data. For instance, a
DataFrame is a container for Series, and a Series is a container for scalars. This flexibility
allows for efficient data manipulation and storage.
Building and handling multi-dimensional arrays can be tedious and requires careful
consideration of the data's orientation when writing functions. Pandas reduces this mental effort
by providing intuitive data structures.
Example
Following example represents a Series within a DataFrame.
import pandas as pd
# Data represented as a dictionary
data = {
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78]
}
# Creating the DataFrame
df = pd.DataFrame(data)
# Display a Series within a DataFrame
print(df['Name'])
Output
On executing the above code you will get the following output −
0 Steve
1 Lia
2 Vin
3 Katie
Name: Name, dtype: object
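The container relationship can be confirmed by checking types: selecting one column or one row of a DataFrame yields a Series, and selecting a single cell yields a scalar. A minimal sketch on a reduced version of the same data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
})

column = df['Name']      # one column of the DataFrame
row = df.loc[1]          # one row of the DataFrame
cell = df.loc[1, 'Age']  # one scalar inside that row

print(type(column).__name__)   # Series
print(type(row).__name__)      # Series
print(cell)                    # 28
```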