Python Programming Basics for Data Analysis

Placement readiness ppt

Uploaded by

adityasing604
Copyright
© © All Rights Reserved

Title: Python Programming and Data Analysis Fundamentals (UNIT-1)
Introduction to Python
High-level, Interpreted Language:
• High-level: Python code is written in a human-readable format,
abstracting away many low-level details of the computer's hardware.
This makes it easier to learn and write code compared to lower-level
languages like C or assembly.
• Interpreted: Python source is run by an interpreter rather than being
compiled ahead of time into a standalone machine-code executable. (The
standard CPython interpreter first compiles the source to bytecode and
then executes that bytecode.) This shortens the edit-and-run cycle and
makes debugging easier.
Versatility:
• Data Science: Python excels in data analysis, manipulation, and
visualization with libraries like NumPy, Pandas, and Matplotlib.
• Machine Learning: Powerful libraries like scikit-learn, TensorFlow,
and PyTorch enable building and deploying various machine learning
models.
• Web Development: Frameworks like Django and Flask facilitate the
creation of dynamic and interactive web applications.
• Scripting and Automation: Python's simplicity makes it ideal for
automating tasks, scripting system administration, and creating small
utility programs.
• Scientific Computing: Libraries like SciPy provide tools for numerical
computing, optimization, and scientific simulations.
Learning Objectives:
• Understand the fundamental concepts of Python programming: Data
types, operators, control flow, functions, and object-oriented
programming.
• Learn to use NumPy for efficient numerical computations and array
operations.
• Gain knowledge of linear algebra concepts and their implementation in
Python.
• Develop basic programming skills for data analysis and manipulation.
• Prepare a strong foundation for further exploration in data science and
machine learning.
Python Data Types and Operators
Data Types
• Numeric Types:
• int: Represents whole numbers (e.g., 10, -5, 0)
• float: Represents numbers with decimal points (e.g., 3.14, -2.5,
0.0)
• complex: Represents complex numbers (e.g., 2+3j)
• Sequence Types:
• str: Represents a sequence of characters (e.g., "Hello", 'Python')
• list: An ordered, mutable collection of items (e.g., [1, 2, 3],
["apple", "banana"])
• tuple: An ordered, immutable collection of items (e.g., (1, 2, 3),
("apple", "banana"))
• Mapping Type:
• dict: A collection of key-value pairs (e.g., {"name": "Alice",
"age": 30})
• Set Types:
• set: An unordered collection of unique items (e.g., {1, 2, 3})
• frozenset: An immutable set (e.g., frozenset({1, 2, 3}))
• Boolean Type:
• bool: Represents truth values (True or False)
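As a quick illustration of the data types listed above, the following sketch checks the type of one literal from each category and contrasts a mutable list with an immutable tuple. The variable names (samples, type_names, items, point) are chosen here for illustration only.

```python
# Inspect the type of one literal from each category listed above.
samples = [10, 3.14, 2 + 3j, "Hello", [1, 2, 3], (1, 2, 3),
           {"name": "Alice", "age": 30}, {1, 2, 3},
           frozenset({1, 2, 3}), True]
type_names = [type(value).__name__ for value in samples]
for value, name in zip(samples, type_names):
    print(f"{value!r:35} -> {name}")

# Lists are mutable: an element can be replaced in place.
items = [1, 2, 3]
items[0] = 99

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False
print(items, tuple_is_mutable)
```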
Operators
• Arithmetic Operators:
• + (addition)
• - (subtraction)
• * (multiplication)
• / (division)
• // (floor division)
• % (modulo - returns the remainder)
• ** (exponentiation)
• Comparison Operators:
• == (equal to)
• != (not equal to)
• > (greater than)
• < (less than)
• >= (greater than or equal to)
• <= (less than or equal to)
• Logical Operators:
• and (returns True if both operands are True)
• or (returns True if at least one operand is True)
• not (returns the opposite boolean value)
• Assignment Operators:
• = (simple assignment)
• += (add and assign)
• -= (subtract and assign)
• *= (multiply and assign)
• /= (divide and assign)
• //= (floor divide and assign)
• %= (modulo and assign)
• **= (exponentiate and assign)
• Bitwise Operators:
• & (bitwise AND)
• | (bitwise OR)
• ^ (bitwise XOR)
• ~ (bitwise NOT)
• << (left shift)
• >> (right shift)
• Membership Operators:
• in (returns True if a value is found in a sequence)
• not in (returns True if a value is not found in a sequence)
• Identity Operators:
• is (returns True if both operands refer to the same object)
• is not (returns True if both operands refer to different objects)
Code:
# Arithmetic operators
x = 10
y = 3
print(x + y)   # Output: 13
print(x - y)   # Output: 7
print(x * y)   # Output: 30
print(x / y)   # Output: 3.3333333333333335
print(x // y)  # Output: 3
print(x % y)   # Output: 1
print(x ** y)  # Output: 1000

# Comparison operators
print(x > y)   # Output: True
print(x < y)   # Output: False
print(x == y)  # Output: False

# Logical operators
a = True
b = False
print(a and b)  # Output: False
print(a or b)   # Output: True
print(not a)    # Output: False

# Membership operators
my_list = [1, 2, 3]
print(2 in my_list)  # Output: True
print(4 in my_list)  # Output: False
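The assignment, bitwise, and identity operators from the list above can be sketched in the same way. All variable names and values below are illustrative assumptions, not part of the original slides.

```python
# Assignment operators update a variable in place.
total = 10
total += 5     # 15
total //= 2    # 7
total **= 2    # 49

# Bitwise operators act on the binary representation of integers.
a, b = 0b1100, 0b1010     # 12 and 10
bit_and = a & b           # 0b1000 -> 8
bit_or = a | b            # 0b1110 -> 14
bit_xor = a ^ b           # 0b0110 -> 6
bit_not = ~a              # -13
shift_left = a << 1       # 24
shift_right = a >> 2      # 3

# Identity (is) compares objects; equality (==) compares values.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first
same_value = first == second    # True: equal contents
same_object = first is second   # False: two separate lists
alias_is_first = alias is first # True: both names refer to one list

print(total, bit_and, bit_or, bit_xor, bit_not, shift_left, shift_right)
print(same_value, same_object, alias_is_first)
```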
Control Flow: Loops and Conditional Statements
• Control flow is a fundamental concept in programming that allows you
to dictate the order in which statements are executed in your code. In
Python, control flow is managed through the use of conditional
statements (if statements) and loops.
• Conditional statements
• Conditional statements are used to guide a program's flow based on
certain conditions. The main conditional statements include "if," "elif"
(short for "else if"), and "else." These statements allow you to run
different code blocks depending on whether a specified condition is
true or false.
Python If Statement

• The "if" statement is the simplest decision-making construct. It
evaluates a condition: if the condition is true, the indented code block
beneath the "if" statement is executed; if the condition is false, the
block is skipped.
Exp-
x = 10
if x > 5:
    print("x is greater than 5")
Python If-Else Statement

• On its own, the "if" statement runs a block of statements when the
condition is true and does nothing when it is false. To take an
alternative action when the condition is false, pair the "if" with an
"else" statement: the "else" block runs only when the "if" condition is
false.
Exp-
a = 6
if a > 5:
    print('Hello')
else:
    print('Bye')
Python if-elif-else Statement

• The conditions in an if-elif-else chain are checked from top to
bottom. As soon as one condition is true, its code block is executed
and all remaining "elif" and "else" branches are skipped. If none of
the conditions is true, the final "else" block is executed.
• The "elif" statement allows several conditions to be tested in
sequence: when the "if" condition is false, the next "elif" condition
is evaluated, and so on down the chain.
Exp-
a = 5
b = 6
if a == b:
    print("How did that happen?")
elif a > b:
    print("yikes")
else:
    print("All is Well with the world")
Nested-If Statement in Python

• A nested "if" is an "if" statement placed inside the block of another
"if" (or "else") statement. The inner condition is checked only when the
outer condition has already selected that block.
Exp-
num = 15
if num > 0:
    if num % 2 == 0:
        print("num is positive and even")
    else:
        print("num is positive and odd")
else:
    print("num is non positive")
Loops in Python: For Loop
A for loop is a control flow statement found in many programming
languages, including Python, C, and Java. In Python it iterates over a
sequence (a list, tuple, string, or range) and executes a block of code
once for each item. The general form is: for variable in iterable:
• iterable is a sequence (e.g., a list, tuple, string, or range).
• variable takes on the value of each element in the iterable, one
element per iteration.
Exp-
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for num in numbers:
    if num % 2 == 0:
        print("even", num)
    else:
        print("odd", num)
Range Function

• The range() function is often used with for loops to generate a
sequence of numbers. It returns an object that produces the numbers on
demand, based on the specified parameters: range(stop) counts from 0 up
to, but not including, stop, and range(start, stop) counts from start up
to, but not including, stop.
Exp-
for i in range(5):      # prints 0, 1, 2, 3, 4
    print(i)

for i in range(2, 8):   # prints 2, 3, 4, 5, 6, 7
    print(i)
While loop

• A while loop repeatedly executes a block of statements for as long as
a specified condition remains true. The condition is checked before
each iteration; once it becomes false, the program continues with the
first statement after the loop.
Exp-
count = 0
while count < 5:
    print(count)
    count += 1
Functions and Modules
A package contains a collection of modules, and a module contains a
collection of functions. In that sense, functions are grouped inside
modules, and modules are grouped inside packages.
Modules
• A module is simply a Python file with a .py extension that can be
imported inside another Python program.
• The name of the Python file becomes the module name.
• The module contains —
1) definitions and implementation of classes
2) variables and
3) functions that can be used inside another program.
Advantages of modules –
• Reusability : Working with modules makes the code reusable.
• Simplicity: A module focuses on a small portion of the problem,
rather than on the entire problem.
• Scoping: A separate namespace is defined by a module that helps to
avoid collisions between identifiers.
Steps in modules-
1. Creating a Module
A module can hold functions (defined using the def keyword) and
variables.
## Python program to create a module
## Defining a function
def Module():
    print("Hey, I am Module")

# Defining a variable
location = "python"
In this program, a function named "Module" and a variable named
"location" are defined, and the file is saved as [Link], i.e. the
module name followed by the .py extension.
Example:
In this program, we have created 4 functions for addition,
multiplication, subtraction and division, and saved the file as
[Link]
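The listing for this example did not survive in the text, so the following is only a minimal sketch of what such a four-function module might contain. The function names (add, subtract, multiply, divide) and the file name calc.py used in the comments are assumptions made for illustration.

```python
# Contents of a hypothetical module file, e.g. calc.py (name assumed).

def add(a, b):
    """Return the sum of a and b."""
    return a + b

def subtract(a, b):
    """Return a minus b."""
    return a - b

def multiply(a, b):
    """Return the product of a and b."""
    return a * b

def divide(a, b):
    """Return a divided by b; raises ZeroDivisionError if b is 0."""
    return a / b

# In another program, the module would be imported by its file name:
#   import calc
#   print(calc.add(2, 3))
print(add(2, 3), subtract(2, 3), multiply(2, 3), divide(6, 3))
```

Saving these definitions in one file and importing it elsewhere is what makes the code reusable, as described under the advantages of modules.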
Functions

• A function is a block of code which only runs when it is called.
• You can pass data, known as parameters, into a function.
• A function can return data as a result.
1. User-defined Functions :
• Functions that we define ourselves to do a specific task are
referred to as user-defined functions.
• As seen in the above example of the [Link] file, we created our own
functions to perform certain operations.
• Advantages of user-defined functions
1. User-defined functions help to decompose a large program into small
segments, which makes the program easy to understand, maintain and
debug.
2. If repeated code occurs in a program, a function can hold that code
once and execute it whenever needed by calling the function.
2. Built-in Functions :
• Python has several functions that are readily available for use.
These functions are called built-in functions.
• abs() returns the absolute value of a number; for a negative
number, this is the corresponding positive value.
• all() returns True if every value in a Python iterable is truthy
(it also returns True for an empty iterable), otherwise False.
• ascii() returns a printable representation of a Python object
(such as a string or a list), with non-ASCII characters escaped.
• bin() converts an integer to a binary string.
• bytearray() returns a mutable array of bytes, for example one of
a given size.
• compile() returns a Python code object.
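A short sketch of the built-in functions listed above; the input values are arbitrary examples chosen for illustration.

```python
absolute = abs(-7.5)                 # 7.5
all_true = all([1, True, "x"])       # True: every item is truthy
not_all = all([1, 0, 3])             # False: 0 is falsy
ascii_text = ascii("café")           # non-ASCII 'é' is escaped
binary = bin(10)                     # '0b1010'
zero_bytes = bytearray(3)            # three zero bytes

# compile() builds a code object that eval() can then run.
code_obj = compile("2 + 3", "<string>", "eval")
result = eval(code_obj)              # 5

print(absolute, all_true, not_all)
print(ascii_text, binary, zero_bytes, result)
```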
3. Lambda Functions :
• Lambda functions are anonymous functions, that is, functions
defined without a name.
• While normal functions are defined using the def keyword in
Python, anonymous functions are defined using
the lambda keyword.
Use of Lambda Functions in Python:
• They are used when a nameless function is needed for a short
period of time. In Python, a lambda is generally passed as an
argument to a higher-order function (a function that takes in
other functions as arguments).
• Lambda functions are used along with built-in functions like
filter(), map() etc.
• filter(): As the name suggests, it filters an iterable according
to a condition. It calls the given function on each item and keeps
only the items for which the function returns True.
• map(): Applies the given function to every item of the iterable
and returns the results, so the output has one value per input
item; those values may be the same as or different from the
original items.
• Consider a program that tests each element of a list for
divisibility by 2 (its remainder modulo 2 equals 0). The list
built with filter() contains only the numbers that satisfy the
condition, whereas the list built with map() contains a Boolean
for every element: True where the condition holds and False
otherwise.
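The difference between filter() and map() described above can be sketched as follows. The list numbers and the result names are illustrative assumptions, since the original program is not reproduced in the text.

```python
numbers = [1, 2, 3, 4, 5, 6]

# filter() keeps only the items for which the lambda returns True.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# map() applies the lambda to every item and keeps every result,
# so here it yields one Boolean per input element.
is_even = list(map(lambda n: n % 2 == 0, numbers))

# map() can also transform values instead of testing them.
doubled = list(map(lambda n: n * 2, numbers))

print(evens)    # [2, 4, 6]
print(is_even)  # [False, True, False, True, False, True]
print(doubled)  # [2, 4, 6, 8, 10, 12]
```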
4. Recursion Functions

A recursive function is a function defined in terms of itself via
self-referential expressions. This means that the function will
continue to call itself and repeat its behavior until some condition
is met to return a result.
• A classic example of a recursive function is one that finds the
factorial of an integer.
• The factorial of a number is the product of all the integers from 1
to that number. For example, the factorial of 3 (denoted as 3!) is
1*2*3 = 6.
• A function 'factorial' calls itself with a smaller argument each
time until it reaches the base case (the number 1), at which point
the recursion stops and the results are multiplied together.
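A minimal sketch of such a recursive factorial function is shown below. The guard for negative input is an addition made here for safety and is not taken from the slides.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if n <= 1:                    # base case: 0! = 1! = 1
        return 1
    return n * factorial(n - 1)   # recursive case: n! = n * (n-1)!

print(factorial(3))  # 6
print(factorial(5))  # 120
```

Each call waits for the result of the call below it, so factorial(3) expands to 3 * factorial(2), then 3 * 2 * factorial(1), and finally 3 * 2 * 1.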
Linear algebra
• Linear algebra provides a framework for handling and manipulating
data, which is often represented as vectors and matrices. These
mathematical constructs enable efficient computation and provide
insights into the underlying patterns and structures within the data.
• In machine learning, linear algebra operations are used extensively in
various stages, from data preprocessing to model training and
evaluation. For instance, operations such as matrix multiplication,
eigenvalue decomposition, and singular value decomposition are
pivotal in dimensionality reduction techniques like Principal
Component Analysis (PCA). Similarly, the concepts of vector spaces
and linear transformations are integral to understanding neural
networks and optimization algorithms.
• Linear algebra is a field of mathematics that plays a crucial role in
many machine learning algorithms. It helps us represent and
manipulate data, perform operations like transformations, and
understand relationships between data points. In machine learning,
linear algebra enables us to work with datasets efficiently, especially
when dealing with high-dimensional data. Let’s dive deeper into the
fundamental concepts of linear algebra and how they relate to machine
learning.
• Linear algebra deals with vectors, matrices, and linear
transformations. It provides the tools to perform operations on these
mathematical objects, allowing us to understand how data points relate
to each other, how they can be transformed, and how we can use them
in machine learning models.
• For example, when you have a dataset with multiple features (such as
height, weight, or temperature), each of these features can be
represented as a vector. These vectors can then be grouped together
into a matrix, which represents the entire dataset. Linear algebra
allows us to perform various operations on these vectors and matrices,
such as scaling, rotations, and transformations, which are essential in
building machine learning models.
1. Vectors
• A vector is simply an ordered collection of numbers, where each
number represents a value in a particular dimension. You can think of
a vector as a point in space, where each number in the vector
represents a coordinate. Vectors are commonly used in machine
learning to represent data points, where each element in the vector
corresponds to a specific feature of the data point.
• For instance, if you are analyzing weather data, a vector might contain
three values: temperature, pressure, and humidity. This vector can be
visualized as an arrow pointing to a specific point in 3-dimensional
space. Each dimension corresponds to one of the features
(temperature, pressure, or humidity), and the values in the vector
determine the location of the data point in that space.
Example of a 3-dimensional vector representing temperature,
pressure, and humidity:
$$\mathbf{v} = \begin{bmatrix} 25 \\ 1013 \\ 60 \end{bmatrix}$$
Here:
• 25 represents temperature (in degrees Celsius),
• 1013 represents atmospheric pressure (in hPa),
• 60 represents humidity (in %).
2. Matrices
• A matrix is essentially a table of numbers arranged in rows and
columns. Each row in a matrix typically represents a data point, and
each column represents a feature. In machine learning, matrices are
used to represent entire datasets, where the rows are the data points
and the columns are the features (or attributes) of the data.
• For example, if you have three data points (each representing weather
conditions) and three features (temperature, pressure, humidity), the
matrix might look like this:
In this matrix:
• Each row corresponds to a different data point (different weather readings),
• Each column corresponds to a different feature (temperature, pressure, or
humidity).
Matrices allow us to perform a wide variety of operations such as addition,
multiplication, and transformations, which help us manipulate and understand
data in machine learning.
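As a sketch of the rows-as-data-points, columns-as-features layout described above, a small weather dataset can be stored as a NumPy array. The first row reuses the vector example (25, 1013, 60); the other two rows are made-up values for illustration only.

```python
import numpy as np

# Illustrative values only: rows are readings, columns are
# temperature (deg C), pressure (hPa), humidity (%).
weather = np.array([
    [25.0, 1013.0, 60.0],
    [22.0, 1009.0, 72.0],
    [30.0, 1015.0, 45.0],
])

print(weather.shape)           # (3, 3): 3 data points, 3 features
first_reading = weather[0]     # one row = one data point (a vector)
temperatures = weather[:, 0]   # one column = one feature
print(first_reading)
print(temperatures)
```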
3. Scalars
A scalar is just a single number. While vectors and matrices are
collections of numbers, a scalar is a single value. In linear algebra,
scalars are often used to scale vectors or matrices by multiplying
each element by the scalar value.
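For example, if we have a scalar 2 and a vector v with elements
$v_1, \dots, v_n$:
$$2\mathbf{v} = 2 \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} 2 v_1 \\ \vdots \\ 2 v_n \end{bmatrix}$$
This operation scales the vector by multiplying each element by 2,
which can be useful in situations where you want to adjust the
magnitude of a vector.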
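In NumPy, scaling a vector by a scalar is a single multiplication. The vector v below is an assumed example, since the original values are not given in the text.

```python
import numpy as np

v = np.array([1, 2, 3])   # assumed example vector
scaled = 2 * v            # every element is multiplied by 2
print(scaled)             # [2 4 6]
```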
Operations in Linear Algebra
1. Addition and Subtraction
2. Scalar Multiplication
Scalar multiplication involves multiplying each element in a vector
or matrix by a scalar. This operation is useful when you want to
scale a dataset or modify the magnitude of a vector or matrix.
For instance, if we have the matrix B with entries $b_{ij}$ and the
scalar 3, every entry is multiplied by 3:
$$(3B)_{ij} = 3\, b_{ij}$$
Scalar multiplication is a simple yet powerful operation that is used
in many machine learning algorithms, such as gradient descent,
where the learning rate is a scalar applied to the gradient vector.
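The first two operations above (element-wise addition and subtraction, and scalar multiplication) can be sketched with NumPy as follows. The matrices A and B, and the weights, gradient, and learning_rate values in the gradient-descent-style update, are illustrative assumptions.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Addition and subtraction work element by element and require
# both matrices to have the same shape.
matrix_sum = A + B      # [[ 6  8] [10 12]]
matrix_diff = A - B     # [[-4 -4] [-4 -4]]

# Scalar multiplication multiplies every entry by the scalar.
tripled = 3 * B         # [[15 18] [21 24]]

# Gradient-descent-style update: the learning rate is a scalar
# that scales the gradient vector before it is subtracted.
weights = np.array([0.5, -1.0])
gradient = np.array([0.2, -0.4])
learning_rate = 0.1
updated = weights - learning_rate * gradient

print(matrix_sum, matrix_diff, tripled, updated, sep="\n")
```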
3. Dot Product (Vector Multiplication)
• The dot product is an operation between two vectors of the same
length that results in a single scalar. It is calculated by multiplying
corresponding elements from both vectors and summing the results. The
dot product is often used to measure the similarity between two vectors
in machine learning.
For two vectors a and b with n elements each:
$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \dots + a_n b_n$$
• The dot product is commonly used in machine learning tasks such as
measuring similarity in text processing or computing weighted sums of
inputs in neural networks.
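A small sketch of the dot product, using two assumed example vectors. The cosine similarity at the end is one common way the dot product is turned into a similarity measure; it is included here as an illustration rather than as something stated on the slides.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Multiply corresponding elements and sum: 1*4 + 2*5 + 3*6 = 32
dot = np.dot(a, b)
manual = sum(x * y for x, y in zip(a, b))

# Cosine similarity: dot product divided by the vector lengths.
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))

print(dot, manual)   # 32 32
print(cosine)
```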
4. Cross Product (Vector Multiplication for 3D Vectors)
• The cross product is specific to 3D vectors. Unlike the dot product,
which results in a scalar, the cross product results in another vector
that is perpendicular to the two input vectors. This operation is mostly
used in 3D geometry and physics, but it also has applications in certain
areas of machine learning involving 3D data.
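• For vectors a = (a_1, a_2, a_3) and b = (b_1, b_2, b_3):
$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}$$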
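The cross product described above can be checked with NumPy's np.cross. The vectors are assumed examples; the final checks confirm that the result is perpendicular to both inputs (its dot product with each is zero).

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

cross = np.cross(a, b)
print(cross)                 # [-3  6 -3]

# The result is perpendicular to both input vectors.
perp_to_a = np.dot(cross, a)
perp_to_b = np.dot(cross, b)
print(perp_to_a, perp_to_b)  # 0 0
```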
Linear Transformations
• Linear transformations are one of the most important concepts in
linear algebra. A linear transformation is a function that takes a
vector and maps it to another vector in the same or a different space.
These transformations preserve operations like addition and scalar
multiplication. In machine learning, transformations are used to
manipulate and represent data in different ways.
1. Definition and Explanation
• A linear transformation maps vectors from one space to another, and it
can be represented by multiplying a vector by a matrix. For example,
applying a linear transformation to a vector can scale it, rotate it, or
reflect it. A linear transformation always maps the zero vector to the
zero vector.
2. Common Linear Transformations in Machine Learning
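• Translation: Shifts a point by adding a translation vector to its
coordinates. Strictly speaking, translation is an affine rather than a
linear transformation, because it moves the origin; it is listed here
because it is routinely combined with linear maps. In machine learning,
this might be used to shift data points to new locations without
changing their relative distances.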
• Scaling: Resizes vectors by multiplying them by a scalar or scaling
matrix. This is useful when you need to normalize data or adjust the
scale of features.
• Rotation: Rotates vectors around an axis, often used in dimensionality
reduction techniques like PCA to rotate data to align with principal
components.
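The transformations above can be sketched with small 2D matrices. The vectors, scale factors, rotation angle, and shift below are arbitrary illustrative choices; the last lines check the additivity property that defines linearity for the rotation matrix.

```python
import numpy as np

v = np.array([1.0, 0.0])

# Scaling: stretch x by 2 and y by 3.
scale = np.array([[2.0, 0.0],
                  [0.0, 3.0]])

# Rotation by 90 degrees counter-clockwise.
theta = np.pi / 2
rotate = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])

scaled = scale @ v                        # [2. 0.]
rotated = rotate @ v                      # approximately [0. 1.]
translated = v + np.array([3.0, -1.0])    # [4. -1.], a vector addition

# Linearity check: T(u + v) equals T(u) + T(v).
u = np.array([0.5, 2.0])
linear_ok = np.allclose(rotate @ (u + v), rotate @ u + rotate @ v)

print(scaled, np.round(rotated, 6), translated, linear_ok)
```

Note that scaling and rotation are matrix products, while translation is an addition, which is why it is not a linear map on its own.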
Matrix Operations
• Matrix operations are the cornerstone of linear algebra and are
essential for manipulating data in machine learning. They allow us to
perform complex calculations efficiently, especially when dealing with
large datasets. In this section, we’ll dive into some key matrix
operations that are frequently used in machine learning, such as matrix
multiplication, transpose, inverse, and determinants.
A. Matrix Multiplication
• Matrix multiplication is one of the most fundamental operations in
linear algebra. It allows us to combine two matrices to produce a new
matrix. This operation is crucial in machine learning algorithms,
especially when dealing with transformations, data projections, and
neural networks.
Definition of Matrix Multiplication:
• Matrix multiplication is only possible when the number of columns in
the first matrix matches the number of rows in the second matrix. Given
two matrices A (of size m×n) and B (of size n×p), the product C = AB
has size m×p, and each entry is the dot product of a row of A with a
column of B:
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
• Matrix multiplication is often used in machine learning to perform
transformations on data, such as when applying weights to inputs in a
neural network or performing dimensionality reduction with PCA.
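A minimal sketch of the shape rule and the product itself, using two assumed example matrices and NumPy's @ operator:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2)

# Columns of A (3) match rows of B (3), so the product is defined
# and has shape (2, 2).
C = A @ B
print(C)          # [[ 58  64] [139 154]]

# Entry c_00 is the dot product of row 0 of A and column 0 of B.
c00 = np.dot(A[0], B[:, 0])   # 1*7 + 2*9 + 3*11 = 58
print(c00)
```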
B. Transpose and Inverse of Matrices
1. Transpose of a Matrix:
The transpose of a matrix is obtained by flipping the matrix over its diagonal, which
turns the rows of the matrix into columns and vice versa. Given a matrix A, its
transpose is denoted by $A^{T}$, with entries $(A^{T})_{ij} = A_{ji}$.
• The transpose operation is often used in machine learning algorithms
when working with data, especially in vectorized operations where
rows and columns need to be swapped.
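In NumPy the transpose is available as the .T attribute. The matrix below is an assumed example.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)

A_T = A.T                     # rows become columns
print(A_T)                    # [[1 4] [2 5] [3 6]]
print(A.shape, A_T.shape)     # (2, 3) (3, 2)
```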
2. Inverse of a Matrix
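The inverse of a matrix is analogous to the reciprocal of a number. For
a square matrix A, its inverse $A^{-1}$ is the matrix that, when
multiplied with A, results in the identity matrix I:
$$A A^{-1} = A^{-1} A = I$$
Where I is the identity matrix, a square matrix with
ones on the diagonal and zeros elsewhere:
$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$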
How to Find the Inverse:
• A matrix must be square (i.e., the same number of rows and columns)
to have an inverse.
• Not all square matrices have an inverse. A matrix that does not have an
inverse is called singular.
• Finding the inverse of a matrix can be done through various methods,
such as Gaussian elimination, but in machine learning, we often rely on
computational tools like Python’s NumPy library to handle matrix
inverses efficiently.
• In machine learning, matrix inverses are used in algorithms such as
linear regression, where the normal equation computes the optimal
weights as $(X^{T}X)^{-1}X^{T}y$, with X the design matrix and y the
target vector.
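The points above can be sketched with np.linalg.inv. The matrices and the tiny regression dataset (generated from y = 1 + 2x) are illustrative assumptions. In practice np.linalg.solve or np.linalg.lstsq is usually preferred over forming an explicit inverse, but the inverse is shown here because it is what the text describes.

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)
print(A_inv)                   # [[ 0.6 -0.7] [-0.2  0.4]]
print(np.round(A @ A_inv, 6))  # identity matrix

# A singular matrix (second row is twice the first) has no inverse.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

# Normal equation for linear regression: w = (X^T X)^-1 X^T y.
# First column of X is all ones (the intercept term).
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0])
w = np.linalg.inv(X.T @ X) @ X.T @ y
print(inverse_exists, np.round(w, 6))
```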
C. Determinants
• The determinant is a scalar value that can be computed from a square
matrix. It provides important information about the matrix, such as
whether the matrix has an inverse or not. If the determinant of a matrix
is zero, the matrix is singular and does not have an inverse.
• For a 2×2 matrix A:
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \det(A) = ad - bc$$
Importance of Determinants:

• The determinant helps determine if a matrix is invertible. If
det(A) ≠ 0, the matrix is invertible; otherwise, it is singular.
• Determinants are used in solving systems of linear equations, in
calculating volumes, and in understanding the properties of
transformations applied to vectors.
• Determinants, though not always computed directly in machine
learning, play a role in understanding the invertibility of matrices used
in algorithms like linear regression and matrix factorizations.
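A short sketch of the 2×2 determinant and its link to invertibility, using np.linalg.det on two assumed example matrices:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

det_numpy = np.linalg.det(A)
det_manual = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # ad - bc = 24 - 14
print(round(det_numpy, 6), det_manual)

# A matrix whose rows are multiples of each other has determinant 0
# (up to floating-point rounding), so it is singular.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
is_singular = np.isclose(np.linalg.det(singular), 0.0)
print(is_singular)
```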
Eigenvalues and Eigenvectors
• In linear algebra, eigenvalues and eigenvectors are important
concepts that play a crucial role in machine learning, particularly in
algorithms that involve dimensionality reduction, such as Principal
Component Analysis (PCA). Understanding eigenvalues and
eigenvectors allows us to analyze the structure of matrices and how
they transform data.
A. Definition and Significance
Eigenvalues
An eigenvalue is a scalar that indicates how much a corresponding
eigenvector is stretched or compressed during a linear transformation. It
represents the factor by which the eigenvector is scaled when a matrix is
applied to it.
Eigenvectors
An eigenvector is a non-zero vector that remains in the same direction
after a linear transformation is applied to it by a matrix, though its
magnitude may be scaled. Mathematically, if A is a square matrix, an
eigenvector v, and its corresponding eigenvalue λ satisfy the equation:
$$A\mathbf{v} = \lambda \mathbf{v}$$
Here:
• A is the matrix,
• v is the eigenvector,
• λ is the eigenvalue.
In other words, multiplying the matrix A by the eigenvector v results
in a scaled version of the eigenvector, where the scaling factor is the
eigenvalue λ.
Geometric Interpretation
• The geometric interpretation of eigenvectors and eigenvalues is that
eigenvectors indicate directions in the data that are invariant under the
transformation applied by the matrix. The corresponding eigenvalue
tells us how much the eigenvector is stretched or compressed in that
direction.
• For example, in a 2D plane, if a matrix transforms a vector but keeps
its direction intact while only scaling it by some factor, that vector is
an eigenvector, and the scaling factor is its eigenvalue.
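The defining equation can be verified numerically with np.linalg.eig. The matrix below is an assumed example; the loop checks that multiplying A by each eigenvector only rescales it by the matching eigenvalue.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Column i of `eigenvectors` pairs with eigenvalues[i].
eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.sort(eigenvalues))   # the eigenvalues are 1 and 3

# Check A v = lambda v for every eigenpair.
pairs_ok = all(
    np.allclose(A @ eigenvectors[:, i], eigenvalues[i] * eigenvectors[:, i])
    for i in range(len(eigenvalues))
)
print(pairs_ok)
```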
B. Applications in Machine Learning
• Eigenvalues and eigenvectors have several key applications in
machine learning, particularly in techniques that involve data
transformation or dimensionality reduction.
1. Dimensionality Reduction (Principal Component Analysis – PCA)
• Principal Component Analysis (PCA) is a widely used technique for
reducing the dimensionality of large datasets while retaining most of
the important information. PCA works by finding the eigenvectors and
eigenvalues of the covariance matrix of the data. The eigenvectors
represent the principal components, which are the directions in which
the data varies the most, and the eigenvalues indicate how much
variance there is along each principal component.
Steps involved in PCA:
1. Compute the covariance matrix of the dataset.
2. Perform eigen decomposition to obtain eigenvalues and eigenvectors
of the covariance matrix.
3. The eigenvectors corresponding to the largest eigenvalues are chosen
as the principal components.
4. Project the original data onto these principal components, reducing the
number of features while retaining most of the variance.
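The four steps above can be sketched directly in NumPy. The synthetic dataset, the random seed, and the choice of k = 2 components are assumptions made for illustration; the data is mean-centered before projection, and np.linalg.eigh is used because a covariance matrix is symmetric. Library implementations such as scikit-learn's PCA may differ in details.

```python
import numpy as np

# Synthetic data: 100 samples, 3 features, with feature 1 made to
# depend strongly on feature 0 so that one direction dominates.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 1] = 2 * X[:, 0] + 0.1 * X[:, 1]
X_centered = X - X.mean(axis=0)

# Step 1: covariance matrix of the features (columns).
cov = np.cov(X_centered, rowvar=False)

# Step 2: eigen decomposition of the symmetric covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 3: keep the eigenvectors with the k largest eigenvalues.
k = 2
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order[:k]]

# Step 4: project the data onto the principal components.
X_reduced = X_centered @ components

explained = eigenvalues[order[:k]].sum() / eigenvalues.sum()
print(X_reduced.shape)
print(f"share of variance kept: {explained:.3f}")
```

The variance of the data along each retained component equals the corresponding eigenvalue, which is why the eigenvalues can be used to decide how many components to keep.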
