Python Notes
Libraries Usage
Django, Flask, Pyramid, CherryPy Web development (Server-side)
# Print statement
print("Hello, World!")
# Conditional statement
x = 10  # sample value so the examples below can run
if x > 5:
    print("x is greater than 5")
# Loop statement
for i in range(5):
    print(i)
# Boolean expression
y = x > 5  # 'x > 5' is an expression
# Combined expression
result = (x * 2) + (y * 3)  # Combination of multiple expressions
Key Differences Between Statements and
Expressions
Variables
• A variable is a name that refers to a value stored in memory. The name of a variable is also called an identifier.
• Python infers the type of a variable from the value assigned to it, so we do not need to declare the type explicitly.
• Variable names must begin with a letter or an underscore; the remaining characters can be letters, digits, or underscores.
• Variable names are case-sensitive: India and india are distinct variables. By convention, variable names are written in lowercase.
Operators
• Operators are special symbols that perform operations on variables
and values. For example,
• print(5 + 6) # 11
Types of Python Operators
1. Arithmetic Operators
2. Assignment Operators
3. Comparison Operators
4. Logical Operators
5. Bitwise Operators
6. Identity and Membership Operators
Arithmetic Operators
• Arithmetic operators are used to perform mathematical operations
like addition, subtraction, multiplication, etc.
Operator Operation Example
+ Addition 5+2=7
- Subtraction 4-2=2
* Multiplication 2*3=6
/ Division 4/2=2.0
// Floor Division 10 // 3 = 3
% Modulo 5%2=1
** Power 4 ** 2 = 16
Arithmetic Operators in Python
a = 7
b = 2
# addition
print('Sum:', a + b)
# subtraction
print('Subtraction:', a - b)
# multiplication
print('Multiplication:', a * b)
# division
print('Division:', a / b)
# floor division
print('Floor Division:', a // b)
# modulo
print('Modulo:', a % b)
# a to the power b
print('Power:', a ** b)
Output
Sum: 9
Subtraction: 5
Multiplication: 14
Division: 3.5
Floor Division: 3
Modulo: 1
Power: 49
Assignment Operators
• Assignment operators are used to assign values to variables. For
example,
• # assign 5 to x
• x=5
• Here, = is an assignment operator that assigns 5 to x.
+= Addition Assignment a += 1 # a = a + 1
-= Subtraction Assignment a -= 3 # a = a - 3
*= Multiplication Assignment a *= 4 # a = a * 4
/= Division Assignment a /= 3 # a = a / 3
%= Remainder Assignment a %= 10 # a = a % 10
• # assign 10 to a
• a = 10
• # assign 5 to b
• b = 5
• # add b to a and store the result in a
• a += b # a = a + b
• print(a)
• # Output: 15
Comparison Operators
• Comparison operators compare two values/variables and return a
boolean result: True or False.
Operator Meaning Example
== Is Equal To 3 == 5 gives us False
>= Greater Than or Equal To 3 >= 5 gives us False
a=5
b=2
or Logical OR: True if at least one of the operands is True a or b
not Logical NOT: True if the operand is False and vice-versa. not a
Logical Operators Program
• # logical AND
• print(True and True) # True
• print(True and False) # False
• # logical OR
• print(True or False) # True
• # logical NOT
• print(not True) # False
Bitwise operators
• Bitwise operators act on operands as if they were strings of binary
digits. They operate bit by bit, hence the name.
• For example, 2 is 10 in binary, and 7 is 111.
• In the table below, let x = 10 (0000 1010 in binary) and y = 4 (0000 0100 in binary).
Operator Meaning Example
& Bitwise AND x & y = 0 (0000 0000)
| Bitwise OR x | y = 14 (0000 1110)
~ Bitwise NOT ~x = -11 (1111 0101)
^ Bitwise XOR x ^ y = 14 (0000 1110)
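The table entries can be checked directly. The following is a minimal sketch; the dictionary name results and the 8-bit mask 0xFF are choices made here for display only, since Python integers are not limited to 8 bits.

```python
x = 10  # 0000 1010
y = 4   # 0000 0100

results = {
    "x & y": x & y,
    "x | y": x | y,
    "x ^ y": x ^ y,
    "~x": ~x,
}

for expression, value in results.items():
    # value & 0xFF keeps only the low 8 bits, which is how -11 shows up as 1111 0101
    print(f"{expression} = {value} ({value & 0xFF:08b})")
```

Masking with 0xFF is only a display trick: ~x is really the negative integer -11, and the mask shows its two's-complement bit pattern in one byte.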
• It's important to note that having two variables with equal values doesn't
necessarily mean they are identical.
Operator Meaning Example
in True if value/variable is found in the sequence 5 in x
not in True if value/variable is not found in the sequence 5 not in x
Membership operators in Python
• message = 'Hello world'
• dict1 = {1:'a', 2:'b'}
() Parentheses
** Exponent
+, - Addition, Subtraction
^ Bitwise XOR
| Bitwise OR
==, !=, >, >=, <, <=, is, is not, in, not in Comparisons, Identity, Membership operators
or Logical OR
• # Precedence of or & and
• meal = "fruit"
• money = 0
money = 0
Output
3
0
Note: Exponent operator ** has right-to-left associativity in Python.
# Expression is invalid
# (Non-associative operators)
# SyntaxError: invalid syntax
x = y = z+= 2
• Float – This value is represented by the float class. It is a real number with a
floating-point representation. It is specified by a decimal point. Optionally, the
character e or E followed by a positive or negative integer may be appended to
specify scientific notation.
• c = 2 + 4j
• print("\nType of c: ", type(c))
Sequence Data Types in Python
• The sequence Data Type in Python is the ordered collection of
similar or different Python data types.
• Sequences allow storing of multiple values in an organized and
efficient fashion.
• There are several sequence data types of Python:
• Python String
• Python List
• Python Tuple
String Data Type
• Strings in Python are immutable sequences of Unicode characters. A string is a
sequence of characters enclosed in single quotes, double quotes, or triple quotes.
• Python has no separate character data type; a character is simply a
string of length one. Strings are represented by the str class.
• Creating String
• Strings in Python can be created using single quotes, double quotes, or
even triple quotes.
• Example: This Python code showcases various string creation
methods. It uses single quotes, double quotes, and triple quotes to
create strings with different content and includes a multiline string.
The code also demonstrates printing the strings and checking their
data types.
s1 = 'Welcome to the World of AI'
print("String with Single Quotes: ", s1)
s2 = "Techno Space"
print("String with Double Quotes: ", s2)
print(type(s2))
s4 = '''MCA
MBA
MTech'''
print("Multiline String: ", s4)
Accessing elements of String
• In Python programming , individual characters of a String can
be accessed by using the method of Indexing. Negative
Indexing allows negative address references to access
characters from the back of the String, e.g. -1 refers to the last
character, -2 refers to the second last character, and so on.
• Example: This Python code demonstrates how to work with a
string named s. It initializes the string with "Unity in Diversity"
and shows how to access the first character ("U") using an
index of 0 and the last character ("y") using a negative index of -1.
s = "Unity in Diversity"
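A minimal sketch of positive and negative indexing on that string follows. The variable names first, last, second_last and error_name are illustrative, not from the original slides.

```python
s = "Unity in Diversity"

first = s[0]         # index 0 is the first character
last = s[-1]         # index -1 is the last character
second_last = s[-2]  # index -2 is the second-to-last character
print(first, last, second_last)

# An index outside the string raises IndexError
try:
    s[100]
except IndexError as exc:
    error_name = type(exc).__name__
    print(error_name)
```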
t2 = ('Keep', 'Reinventing')
print("\nTuple with the use of String: ", t2)
Access Tuple Items
• In order to access the tuple items refer to the index number.
Use the index operator [ ] to access an item in a tuple.
• The index must be an integer. Nested tuples are accessed
using nested indexing.
• The code creates a tuple named t1 with five elements: 1,
2, 3, 4, and 5. It then prints the first and last
elements of the tuple using indexing.
t1 = tuple([1, 2, 3, 4, 5])
print("First element of tuple")
print(t1[0])
print("\nLast element of tuple")
print(t1[-1])
if 5 > 2:
    print("Five is greater than two!")
# The number of spaces is up to you, but it must be consistent within a block
if 5 > 2:
        print("Five is greater than two!")
Comments
• Comments can be used to explain Python code.
• Comments can be used to make the code more
readable.
• Comments can be used to prevent execution when
testing code.
Creating a Comment
• Comments start with a #, and Python will ignore them:
• #This is a comment
print("Hello, World!")
• Comments can be placed at the end of a line, and
Python will ignore the rest of the line:
• print("Hello, World!") #This is a comment
Multiline Comments
• Python does not really have a syntax for multiline comments.
• To add a multiline comment you could insert a # for each line:
• #This is a comment
#written in
#more than just one line
print("Hello, World!")
• Alternatively, since Python ignores string literals that are not assigned to a
variable, a triple-quoted multiline string can be placed in the code to serve as a comment:
• """
This is a comment
written in
more than just one line
"""
print("Hello, World!")
Program Execution
• Python program execution is the process of running Python code,
transforming it from high-level instructions into machine-executable
operations.
• Here’s an overview of how this works:
1. Source Code: You write your program in a .py file using Python syntax.
• print("Hello, World!")
2. Compilation
• The Python interpreter first compiles the source code into bytecode (an intermediate
representation).
• For imported modules, the bytecode is cached in .pyc files stored in a __pycache__ directory.
• This step is transparent to the user and happens automatically when the program is
executed.
• Bytecode is platform-independent, although it is specific to the Python version.
3. Execution in the Python Virtual Machine (PVM)
• The bytecode is then executed by the Python Virtual Machine (PVM).
• The PVM is part of the Python interpreter. In the standard CPython implementation it
interprets the bytecode rather than translating the whole program into machine code.
• This step involves interpreting bytecode instructions one by one and executing them.
4. Runtime
• During execution, the interpreter manages memory, handles exceptions, and
interacts with operating system resources (like files, databases, or networks).
• Key components at runtime include:
• Memory Management (for variables, data structures, etc.)
• Dynamic Typing (Python determines types at runtime)
• Garbage Collection (automatic cleanup of unused objects)
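The compilation step can be observed from inside Python. The sketch below uses the standard-library dis module on a small function; the function add and the variable names are illustrative, and the exact instruction names vary between Python versions, so no output is shown.

```python
import dis


def add(a, b):
    return a + b


# Every function carries a code object holding its compiled bytecode
bytecode = add.__code__.co_code
print(type(bytecode), len(bytecode))

# dis.get_instructions decodes the raw bytes into named instructions
instruction_names = [ins.opname for ins in dis.get_instructions(add)]
print(instruction_names)
```

These decoded instructions are what the virtual machine executes one at a time.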
Execution Modes
• Direct Execution (Interpreter Mode):
• Run the file directly using the python command
• python [Link]
• Interactive Mode:
• Start the interpreter and run code interactively:
• python
• >>> print("Hello")
• Integrated Development Environment (IDE):
• Use an IDE like PyCharm, VS Code, or Jupyter Notebook to write and execute code.
• Compiled Executable:
• Tools like pyinstaller can package a Python program into an executable file for
distribution.
Reading Input
• Developers often have a need to interact with users, either to
get data or to provide some sort of result.
• Most programs today use a dialog box as a way of asking the
user to provide some type of input.
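• Python 3 provides the built-in function input(prompt) to read
input from the keyboard.
• Python 2 had two such functions, input(prompt) and raw_input(prompt).
In Python 3, raw_input() was removed and input() behaves like the old raw_input().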
• input():
• This function takes the input from the user and returns it as a
string. The type of the returned object is always <class 'str'>.
• It does not evaluate the input as an expression; it returns exactly
what the user typed as a string.
• When the input function is called, it stops the program and waits for
the user's input. When the user presses Enter, the program resumes
and input() returns what the user typed.
inp = input('STATEMENT')
Example:
>>> name = input('What is your name?\n') # \n is a newline; it causes a line break
What is your name?
Ram
>>> print(name)
Ram
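Because input() always returns a string, numeric input has to be converted explicitly. The sketch below shows one way to do that; the helper name parse_age is hypothetical, and the literal string stands in for what a user might type so the example runs without a keyboard.

```python
def parse_age(text):
    """Convert raw keyboard text to an int, or return None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


# In a real program the text would come from the keyboard:
#     raw = input("Enter your age: ")
raw = " 25 "  # stand-in for user input
age = parse_age(raw)
print(age, type(age))
```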
• sep='separator' : (Optional) Specify how to separate the objects, if there is more than one. Default: ' '
• flush : (Optional) A Boolean, specifying if the output is flushed (True) or buffered (False). Default: False
How print() works in Python?
• You can pass variables, strings, numbers, or other data types
as one or more parameters when using the print() function.
• Then, these parameters are represented as strings by their
respective str() functions.
• To create a single output string, the transformed strings are
concatenated with spaces between them.
name = "Arjun"
age = 25
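A short sketch of how sep and end shape the output is given below. It writes into an in-memory buffer through the file parameter of print() so the exact text can be inspected; the buffer and the chosen separator are illustrative.

```python
import io

name = "Arjun"
age = 25

# Default behaviour: arguments are converted with str() and joined by a space
print("Name:", name, "Age:", age)

# Custom separator and line ending, captured in a buffer instead of the screen
buffer = io.StringIO()
print(name, age, sep=" | ", end="!\n", file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```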
• print(x is y) # False, because `x` and `y` are two different objects with
the same content
• print(x is z) # True, because `z` is a reference to `x`
Key Points:
• is vs ==:
• is compares the identity of objects (memory location).
• == compares the value of objects.
• a = [1, 2, 3]
• b = [1, 2, 3]
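The difference between the two operators can be sketched as follows. The names c, same_value, same_object and alias_is_same are illustrative additions.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # c is another name for the same list object as a

same_value = (a == b)      # compares contents
same_object = (a is b)     # compares identity
alias_is_same = (c is a)   # both names refer to one object

# A change made through one alias is visible through the other
c.append(4)
print(same_value, same_object, alias_is_same, a)
```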
print("Program ended")
if else Statement
• An if statement can be extended with an else block, which is
executed when the if condition is false.
• Python if-else Statement Syntax
if (condition):
    # Executes this block if condition is true
else:
    # Executes this block if condition is false
Flow Chart of if-else Statement
if else statement Example
# if..else statement example
x = 3
if x == 4:
    print("Yes")
else:
    print("No")
Nested if Statement
• An if statement can also be placed inside another if statement.
• This conditional statement is called a nested if statement.
• The inner if condition is checked only if the outer if condition
is true, which lets us require that multiple conditions be satisfied.
Syntax
if (condition1):
    # Executes when condition1 is true
    if (condition2):
        # Executes when condition2 is true
    # inner if block ends here
# outer if block ends here
Flow chart of Nested If Statement
Nested if Example
# Nested if statement example
num = 10
letter = "B"  # assumed value; the original snippet does not define letter
if num > 5:
    print("Bigger than 5")
    if letter == "B":
        print("letter is B")
    else:
        print("letter isn't B")
Can We Use Elif in Nested If?
Are You Allowed to Nest If Statements
Inside Other If Statements in Python?
• What is the Difference Between if-else and Nested If Statements in
Python?
What is the Maximum Number of Elif Clauses
You Can Have in a Conditional?
Looping Statements
• Loops are used to repeat a block of code multiple times.
• Statements used to control loops and change the course of iteration
are called control statements.
• A loop does not create a new scope in Python: variables assigned inside the
loop, including the loop variable, remain available after the loop finishes.
While Loop in Python
• A while loop is used to execute a block of statements
repeatedly as long as a given condition is true.
• When the condition becomes false, the line immediately after
the loop in the program is executed.
• Python While Loop Syntax:
while expression:
    statement(s)
While Loop Example
count = 0
while (count < 3):
    count = count + 1
    print("Luck Favors Brave")
Using else statement with While Loop
in Python
• The else clause is only executed when your while condition
becomes false.
• If you break out of the loop, or if an exception is raised, it won’t
be executed.
• Syntax of While Loop with else statement:
while condition:
    # execute these statements
else:
    # execute these statements
count = 0
while (count < 3):
    count = count + 1
    print("Hello Geek")
else:
    print("In Else Block")
Infinite While Loop
• If we want a block of code to execute an infinite number of times,
we can use a while loop whose condition never becomes false.
• The code uses a while loop with the condition (count == 0). The loop
runs as long as count is equal to 0.
• Since count is initially set to 0 and is never changed inside the loop,
the condition is always true and the loop executes indefinitely.
count = 0
while (count == 0):
    print("Game on")
for loop
• Python's for loop is designed to repeatedly execute a code
block while iterating through a list, tuple, dictionary, or other
iterable objects of Python.
• The process of traversing a sequence is known as iteration.
• Syntax of the for Loop
for value in sequence:
    code block
# Python program to show how the for loop works
string = "Python Loop"  # sample value (assumed; the original definition is missing)
# Initiating a loop
for s in string:
    # giving a condition in if block
    if s == "o":
        print("If block")
    # if condition is not satisfied then else block will be executed
    else:
        print(s)
Break Statement
• The break statement in Python is used to terminate the loop
in which it is present.
• After that, control passes to the statement that follows
the loop, if there is one.
• If the break statement is inside a nested loop, it
terminates only the innermost loop that contains it.
Syntax of Break Statement
for / while loop:
    # statement(s)
    if condition:
        break
    # statement(s)
# loop end
Working of Break Statement
Example Program
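# Python program to demonstrate
# break statement
s = 'karnataka'
# Using for loop
for letter in s:
    print(letter)
    # break the loop as soon as it sees 'a'
    # or 'k'
    if letter == 'a' or letter == 'k':
        break
print("Out of for loop")
print()
i = 0
# Using while loop
while True:
    print(s[i])
    # same exit condition as the for loop above (assumed; the original is cut off here)
    if s[i] == 'a' or s[i] == 'k':
        break
    i += 1
print("Out of while loop")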
# loop from 1 to 10
for i in range(1, 11):
    # If i is equal to 6,
    # continue to next iteration
    # without printing
    if i == 6:
        continue
    else:
        # otherwise print the value
        # of i
        print(i, end=" ")
Sequences – Strings
• Strings are sequences of characters enclosed in quotes (single ' ',
double " ", or triple ''' ''').
string1 = 'Hello'
string2 = "World"
string3 = '''Python Programming'''
String Operations:
Operation Description Example
Concatenation Combine two strings a + b → "Hello" + "World" → "HelloWorld"
Repetition Repeat a string a * 3 → "Hi" * 3 → "HiHiHi"
Built-In Functions
Commonly Used Built-In Functions for Strings:
• x, y = get_coordinates()
• print(x, y) # Output: 10 20
• If no return statement is present, the function implicitly returns None.
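A minimal sketch of both behaviours follows. The body of get_coordinates is an assumption made here to match the output 10 20 shown above; log_only is a hypothetical function with no return statement.

```python
def get_coordinates():
    # Returning two values separated by a comma packs them into a tuple
    return 10, 20


x, y = get_coordinates()  # tuple unpacking
print(x, y)


def log_only():
    # No return statement, so the call evaluates to None
    print("nothing is returned here")


result = log_only()
print(result)
```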
Void Functions
• A void function is a function that does not return a value.
• It is primarily used for performing an action (like printing output or modifying
a global variable).
• Syntax
def function_name(parameters):
    # Logic without a return statement
• Example
def print_message():
    print("Hello, this is a void function!")
print_message()
# Output: Hello, this is a void function!
• The function print_message performs an action (printing a message) but does
not return anything.
Important Points
• Calling a void function does not produce a return value.
• By default, void functions return None.
• Example Showing Implicit None:
def do_nothing():
    pass
print(do_nothing()) # Output: None
Comparison of return Statement and Void
Functions
def f():
    # local variable
    s = "Welcome to the World of AI"
    print(s)
# Driver code
f()
Output
Welcome to the World of AI
If we will try to use this local variable outside
the function then let’s see what will happen.
def f():
    # local variable
    s = "I love Geeksforgeeks"
    print("Inside Function:", s)
# Driver code
f()
print(s)
Output
NameError: name 's' is not defined
Python Global variables
• Global variables are the ones that are defined and declared
outside any function and are not specified to any function. They
can be used by any part of the program.
# This function uses global variable s
def f():
    print(s)
# Global scope
s = "ML is the subset of AI"
f()
Output
ML is the subset of AI
Global and Local Variables with the
Same Name
# This function has a variable with
# name same as s.
def f():
    s = "Blockchain."
    print(s)
# Global scope
s = "Neural Network"
f()
print(s)
Output
Blockchain.
Neural Network
Lifetime of Variables
• The lifetime of a variable refers to the duration the variable exists in
memory during program execution.
• Lifetime of Local Variables
• Local variables are created when a function is called.
• They are destroyed when the function completes execution.
def my_function():
    local_var = 10 # Created
    print(local_var)
my_function()
# local_var is destroyed after the function ends.
Lifetime of Global Variables
• Global variables persist throughout the program's execution and are
destroyed when the program ends.
• global_var = "I exist until the program ends"
• Garbage Collection
• In Python, unused variables are automatically cleaned up by the
garbage collector to free memory.
outer()
Output
Value of a using nonlocal is : 10
Value of a without using nonlocal is : 5
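The code that produced this output is not part of the notes. The following is a sketch of the behaviour it describes, with hypothetical function names: nonlocal lets an inner function rebind a variable of the enclosing function, while a plain assignment creates a new local variable instead.

```python
def outer_with_nonlocal():
    a = 5

    def inner():
        nonlocal a  # rebinds the variable a of the enclosing function
        a = 10

    inner()
    return a


def outer_without_nonlocal():
    a = 5

    def inner():
        a = 10  # creates a new local a; the outer a is untouched
        return a

    inner()
    return a


print("Value of a using nonlocal is :", outer_with_nonlocal())
print("Value of a without using nonlocal is :", outer_without_nonlocal())
```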
Default Parameters
• In Python, you can specify default values for function parameters.
• If a value is not provided during the function call, the default value is used.
• Syntax
• def function_name(param1=default_value):
• # Function body
• Example
def greet(name="Guest"):
    print(f"Hello, {name}!")
introduce(age=25, name="Alice")
# Output: My name is Alice and I am 25 years old.
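The definition of introduce is not shown in the notes. A minimal sketch consistent with the call and output above is given here; the function bodies return strings instead of printing so the results can be inspected, which is a choice made for this example.

```python
def introduce(name, age):
    # Keyword arguments can be passed in any order because they are matched by name
    return f"My name is {name} and I am {age} years old."


def greet(name="Guest"):
    # The default is used only when the caller omits the argument
    return f"Hello, {name}!"


message = introduce(age=25, name="Alice")
print(message)
print(greet())
print(greet("Ram"))
```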
*args (Arbitrary Positional Arguments)
• *args allows you to pass a variable number of positional arguments to
a function.
• It collects arguments into a tuple.
def add_numbers(*args):
    return sum(args)
def display_info(**kwargs):
    for key, value in [Link]():
        print(f"{key}: {value}")
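A sketch of how these two functions might be called is shown below. It assumes display_info iterates with kwargs.items(), the standard dictionary method, and returns the formatted lines so they can be checked; the sample arguments are illustrative.

```python
def add_numbers(*args):
    # args is a tuple of all positional arguments
    return sum(args)


def display_info(**kwargs):
    # kwargs is a dict of all keyword arguments
    return [f"{key}: {value}" for key, value in kwargs.items()]


total = add_numbers(1, 2, 3, 4)
info = display_info(name="Alice", age=25)
print(total)
print(info)

# An existing list or dict can be unpacked into a call with * and **
numbers = [10, 20]
details = {"city": "Mysuru"}
unpacked_total = add_numbers(*numbers)
unpacked_info = display_info(**details)
```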
Output
Arguments passed: ['[Link]', 'arg1', 'arg2']
2. Repetition
Repeat a string multiple times using the * Operator.
Example:
str1 = "Python "
print(str1 * 3) # Output: Python Python Python
Accessing Characters by Index Number
• Strings are indexed, starting from 0 for the first character and -1 for
the last character.
• Syntax
• string[index]
• Example
• str1 = "Hello"
• print(str1[0]) # Output: H
• print(str1[-1]) # Output: o
String Slicing and Joining
• String Slicing
• Extract a substring using slicing syntax:
Syntax:
• string[start:stop:step]
• Example
• str1 = "Hello, World!"
• print(str1[0:5]) # Output: Hello
• print(str1[::-1]) # Output: !dlroW ,olleH (reverse)
• Joining Strings
• Join multiple strings into one using join().
• Syntax
• [Link](iterable)
• Example
• words = ["Python", "is", "awesome"]
• sentence = " ".join(words)
• print(sentence) # Output: Python is awesome
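A few more slicing and joining cases are sketched below, including a step value, negative bounds, and joining non-string items. The variable names and sample values are illustrative.

```python
text = "Hello, World!"

tail = text[7:]            # from index 7 to the end
word = text[-6:-1]         # negative bounds count from the end
every_second = text[::2]   # step of 2 takes every other character
print(tail, word, every_second)

# join() needs strings, so other types are converted first
date = "-".join(["2024", "01", "15"])
csv_row = ",".join(str(n) for n in [1, 2, 3])
print(date, csv_row)
```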
String methods:
✓[Link]()
✓[Link]()
✓[Link]()
✓rstrip()
✓lstrip()
✓[Link]()
✓[Link]() : gives position of first occurrence of the string
passed
✓[Link]()
String Concatenation
• Concatenation means joining two or more strings together.
• To concatenate strings, we use + operator.
• When we work with numbers, + will be an operator for addition, but
when used with strings it is a joining operator.
Example:
s1 = "Hello"
s2 = "Good Morning!"
s3 = s1 + " " + s2
print(s3)
Output:
Hello Good Morning!
String repetition
• The * symbol normally represents multiplication, but when the
operand on the left side of the * is a sequence such as a string or a
list, it becomes the repetition operator.
• The repetition operator makes multiple copies of the sequence and joins
them all together. For example, lists can be created using the repetition
operator, *.
Example:
• numbers = [1] * 5
• print(numbers)
Output: [1, 1, 1, 1, 1]
• n = [0, 1, 2] * 3
• print(n)
Output: [0, 1, 2, 0, 1, 2, 0, 1, 2]
Format Function
>>>print("today is {}".format('Monday'))
today is Monday
Multiple Formatter:
• Let’s say if there is another variable substitution required in a
sentence, this can be done by adding another set of curly brackets where we
want substitution and passing a second value into format().
• Python will then replace the placeholders by values that are passed as the
parameters.
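The substitution described above can be sketched as follows. The template strings and values are made up for illustration; positional, indexed, and named placeholders are all part of str.format().

```python
# Two placeholders are filled in order by the two arguments
sentence = "today is {} and the weather is {}".format("Monday", "sunny")

# Index numbers choose which argument goes where
indexed = "{1} comes after {0}".format("Monday", "Tuesday")

# Named placeholders are filled from keyword arguments
named = "{day} is a holiday".format(day="Sunday")

print(sentence)
print(indexed)
print(named)
```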
[Link]('apple')
Output: ['banana', 'orange', 'kiwi', 'apple']
[Link]()
Output: 'apple'
[Link](2)
Output: 'strawberry'
print(fruits)
Output: ['banana', 'orange', 'kiwi']
• pop will remove the last element by default, if we want to remove
particular element specify the index inside the pop function
fruits = ['banana', 'orange', 'kiwi']
[Link](['apple', 'strawberry'])
Output: ['banana', 'orange', 'kiwi', 'apple', 'strawberry']
[Link]('banana')
Output: 1
• If item is not present count() will return 0.
• Functions which can be used in lists:
min(list)
max(list)
len(list)
sum(list)
Example:
L = [23,12,34,5,58,16,18]
len(L)
Output: 7
[Link]()
Output: [5, 12, 16, 18, 23, 34, 58]
[Link](reverse=True)
Output: [58, 34, 23, 18, 16, 12, 5]
max(L)
Output: 58
min(L)
Output: 5
# nested list
l1=[1,2,3]
l2=[2,3,4]
[Link](l2)
print(l1)
• Output: [1, 2, 3, [2, 3, 4]]
l1[3]
• Output: [2, 3, 4]
l1[3][2]
• Output: 4
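Several method names in the list examples above were lost in extraction. The sketch below is a self-contained run through the standard list methods that match the described behaviour (append, pop, extend, count, sort); it is an illustration under that assumption, not a reconstruction of the original slides.

```python
fruits = ['banana', 'orange', 'kiwi']

fruits.append('apple')                   # add one item at the end
last = fruits.pop()                      # remove and return the last item
fruits.extend(['apple', 'strawberry'])   # add every item of another list
removed = fruits.pop(2)                  # remove and return the item at index 2
banana_count = fruits.count('banana')    # how many times a value occurs
print(fruits, last, removed, banana_count)

numbers = [23, 12, 34, 5, 58, 16, 18]
numbers.sort()                           # sorts in place, ascending
ascending = list(numbers)
numbers.sort(reverse=True)               # sorts in place, descending
print(ascending, numbers)

nested = [1, 2, 3]
nested.append([2, 3, 4])                 # the whole list becomes one element
inner_value = nested[3][2]
```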
Tuple:
➢A tuple is another sequence data type that is similar to the list.
➢It is a collection of immutable objects.
➢Tuple is ordered and cannot be changed.
➢Duplicate values can be present.
➢The main differences between lists and tuples are:
Lists are enclosed in brackets ( [ ] ), and their elements and size can be
changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be updated.
➢Tuples can be thought of as read-only lists.
➢Tuples can be defined without parentheses, e.g. t = 1, 2, 3.
➢Assign multiple values at a time.
t1 = ('abcd', 786 , 2.23, 'john', 70.2 )
t2 = (123, 'john')
print(t1) # Prints the complete tuple
print(t1[0]) # Prints the first element of the tuple
print(t1[1:3]) # Prints elements from the 2nd up to the 3rd
print(t1[2:]) # Prints elements starting from the 3rd element
print(t2 * 2) # Prints the tuple two times
print(t1 + t2) # Prints the concatenated tuples
Output:
('abcd', 786, 2.23, 'john', 70.2)
abcd
(786, 2.23)
(2.23, 'john', 70.2)
(123, 'john', 123, 'john')
('abcd', 786, 2.23, 'john', 70.2, 123, 'john')
t1[0] = 7634 # Invalid: tuples are immutable, so this raises a TypeError
Example:
t = (1,2,3,2,1,2,4,5)
[Link](2)
Output: 3
t = (1,2,'hi')
[Link](1)
Output: 0
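The method names in the two examples above were lost in extraction. Assuming they are the standard tuple methods count() and index(), which match the outputs shown, a minimal sketch of them together with tuple immutability and unpacking looks like this:

```python
t = (1, 2, 3, 2, 1, 2, 4, 5)

twos = t.count(2)       # number of occurrences of 2
first_one = t.index(1)  # position of the first 1
print(twos, first_one)

# Tuples are immutable: item assignment raises TypeError
try:
    t[0] = 99
except TypeError as exc:
    error_name = type(exc).__name__
    print(error_name)

# A tuple can be written without parentheses and unpacked into variables
point = 3, 4
x, y = point
```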
Dictionary:
➢Dictionaries are enclosed in curly braces ( { } ), and values can be assigned and
accessed using square brackets ( [] ).
➢Keys act as indexes and must be unique, but the values associated with the keys can be
duplicated.
Example:
>>> camera = {'sony':200, 'nikon': 200}
>>> [Link]({'canon':500})
>>> print(camera)
Output: {'sony': 200, 'nikon': 200, 'canon': 500}
>>> camera['xyz'] = 1000
>>> print(camera)
Output: {'sony': 200, 'nikon': 200, 'canon': 750, 'xyz': 1000}
>>> [Link]()
Output: dict_keys(['sony', 'nikon', 'canon', 'xyz'])
>>> [Link]()
Output: dict_values([200, 200, 750, 1000])
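A self-contained sketch of the same dictionary operations follows, using the standard dict methods update(), keys(), values() and get(). The step that changes 'canon' from 500 to 750 is not shown in the notes, so the assignment below is an assumption; the 'leica' lookup is an added illustration.

```python
camera = {'sony': 200, 'nikon': 200}

camera.update({'canon': 500})  # add a new key-value pair
camera['canon'] = 750          # assumed step: assigning to an existing key overwrites its value
camera['xyz'] = 1000           # assigning to a new key adds it

keys = list(camera.keys())
values = list(camera.values())
print(keys, values)

# get() returns a default instead of raising KeyError for a missing key
missing_price = camera.get('leica', 0)
```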
File Handling
• File handling is the mechanism by which a Python program reads data from
files on disk or writes data to them.
• File handling allows us to store data produced by a Python program
permanently in a file on disk and read it back later.
• Data files can be stored in 2 ways:
1. Text file: Stores information as characters. The text "hello" takes
5 bytes, and the floating-point value 12.45 written as text also takes 5 bytes.
Each line is terminated by a special end-of-line (EOL) character: '\n', '\r', or a
combination of both.
2. Binary file: Data is stored according to its data type and hence no
translation occurs. It stores the information in the same format as in
memory.
'b' appended to the mode opens the file in binary mode.
File input and output operations:
• To perform operation on file we have to perform the following
steps:
✓Open file
✓Use file to Read or Write
✓Close file
Syntax:
Filevariable = open(filename, mode)
# Do read and write operations
[Link]()
File - Modes
✓'r': Default option. Used when the file will only be read.
✓'w': Used for writing only. An existing file with the same name will be erased.
✓'a': Opens the file for appending. Data written to the file is added to the end.
✓'a+': Same as 'a', but the file can also be read. Writes are still added to the end.
✓'r+': Opens the file for both reading and writing.
✓'w+': Opens the file for reading and writing. Creates the file if it does not exist; otherwise truncates it.
✓'x': Creates a new file and opens it for writing. Fails if the file already exists.
• # open a file
• file1 = open("[Link]", "r")
• # read the file
• read_content = [Link]()
• print(read_content)
• # open file in current directory
• file1 = open("[Link]")
• file1 = open("[Link]") # equivalent to 'r' or 'rt'
• file1 = open("[Link]",'w') # write in text mode
• file1 = open("[Link]",'r+b') # read and write in binary mode
File- Read Methods
• read()
Reads and returns the entire contents of the file at once.
• readline()
The readline method reads one line from the file and returns it as a
string.
• readlines()
The readlines method returns the contents of the entire file as a list
of strings, where each item in the list represents one line of the
file.
• seek(offset)
Moves the file cursor to the given position.
• tell()
Returns the current position of the file cursor.
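The open, read or write, close cycle and the read methods can be sketched as below. The file name notes.txt and the temporary directory are assumptions made so the example is safe to run anywhere; the with statement closes the file automatically, which replaces the explicit close call.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is overwritten
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as f:   # 'w' creates or truncates the file
    f.write("first line\nsecond line\n")

with open(path, "a") as f:   # 'a' adds to the end
    f.write("third line\n")

with open(path, "r") as f:
    first = f.readline()     # one line, including its newline
    position = f.tell()      # cursor position after that line
    rest = f.readlines()     # remaining lines as a list
    f.seek(0)                # move the cursor back to the start
    everything = f.read()    # whole file as one string

print(first, rest, position)
```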
Class Definition
• A class in Python is a blueprint for creating objects. It encapsulates data
(attributes) and methods (functions) that operate on the data.
• Syntax
class ClassName:
    # Class attributes and methods
• Example
class Animal:
    def __init__(self, name):
        [Link] = name
    def speak(self):
        print(f"{[Link]} makes a sound.")
dog = Animal("Dog")
[Link]() # Output: Dog makes a sound.
Constructors
• A constructor is a special method defined using __init__.
• It is automatically called when an object is created and is typically
used to initialize attributes.
• Syntax
class ClassName:
    def __init__(self, parameters):
        # Initialization
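A complete, runnable sketch of a class with a constructor follows. The attribute name self.name is an assumption (attribute names in the listings above were lost in extraction), and speak returns its text rather than printing it so the result can be checked.

```python
class Animal:
    def __init__(self, name):
        # __init__ runs automatically when Animal(...) is called
        self.name = name  # assumed attribute name

    def speak(self):
        return f"{self.name} makes a sound."


dog = Animal("Dog")  # creates the object and calls __init__
cat = Animal("Cat")  # each object keeps its own attribute values
print(dog.speak())
print(cat.speak())
```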
__init__() Function
• __init__() function is a constructor method in Python. It
initializes the object’s state when the object is created.
• If the child class does not define its own __init__() method, it
will automatically inherit the one from the parent class.
# Parent Class: Person
class Person:
    def __init__(self, name, idnumber):
        [Link] = name
        [Link] = idnumber
    def display(self):
        print([Link])
        print([Link])
def display(self):
print(f"Name: {[Link]}, Age: {[Link]}")
class ChildClass(ParentClass):
# Additional implementation for child class
Creating a Parent Class
# A Python program to demonstrate inheritance
class Person(object):
    # Constructor
    def __init__(self, name, id):
        [Link] = name
        [Link] = id
    # Print the person's details
    def Display(self):
        print([Link], [Link])
# Driver code
emp = Person("Satyam", 102) # An Object of Person
[Link]()
Creating a Child Class
class Emp(Person):
    def Print(self):
        print("Emp class called")
Emp_details = Emp("Mayank", 103)
# calling parent class function
Emp_details.Display()
# Calling child class function
Emp_details.Print()
Example of Inheritance
class Vehicle:
    def __init__(self, brand):
        [Link] = brand
    def display(self):
        print(f"Brand: {[Link]}")
class Car(Vehicle):
    def __init__(self, brand, model):
        super().__init__(brand) # Call parent constructor
        [Link] = model
    def display(self):
        super().display()
        print(f"Model: {[Link]}")
car = Car("Toyota", "Corolla")
[Link]()
# Output:
# Brand: Toyota
# Model: Corolla
Types of Python Inheritance
1. Single Inheritance: A child class inherits from one parent class.
2. Multiple Inheritance: A child class inherits from more than one
parent class.
3. Multilevel Inheritance: A class is derived from a class which is also
derived from another class.
4. Hierarchical Inheritance: Multiple classes inherit from a single
parent class.
5. Hybrid Inheritance: A combination of more than one type of
inheritance.
# 1. Single Inheritance
class Person:
    def __init__(self, name):
        [Link] = name
# 2. Multiple Inheritance
class Job:
    def __init__(self, salary):
        [Link] = salary
# 4. Hierarchical Inheritance
class AssistantManager(EmployeePersonJob): # Inherits from EmployeePersonJob
    def __init__(self, name, salary, team_size):
        EmployeePersonJob.__init__(self, name, salary) # Explicitly initialize EmployeePersonJob
        self.team_size = team_size
# Single Inheritance
emp = Employee("John", 40000)
print([Link], [Link])
# Multiple Inheritance
emp2 = EmployeePersonJob("Alice", 50000)
print([Link], [Link])
# Multilevel Inheritance
mgr = Manager("Bob", 60000, "HR")
print([Link], [Link], [Link])
# Hierarchical Inheritance
asst_mgr = AssistantManager("Charlie", 45000, 10)
print(asst_mgr.name, asst_mgr.salary, asst_mgr.team_size)
# Hybrid Inheritance
sen_mgr = SeniorManager("David", 70000, "Finance", 20)
print(sen_mgr.name, sen_mgr.salary, sen_mgr.department,
sen_mgr.team_size)
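Several of the class definitions used by the driver code above are missing from the notes. The sketch below is one possible complete hierarchy that makes those calls work; the class names come from the notes, while the constructor bodies and attribute names (name, salary, department, team_size) are assumptions inferred from the print statements.

```python
class Person:
    def __init__(self, name):
        self.name = name


class Job:
    def __init__(self, salary):
        self.salary = salary


class Employee(Person):  # single: one parent
    def __init__(self, name, salary):
        Person.__init__(self, name)
        self.salary = salary


class EmployeePersonJob(Person, Job):  # multiple: two parents
    def __init__(self, name, salary):
        Person.__init__(self, name)
        Job.__init__(self, salary)


class Manager(EmployeePersonJob):  # multilevel: child of a child class
    def __init__(self, name, salary, department):
        EmployeePersonJob.__init__(self, name, salary)
        self.department = department


class AssistantManager(EmployeePersonJob):  # hierarchical: second child of the same parent
    def __init__(self, name, salary, team_size):
        EmployeePersonJob.__init__(self, name, salary)
        self.team_size = team_size


class SeniorManager(Manager, AssistantManager):  # hybrid: combines the forms above
    def __init__(self, name, salary, department, team_size):
        Manager.__init__(self, name, salary, department)
        AssistantManager.__init__(self, name, salary, team_size)


sen_mgr = SeniorManager("David", 70000, "Finance", 20)
print(sen_mgr.name, sen_mgr.salary, sen_mgr.department, sen_mgr.team_size)

# The method resolution order shows how Python linearizes the hierarchy
mro_names = [cls.__name__ for cls in SeniorManager.__mro__]
print(mro_names)
```

Calling each parent constructor explicitly keeps the example close to the style of the notes; cooperative super() calls are the other common approach in hierarchies like this one.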
Overloading in Python
• Python does not support method overloading in the traditional sense.
• Instead, it allows a single method to handle multiple scenarios using
default arguments or variable arguments (*args and **kwargs).
• Method Overloading with Default Arguments
class Calculator:
    def add(self, a, b=0, c=0):
        return a + b + c
calc = Calculator()
print([Link](5)) # Output: 5
print([Link](5, 10)) # Output: 15
print([Link](5, 10, 15)) # Output: 30
Operator Overloading
• Operator overloading allows defining custom behavior for operators like +, -,
etc., by overriding special methods such as __add__ and __sub__.
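class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):
        # Called for p1 + p2; returns a new Point with the coordinates added
        return Point(self.x + other.x, self.y + other.y)
    def __str__(self):
        return f"({self.x}, {self.y})"
p1 = Point(1, 2)
p2 = Point(3, 4)
p3 = p1 + p2 # Overloaded '+' operator
print(p3) # Output: (4, 6)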
Unit 3
Data Pre-processing and Data
Wrangling
Introduction
• Data pre-processing and wrangling are essential steps in the data
analytics pipeline.
• These processes involve cleaning, transforming, and organizing raw
data to make it suitable for analysis.
Data Pre-processing
• Data pre-processing is the initial step in preparing raw data
for analysis.
• It includes techniques to handle missing values, detect and
remove outliers, normalize data, and more.
• Steps in Data Pre-processing
1. Loading the Data
2. Handling Missing Values
3. Removing Duplicates
4. Outlier Detection and Removal
5. Data Normalization and Scaling
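To make three of these steps concrete without depending on any external library, here is a small sketch using only the standard library on a made-up list of records. The data, field names, and variable names are all invented for illustration; with pandas the same steps would be done on a DataFrame, and outlier detection is left out to keep the example short.

```python
import statistics

# Hypothetical records: one missing age and one exact duplicate row
rows = [
    {"id": 1, "age": 25, "income": 40000.0},
    {"id": 2, "age": None, "income": 52000.0},
    {"id": 2, "age": None, "income": 52000.0},
    {"id": 3, "age": 41, "income": 61000.0},
]

# Handling missing values: impute the median of the known ages
known_ages = [r["age"] for r in rows if r["age"] is not None]
median_age = statistics.median(known_ages)
filled = [{**r, "age": median_age if r["age"] is None else r["age"]} for r in rows]

# Removing duplicates: keep the first occurrence of each identical row
seen = set()
unique = []
for r in filled:
    key = (r["id"], r["age"], r["income"])
    if key not in seen:
        seen.add(key)
        unique.append(r)

# Normalization: z-scores, using the population standard deviation
incomes = [r["income"] for r in unique]
mean_income = statistics.mean(incomes)
std_income = statistics.pstdev(incomes)
z_scores = [(x - mean_income) / std_income for x in incomes]

print(unique)
print(z_scores)
```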
Steps in Data Pre-processing
• Loading the Data
Use Python libraries like pandas or numpy to load datasets.
import pandas as pd
df = pd.read_csv('[Link]')
print([Link]())
• Handling Missing Values
Missing values can be handled by removing or imputing them.
# Removing rows with missing values
df.dropna(inplace=True)
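Imputing is the alternative to removal mentioned above. A minimal sketch, assuming a numeric column is filled with its mean and a text column with its most frequent value (the column names and data are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25.0, None, 35.0],
    "City": ["Paris", None, "Paris"],
})

# Numeric column: replace missing values with the column mean
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Text column: replace missing values with the most frequent value (mode)
df["City"] = df["City"].fillna(df["City"].mode()[0])
print(df)
```

Imputation keeps every row, at the cost of introducing values that were never observed.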
# Melting
melted = df.melt(id_vars='category', value_vars=['col1', 'col2'])
• Aggregating Data
Summarize data using group-by operations.
aggregated = df.groupby('category')['value'].sum()
print(aggregated)
• Converting Data Types
Ensure data is in the correct format.
df['date'] = pd.to_datetime(df['date'])
df['numeric_column'] = pd.to_numeric(df['numeric_column'])
Example Workflow: Data Pre-processing and Wrangling
import pandas as pd
from sklearn.preprocessing import StandardScaler
# Load the dataset
df = pd.read_csv('[Link]')
# Handle missing values (assign back; in-place fillna on a column selection is deprecated)
df['Age'] = df['Age'].fillna(df['Age'].median())
# Remove duplicates
df.drop_duplicates(inplace=True)
# Normalize a column
scaler = StandardScaler()
df['Normalized_Income'] = scaler.fit_transform(df[['Income']])
# Filter rows based on a condition
filtered_df = df[df['Age'] > 30]
# Group by and aggregate
aggregated = df.groupby('Gender')['Income'].mean()
# Merge with another dataset
df2 = pd.read_csv('additional_data.csv')
merged_df = pd.merge(df, df2, on='ID', how='inner')
# Reshape data
melted_df = df.melt(id_vars='ID', value_vars=['Age', 'Income'], var_name='Metric', value_name='Value')
print(merged_df.head())
Key Python Libraries for Data Pre-processing
and Wrangling
• pandas: Provides DataFrames for data manipulation.
• numpy: Efficient operations on numerical data.
• scikit-learn: Tools for preprocessing, scaling, and encoding.
• matplotlib / seaborn: Visualization for identifying trends and outliers.
Acquiring Data with Python: Loading from CSV files,
Accessing SQL databases.
• Data acquisition is the process of gathering and importing data for
analysis.
• Python provides powerful tools and libraries to load data from CSV
files and access SQL databases efficiently.
• Loading Data from CSV Files
• CSV (Comma-Separated Values) is a common format for storing tabular data.
Python's pandas library is widely used to load and manipulate CSV files.
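A minimal sketch of both acquisition routes, assuming an in-memory CSV string and an in-memory SQLite database so that it runs without any files. The table name, column names, and values are hypothetical.

```python
import io
import sqlite3

import pandas as pd

# CSV: read_csv accepts a file path or any file-like object
csv_text = "id,name,score\n1,Alice,50\n2,Bob,200\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# SQL: write the same rows to SQLite, then query them back into a DataFrame
conn = sqlite3.connect(":memory:")
df_csv.to_sql("scores", conn, index=False)
df_sql = pd.read_sql_query(
    "SELECT name, score FROM scores WHERE score > 100", conn
)
conn.close()

print(df_csv)
print(df_sql)
```

For a file on disk, the StringIO object would be replaced by the file path; for other database engines, pandas generally expects a SQLAlchemy connection rather than a raw driver connection.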
Loading Data from a CSV File Example
data = {'Name': [' Alice ', ' Bob ', ' Charlie ']}
df = pd.DataFrame(data)
df['Name'] = df['Name'].str.strip()
print(df)
# Output:
# Name
# 0 Alice
# 1 Bob
# 2 Charlie
Dropping Irrelevant Columns
• Remove unnecessary columns from a DataFrame.
df = df.drop(columns=['IrrelevantColumn'])  # 'columns' already implies axis=1; passing both raises an error
• Removing Rows with Noise
Filter rows that contain irrelevant or noisy data.
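A minimal sketch of row filtering, assuming that "noise" means a placeholder name or an impossible negative score. Both rules and the sample data are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "N/A", "Charlie"],
    "Score": [50, 200, -1, 150],
})

# Keep rows whose name is not a placeholder and whose score is non-negative
is_valid = (df["Name"] != "N/A") & (df["Score"] >= 0)
clean = df[is_valid].reset_index(drop=True)
print(clean)
```

Dropping reset_index would preserve the original row labels, which can be useful for tracing which rows were removed.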
scaler = MinMaxScaler()
df['Normalized_Score'] = scaler.fit_transform(df[['Score']])
print(df)
# Output:
# Score Normalized_Score
# 0 50 0.00
# 1 200 1.00
# 2 150 0.67
Standardizing Case in Text Data
• Ensure uniform case (lowercase or uppercase) for consistency.
df['Name'] = df['Name'].str.lower()
print(df)
# Output:
# Name
# 0 alice
# 1 bob
# 2 charlie
Encoding Categorical Variables
• Convert categories to numeric values.
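A minimal sketch using pandas only, assuming a text column named Category (hypothetical). It shows integer codes and one-hot indicator columns.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Category": ["A", "B", "B"],
})

# Integer codes: categories are sorted, so A -> 0 and B -> 1
df["Category_Code"] = df["Category"].astype("category").cat.codes

# One indicator column per category
dummies = pd.get_dummies(df["Category"], prefix="Category")
print(df)
print(dummies)
```

Integer codes imply an ordering that may not exist in the data; indicator columns avoid that at the cost of more columns.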
df['Date'] = pd.to_datetime(df['Date'])
df['Amount'] = pd.to_numeric(df['Amount'])
Formatting Dates
Convert dates into a standard format.
df['Formatted_Date'] = df['Date'].dt.strftime('%Y-%m-%d')
Padding or Trimming Values
Ensure fixed-length values, such as adding leading zeros.
df['ID'] = df['ID'].astype(str).str.zfill(5) # Pad ID to 5 digits (cast to string first)
print(df)
Output
Name Score Date Category Normalized_Score
0 Alice 50.0 2024-12-01 1 0.000000
1 Bob 200.0 2024-12-02 2 1.000000
3 Charlie 150.0 2024-12-03 2 0.666667
Key Python Libraries for Data Cleansing
• pandas: For data manipulation and cleaning.
• numpy: For handling numerical data.
• scikit-learn: For normalization and scaling.
Combining and Merging Data Sets in Python
• Combining and merging data sets is a critical step in data analysis to
consolidate data from multiple sources into a unified structure.
• Python's pandas library provides powerful tools for these operations.
1. Combining Data Sets
• Combining data sets refers to stacking data either vertically (adding
rows) or horizontally (adding columns).
• Concatenation
• Concatenation is used to append data frames either row-wise or
column-wise.
• Syntax
• pd.concat([df1, df2], axis=0) # Vertical stacking (default axis=0)
• pd.concat([df1, df2], axis=1) # Horizontal stacking
Example (Vertical Stacking):
import pandas as pd
ID Name Age
0 1 Alice 25
1 2 Bob 30
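A minimal sketch of both directions of concatenation. The two one-row frames are an assumption chosen so that the vertical result matches the output shown above.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1], "Name": ["Alice"], "Age": [25]})
df2 = pd.DataFrame({"ID": [2], "Name": ["Bob"], "Age": [30]})

# Vertical stacking: rows of df2 are placed under df1, index renumbered
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

# Horizontal stacking: columns are placed side by side, aligned on the index
side_by_side = pd.concat([df1[["ID"]], df1[["Name", "Age"]]], axis=1)

print(stacked)
print(side_by_side)
```

Without ignore_index=True the stacked frame keeps the original labels, so both rows would carry index 0.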
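Appending Data
• Appending adds the rows of one DataFrame to the end of another; it is vertical concatenation.
• DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so current code uses pd.concat() instead.
• Syntax
pd.concat([df1, df2], ignore_index=True)
Legacy syntax (pandas before 2.0): df1.append(df2, ignore_index=True)
• Example
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [3], 'Name': ['Charlie']})
appended_df = pd.concat([df1, df2], ignore_index=True)
print(appended_df)
Output
 ID Name
0 1 Alice
1 2 Bob
2 3 Charlie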
2. Merging Data Sets
• Merging combines datasets based on common keys or columns. This
is equivalent to SQL-style joins (inner, outer, left, and right joins).
• Syntax
• pd.merge(df1, df2, on='common_column', how='join_type')
• Join Types:
• Inner Join (default): Includes rows with matching keys in both DataFrames.
• Left Join: Includes all rows from the left DataFrame and matching rows from
the right.
• Right Join: Includes all rows from the right DataFrame and matching rows
from the left.
• Outer Join: Includes all rows from both DataFrames, filling missing values with
NaN.
Example 1: Inner Join
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 3], 'Age': [25, 35]})
merged_df = pd.merge(df1, df2, on='ID', how='inner')
print(merged_df)
Output
ID Name Age
0 1 Alice 25
Example 2: Left Join
merged_df = pd.merge(df1, df2, on='ID', how='left')
print(merged_df)
Output
ID Name Age
0 1 Alice 25.0
1 2 Bob NaN
Example 3: Outer Join
merged_df = pd.merge(df1, df2, on='ID', how='outer')
print(merged_df)
Output
ID Name Age
0 1 Alice 25.0
1 2 Bob NaN
2 3 NaN 35.0
Merging on Multiple Keys
• Merging can also be done on multiple columns.
df1 = pd.DataFrame({'ID': [1, 2], 'Dept': ['HR', 'Finance'], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 2], 'Dept': ['HR', 'Finance'], 'Salary': [5000, 6000]})
merged_df = pd.merge(df1, df2, on=['ID', 'Dept'])
print(merged_df)
Output
ID Dept Name Salary
0 1 HR Alice 5000
1 2 Finance Bob 6000
3. Practical Workflow: Combining and Merging Example
import pandas as pd
# DataFrames
sales_2023 = pd.DataFrame({'Month': ['Jan', 'Feb'], 'Sales': [1000, 1500]})
sales_2024 = pd.DataFrame({'Month': ['Mar', 'Apr'], 'Sales': [2000, 2500]})
products = pd.DataFrame({'ProductID': [1, 2], 'Product': ['A', 'B']})
sales_details = pd.DataFrame({'ProductID': [1, 2], 'Sales': [3000, 4000]})
# Combine sales data for 2023 and 2024
combined_sales = pd.concat([sales_2023, sales_2024], axis=0)
# Merge products with sales details
merged_data = pd.merge(products, sales_details, on='ProductID', how='inner')
print("Combined Sales:")
print(combined_sales)
print("\nMerged Data:")
print(merged_data)
Output
Combined Sales:
 Month Sales
0 Jan 1000
1 Feb 1500
0 Mar 2000
1 Apr 2500
Merged Data:
 ProductID Product Sales
0 1 A 3000
1 2 B 4000
Key Takeaways
• concat: For stacking DataFrames vertically or horizontally.
• append: For appending rows to an existing DataFrame (removed in pandas 2.0; use concat with ignore_index=True instead).
• merge: For combining datasets based on keys using SQL-like joins.
• Flexibility: Use these methods to handle various data sources and
formats seamlessly.
Reshaping and Pivoting in Python
• Reshaping and pivoting are key data manipulation techniques used to
reorganize data for better analysis and visualization.
• Python's pandas library provides powerful tools for these operations,
including melt(), pivot(), pivot_table(), and others.
Reshaping Data
• Reshaping involves changing the structure of the dataset, such as
transforming data from wide to long format or vice versa.
• 1.1 Melting
• Melting transforms a dataset from a wide format (columns as variables) to a
long format (rows as variables).
• Syntax
• pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name=None)
id_vars: Columns to keep unchanged.
value_vars: Columns to unpivot.
var_name: Name for the new "variable" column.
value_name: Name for the new "value" column.
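A minimal sketch of the parameters described above, assuming a small wide table with one column per subject (the data are hypothetical):

```python
import pandas as pd

# Wide format: one row per student, one column per subject
wide = pd.DataFrame({"ID": [1, 2], "Math": [90, 80], "Science": [95, 85]})

# Long format: one row per (student, subject) pair
long = pd.melt(
    wide,
    id_vars="ID",
    value_vars=["Math", "Science"],
    var_name="Subject",
    value_name="Score",
)
print(long)
```

Each column named in value_vars contributes one block of rows, so two students and two subjects give four rows.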
Example:
import pandas as pd
data = {
    'ID': [1, 1, 2, 2],
    'Subject': ['Math', 'Math', 'Science', 'Science'],
    'Score': [90, 95, 80, 85]
}
df = pd.DataFrame(data)
Output
Subject Math Science
ID
1 95 NaN
2 NaN 85
Example
data = {
    'Region': ['North', 'North', 'South', 'South'],
    'Year': [2021, 2022, 2021, 2022],
    'Sales': [200, 250, 300, 400]
}
df = pd.DataFrame(data)
Output
Year 2021 2022
Region
North 200 250
South 300 400
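The pivoting calls themselves do not appear in the examples above. A minimal sketch that is consistent with the outputs shown, assuming pivot() for the sales table and pivot_table() with aggfunc="max" for the scores table (the choice of max is an inference from the 95 and 85 in the output, not something the notes state):

```python
import pandas as pd

sales = pd.DataFrame({
    "Region": ["North", "North", "South", "South"],
    "Year": [2021, 2022, 2021, 2022],
    "Sales": [200, 250, 300, 400],
})
# pivot: each (Region, Year) pair must be unique
by_region = sales.pivot(index="Region", columns="Year", values="Sales")

scores = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Subject": ["Math", "Math", "Science", "Science"],
    "Score": [90, 95, 80, 85],
})
# pivot_table: duplicate (ID, Subject) pairs are aggregated, here with max
best = scores.pivot_table(index="ID", columns="Subject", values="Score", aggfunc="max")

print(by_region)
print(best)
```

pivot() raises an error when an index/column pair repeats, which is why the scores table needs pivot_table() and an aggregation function.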
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler
# Sample data
data = {'Age': [25, 35, 45], 'Salary': [50000, 70000, 100000]}
df = pd.DataFrame(data)
# Min-Max Scaling: (value - min) / (max - min)
scaler = MinMaxScaler()
df['Scaled_Salary'] = scaler.fit_transform(df[['Salary']])
# Standardization: (value - mean) / standard deviation
std_scaler = StandardScaler()
df['Normalized_Age'] = std_scaler.fit_transform(df[['Age']])
print(df)
Output
 Age Salary Scaled_Salary Normalized_Age
0 25 50000 0.0 -1.224745
1 35 70000 0.4 0.000000
2 45 100000 1.0 1.224745
2.2 Encoding Categorical Variables
• Categorical variables need to be converted into numeric format for
machine learning models.
• One-Hot Encoding: Creates binary columns for each category.
• Label Encoding: Assigns integer values to categories.
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
# Data
df = pd.DataFrame({'City': ['New York', 'Paris', 'London']})
# Both encoders sort categories alphabetically: London, New York, Paris
# One-Hot Encoding
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(df[['City']]).toarray()
print("One-Hot Encoded:")
print(one_hot)
# Label Encoding
label_encoder = LabelEncoder()
df['City_Label'] = label_encoder.fit_transform(df['City'])
print("Label Encoded:")
print(df)
Output
One-Hot Encoded:
[[0. 1. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]
Label Encoded:
 City City_Label
0 New York 1
1 Paris 2
2 London 0
3. Practical Workflow for Data Transformation
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
# Sample data
data = {
    'ID': [1, 2, 3],
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 35, 45],
    'Salary': [50000, 70000, 100000],
    'City': ['New York', 'Paris', 'London']
}
df = pd.DataFrame(data)
# Step 1: Scaling numerical columns
scaler = MinMaxScaler()
df['Scaled_Salary'] = scaler.fit_transform(df[['Salary']])
# Step 2: Encoding categorical column
label_encoder = LabelEncoder()
df['City_Code'] = label_encoder.fit_transform(df['City'])
# Step 3: Feature extraction
df['Age_Group'] = pd.cut(df['Age'], bins=[0, 30, 50], labels=['Young', 'Middle-Aged'])
print(df)
Output
 ID Name Age Salary City Scaled_Salary City_Code Age_Group
0 1 Alice 25 50000 New York 0.0 1 Young
1 2 Bob 35 70000 Paris 0.4 2 Middle-Aged
2 3 Charlie 45 100000 London 1.0 0 Middle-Aged
Key Libraries for Data Transformation
• pandas: For data manipulation and transformation.
• numpy: For numerical transformations.
• scikit-learn: For scaling, encoding, and feature engineering.
• Data transformation is a fundamental step in the data preparation
pipeline. By effectively applying these techniques, you can improve
the quality, consistency, and analytical utility of your datasets.
String Manipulation and Regular Expressions in
Python
• String manipulation and regular expressions (regex) are essential for
working with text data.
• Python provides built-in string methods and the re module for
efficient string processing.
2.1 Basic Functions in re
• re.search(pattern, string)
• Searches for the first occurrence of the pattern in the string.
import re
result = re.search(r"\d+", "The year is 2024")
print(result.group()) # 2024
• re.match(pattern, string)
• Matches the pattern only at the beginning of the string.
result = re.match(r"The", "The year is 2024")
print(result.group()) # The
• re.findall(pattern, string)
• Returns all occurrences of the pattern in the string.
result = re.findall(r"\d+", "There are 3 cats and 4 dogs.")
print(result) # ['3', '4']
• re.sub(pattern, replacement, string)
• Replaces occurrences of the pattern with the replacement string.
result = re.sub(r"\d+", "X", "There are 3 cats and 4 dogs.")
print(result) # There are X cats and X dogs.
• re.split(pattern, string)
• Splits the string based on the pattern.
result = re.split(r"\s", "Split this string")
print(result) # ['Split', 'this', 'string']
2.2 Regex Metacharacters and Special Sequences
Pattern   Description                                                   Example
.         Matches any character except newline.                         re.search(r"c.t", "cat")
^         Matches the start of a string.                                re.match(r"^The", "The cat")
$         Matches the end of a string.                                  re.search(r"end$", "the end")
\d        Matches any digit (0-9).                                      re.findall(r"\d", "A1B2C3")
\w        Matches any word character (letters, digits, underscore).     re.findall(r"\w", "Hello_123")
\s        Matches any whitespace character.                             re.findall(r"\s", "a b c")
*         Matches 0 or more occurrences.                                re.findall(r"ab*", "abc abbb")
+         Matches 1 or more occurrences.                                re.findall(r"ab+", "abc abbb")
?         Matches 0 or 1 occurrence.                                    re.findall(r"ab?", "abc ab")
{n}       Matches exactly n occurrences.                                re.findall(r"a{2}", "caaat")
[abc]     Matches any one of a, b, or c.                                re.findall(r"[aeiou]", "cat")
|         Matches either pattern.
2.3 Using Regex for Validation
Example: Validate Email Address
email = "example@[Link]"
pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
if re.match(pattern, email):
print("Valid email")
else:
print("Invalid email")
• Example: Validate Phone Number
phone = "123-456-7890"
pattern = r"^\d{3}-\d{3}-\d{4}$"
if re.match(pattern, phone):
print("Valid phone number")
else:
print("Invalid phone number")
2.4 Practical Example: Extracting Data
• Extract Hashtags from Text:
text = "Learn #Python and #Regex in 2024!"
hashtags = re.findall(r"#\w+", text)
print(hashtags) # ['#Python', '#Regex']
• Extract URLs from Text:
text = "Visit [Link] or [Link]"
urls = re.findall(r"https?://[^\s]+", text)
print(urls) # ['[Link]', '[Link]']
3. Combining String Manipulation and Regex
• Example: Process a Log File
import re
Output
Date: 2024-12-01, Error: Failed to connect to server
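The code for this example is not shown above. A minimal sketch that produces the same output line, assuming each log entry has the form "date time LEVEL message" (the log format and the sample line are hypothetical):

```python
import re

# Hypothetical log line; real log formats vary
log_line = "  2024-12-01 10:15:32 ERROR Failed to connect to server\n"

# String manipulation first: trim surrounding whitespace
cleaned = log_line.strip()

# Regex second: capture the date and the text after the ERROR level
pattern = r"^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} ERROR (.+)$"
match = re.search(pattern, cleaned)

summary = None
if match:
    date, message = match.groups()
    summary = f"Date: {date}, Error: {message}"
    print(summary)
```

Lines at other levels (INFO, WARNING) do not match the pattern, so summary stays None for them.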
Conclusion
• String Manipulation: Built-in Python methods handle tasks like
splitting, joining, formatting, and cleaning strings.
• Regular Expressions: The re module is a powerful tool for pattern
matching and text extraction.
• Combining these techniques allows for robust text processing in data
analysis, validation, and parsing workflows.
Unit 4
Web Scraping And Numerical Analysis
What is Web Scraping?
• Web scraping, also called web data extraction, is an automated
process of collecting publicly available web data from targeted
websites.
• Instead of gathering data manually, web scraping software can be
used to acquire a vast amount of information automatically, making
the process much faster.
Why is Web Scraping Important?
• Some websites can contain a very large amount of invaluable data such as
stock prices, product details, sports stats, you name it.
• If you want to access this information, you either have to use whatever format
the website uses or copy and paste the information manually into a new
document.
• This can be pretty tedious when you want to extract a lot of information from
a website and here is where web scraping can help.
• Instead of scraping this data manually, in most cases, software tools called web
scrapers are preferred because they are less expensive compared to human
labor and they work at a faster rate.
• Web scrapers can run on your PC or in a data center.
How Does Web Scraping Work?
• Step 1: Retrieving content from a website
Web Scraping
• Web scraping is the process of extracting structured data from
websites for analysis, automation, or integration.
• For example, you might scrape product details from an e-commerce
website or collect weather data from a forecasting site.
1. Concepts in Web Scraping
1. HTTP Requests
• Web scraping relies on the HTTP protocol to fetch webpages.
• Key request types include:
• GET: Retrieves data (e.g., loading a webpage).
• POST: Sends data to the server (e.g., form submission).
2. HTML Structure
• Websites are built using HTML. Scraping involves parsing this structure to extract the
desired elements like tags (<h1>, <p>, <table>), attributes, or classes.
3. Ethical Considerations
• Always follow a website's Terms of Service and check for a robots.txt file, which
specifies the allowed scraping practices.
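The robots.txt check can be sketched with the standard library's urllib.robotparser. The rules, the user-agent name, and the example.com URLs below are hypothetical; the rules are parsed from a list of lines so that the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would download the site's file
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(robots_lines)

allowed_public = parser.can_fetch("my-scraper", "https://example.com/products")
allowed_private = parser.can_fetch("my-scraper", "https://example.com/private/data")
print(allowed_public)   # True
print(allowed_private)  # False
```

For a live site, set_url() followed by read() fetches and parses the file in place of parse(). Passing this check does not replace reading the site's Terms of Service.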
Fetching Web Pages
• Fetching a webpage is the first step in scraping.
• This involves sending an HTTP GET request to the URL and retrieving
the server's response.
• Python’s requests library simplifies HTTP communication.
• The server's response includes metadata (headers) and the content
(HTML).
• This content can then be parsed and processed.
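A minimal sketch of the parsing step using the standard library's html.parser. The HTML is a hypothetical literal so that the sketch runs offline; in practice it would come from the text of a response such as requests.get(url).text.

```python
from html.parser import HTMLParser


class HeadingParser(HTMLParser):
    """Collects the text inside every <h1> element."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.headings.append(data.strip())


# Hypothetical page content standing in for a fetched response body
html = "<html><body><h1>Products</h1><p>Laptop</p><h1>Prices</h1></body></html>"

parser = HeadingParser()
parser.feed(html)
print(parser.headings)  # ['Products', 'Prices']
```

Third-party parsers are often more convenient for real pages; the standard-library parser is used here only to keep the sketch dependency-free.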
Example Program: Fetching a Web Page
• import requests
# 1D Array
arr = np.array([1, 2, 3, 4, 5])
print("1D Array:", arr)
# 2D Array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", arr_2d)
Array Operations
• NumPy enables element-wise operations.
• For example:
• Adding 10 to each element: arr + 10
• Multiplying each element by 2: arr * 2
Example Program: Array Operations
import numpy as np
# Element-wise addition
print("Add 10 to each element:", arr + 10)
# Element-wise multiplication
print("Multiply each element by 2:", arr * 2)
4. Statistical Analysis
• NumPy provides built-in functions for statistical analysis, including
mean, median, standard deviation, and variance.
• Mean: Average of elements.
• Standard Deviation: Measures the dispersion of data points.
Example Program: Statistics with NumPy
import numpy as np
# Standard Deviation
print("Standard Deviation:", np.std(arr))
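A minimal sketch covering the four statistics named above, assuming the same five-element array used in the earlier array examples:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mean = np.mean(arr)      # average of the elements
median = np.median(arr)  # middle value of the sorted elements
variance = np.var(arr)   # mean squared deviation from the mean
std_dev = np.std(arr)    # square root of the variance

print("Mean:", mean)
print("Median:", median)
print("Variance:", variance)
print("Standard Deviation:", std_dev)
```

np.var and np.std compute the population statistics by default (ddof=0); passing ddof=1 gives the sample versions.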
5. Matrix Operations
• Matrices are 2D arrays used for linear algebra computations.
• Dot Product: Computes the product of two matrices.
• Transpose: Swaps rows and columns.
Example Program: Matrix Operations
import numpy as np
# Define matrices
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
# Matrix multiplication
result = np.dot(matrix_a, matrix_b)
print("Matrix Multiplication:\n", result)
# Transpose
print("Transpose of matrix_a:\n", matrix_a.T)
Unit 5
Data Visualization
• Data visualization transforms raw data into graphical representations,
enabling easier interpretation and insights.
• Python's libraries like NumPy, Matplotlib, Seaborn, and Pandas provide
powerful tools for creating meaningful visualizations.
• Data visualization presents data, however complex, in pictorial form so that
trends and relationships among variables are easier to analyze.
• The following are the advantages of Data Visualization
• Easier representation of complex data
• Highlights good and bad performing areas
• Explores relationships between data points
• Identifies data patterns even in large datasets
Best Practices to be followed during Data
Visualization
• Use shapes, colors, and sizes appropriately when building a
visualization
• Choose a plot type that suits the data type; this brings more
clarity to the information
# Plot
plt.plot(x, y, label="Sine Wave")
plt.title("Line Plot of NumPy Array")
plt.xlabel("X values")
plt.ylabel("Y values")
plt.legend()
plt.show()
2.2 Scatter Plot with NumPy Data
• Scatter plots visualize relationships between two variables.
• Example: Scatter Plot
# Random data using NumPy
x = np.random.rand(50) # 50 random values between 0 and 1
y = np.random.rand(50)
# Scatter plot
plt.scatter(x, y, color='blue', alpha=0.7)
plt.title("Scatter Plot of NumPy Array")
plt.xlabel("X values")
plt.ylabel("Y values")
plt.show()
3. Visualizing Array Transformations
• You can perform mathematical operations on NumPy arrays and
visualize the results.
• Example: Multiple Trigonometric Functions
# Create x values
x = np.linspace(0, 10, 100)
# Compute y values
y1 = np.sin(x)
y2 = np.cos(x)
# Plot both functions
plt.plot(x, y1, label="Sine")
plt.plot(x, y2, label="Cosine", linestyle="--")
plt.title("Sine and Cosine Functions")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()
4. Statistical Visualization with NumPy Arrays
• NumPy provides functions for statistical analysis, such as mean, std,
and percentile, which are useful for creating visualizations.
• Example: Histogram of Random Data
# Generate random data
data = np.random.randn(1000) # 1000 random values from a normal distribution
# Plot histogram
plt.hist(data, bins=30, color='green', edgecolor='black')
plt.title("Histogram of Random Data")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
5. Advanced Visualizations with NumPy Arrays
• Bar Charts with Aggregated Data
• Bar charts are useful for comparing categorical data.
# Data
x = np.linspace(0, 10, 100) # 100 points between 0 and 10
y = np.sin(x)
# Plot
plt.plot(x, y, label="Sine Wave")
plt.title("Line Plot Example")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()
1.3 Controlling Graph Appearance
• Matplotlib allows control over graph styles, line properties, and
marker attributes.
• Example: Customizing Line Style and Markers
plt.plot(x, y, color='red', linestyle='--', linewidth=2, marker='o')
plt.title("Customized Line Plot")
plt.show()
1.4 Adding Text to Graphs
• You can annotate plots by adding titles, labels, and custom text.
• Example: Adding Annotations
plt.plot(x, y)
plt.title("Adding Annotations")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.text(5, 0, "Midpoint", fontsize=12, color='blue')
plt.show()
1.5 More Graph Types
• Bar Chart
• Bar charts are used to represent categorical data.
categories = ['A', 'B', 'C']
values = [10, 15, 7]
plt.scatter(x, y, color='orange')
plt.title("Scatter Plot Example")
plt.show()
1.6 Patches
• Patches in Matplotlib allow you to add shapes like circles, rectangles,
and polygons to plots.
• Example: Adding a Rectangle
from matplotlib.patches import Rectangle
fig, ax = plt.subplots()
ax.add_patch(Rectangle((0.1, 0.2), 0.5, 0.3, color='cyan'))
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.title("Rectangle Patch Example")
plt.show()
2. Advanced Data Visualization with Seaborn
• Seaborn is a higher-level Python library built on top of Matplotlib.
• It offers a more aesthetic interface and simplifies the creation of
attractive, informative statistical graphics.
Key Features of Seaborn
• High-Level Interface: Seaborn provides a more convenient and user-
friendly interface compared to Matplotlib, enabling users to create
complex visualizations with fewer lines of code.
• Built-in Themes and Color Palettes: The library includes aesthetically
pleasing default styles and color palettes, which can be customized to
enhance visual appeal.
• Integration with Pandas: Seamless integration with Pandas
DataFrames allows for easy manipulation and visualization of
structured datasets.
Advanced Visualization Techniques
1. Pair Plots
• Pair plots are an effective way to visualize relationships between multiple
variables in a dataset.
• They create a matrix of scatter plots for each pair of variables, allowing
quick identification of correlations.
import seaborn as sns
import pandas as pd
# Create heatmap
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Heatmap of Correlations in Iris Dataset')
plt.show()
3. Facet Grids
• Facet grids allow the creation of a grid of plots based on subsets of your
dataset.
• This technique is useful for visualizing the distribution of data across different
categories.
g = sns.FacetGrid(iris, col='species')
g.map(sns.histplot, 'sepal_length')
plt.show()
• Statistical Visualization
• Seaborn simplifies the process of performing and visualizing statistical
analyses:
• Regression Plots: Use regplot() or lmplot() to visualize linear relationships between
variables along with regression lines.
sns.regplot(x='sepal_length', y='sepal_width', data=iris)
plt.title('Regression Plot of Sepal Length vs Width')
plt.show()
• Box Plots and Violin Plots: These plots provide insights into the distribution
and frequency of data points across different categories.
Customization in Seaborn
• Customization Options
• Seaborn offers extensive customization options to enhance the aesthetics and
clarity of visualizations:
• Custom Color Palettes: Users can define custom color palettes using
sns.set_palette().
• Style Settings: Adjust overall styles using sns.set_style() for a polished look.
• You can enhance Seaborn plots by adding themes and color palettes.
• Example: Using Themes
sns.set_theme(style="darkgrid")
sns.scatterplot(x=x, y=y, color="red")
plt.title("Scatter Plot with Seaborn Theme")
plt.show()
3. Time Series Analysis with Pandas
• Time series analysis involves statistical techniques to analyze time-
ordered data points, often collected at regular intervals.
• The Pandas library in Python provides robust tools for manipulating
and analyzing time series data, making it a popular choice for data
scientists and analysts.
• Time series analysis is used to examine data points collected over
time intervals. Pandas makes time series analysis easier with its
datetime capabilities.
Creating Time Series Data
• To create a time series in Pandas, you can use the pd.date_range()
function to generate a range of dates and then create a Series or
DataFrame.
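A minimal sketch of the approach described above, assuming five daily observations starting on 1 January 2024 (the values and the series name are hypothetical):

```python
import pandas as pd

# Five consecutive daily timestamps
dates = pd.date_range(start="2024-01-01", periods=5, freq="D")

# A Series indexed by those dates
ts = pd.Series([10, 12, 11, 15, 14], index=dates, name="sales")

# Date strings can be used to slice a datetime index (both ends inclusive)
window = ts.loc["2024-01-02":"2024-01-03"]

# Two-day rolling mean; the first value is NaN because the window is incomplete
rolling_mean = ts.rolling(window=2).mean()

print(ts)
print(window)
print(rolling_mean)
```

Because the index is a DatetimeIndex, operations such as slicing by date and rolling windows work without any manual date parsing.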
import pandas as pd
# Data
x = np.linspace(0, 10, 100)
y = np.sin(x)
noise = y + np.random.normal(scale=0.2, size=100)
# Seaborn plot
sns.lineplot(x=x, y=noise, label="Noisy Sine", color="blue")