Python Complete Sprint DataScience AI

The document is a comprehensive guide to Python programming for Data Science and AI, covering topics from variables and data types to conditionals and loops. It includes explanations, syntax, examples, and outputs for various Python concepts such as strings, operators, and input/output functions. The guide is designed to be accessible with simple English and real-world examples.

Uploaded by

g.wagon2005.pspk
Copyright
© All Rights Reserved

🐍 Python Complete Sprint

for Data Science & AI


A complete recall guide — from basics to pandas, OOP to ML

Simple English • Real Examples • Everything Included


SECTION 1: Variables & Data Types

1.1 What is a Variable?


Think of a variable like a labelled box. You put something inside the box (a number, a word, etc.) and
you give the box a name so you can find it later.
Why do we use variables? So we can store information and use it again and again without retyping it.
It makes our code clean and easy to change.
Real world example: A hospital stores a patient's name, age, and blood group in variables. When the
doctor needs this info, the program just reads the variable.

Syntax
📝 Code:
# variable_name = value
name = 'Arjun' # stores text
age = 10 # stores a number
height = 4.5 # stores a decimal
is_student = True # stores True/False

print(name)
print(age)
print(height)
print(is_student)
💻 Output:
Arjun
10
4.5
True

Rules for naming variables


• Must start with a letter or underscore (_), not a number
• Can contain letters, numbers, underscores
• Cannot use Python keywords like if, for, while, class
• Python is case-sensitive: age ≠ Age ≠ AGE
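The naming rules above can also be checked by Python itself. This is a minimal sketch using the built-in `str.isidentifier()` method and the standard `keyword` module; the helper name `is_valid_name` is made up for this example.

```python
import keyword

def is_valid_name(name):
    """Return True if `name` can be used as a variable name."""
    # isidentifier() checks the letter / underscore / digit rules.
    # keyword.iskeyword() rejects reserved words such as 'for' or 'class'.
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_name('age'))      # True
print(is_valid_name('_count2'))  # True
print(is_valid_name('2fast'))    # False (starts with a number)
print(is_valid_name('class'))    # False (Python keyword)
```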

1.2 Data Types


Every value in Python has a type. Like in real life — a word is different from a number, and Python
needs to know what kind of data it is dealing with.
Data Type What it stores / Example
int Whole numbers: age = 25, count = 100
float Decimal numbers: price = 99.99, pi = 3.14
str Text/words: name = 'Riya', city = 'Hyderabad'
bool True or False only: is_alive = True
list Multiple items in order: [1, 2, 3]
tuple Like list but cannot change: (10, 20, 30)
dict Key-value pairs: {'name': 'Ram', 'age': 12}
set Unique items, no duplicates: {1, 2, 3}
NoneType No value / empty: result = None

How to check the type of a variable?


📝 Code:
x = 42
y = 3.14
z = 'hello'
a = True

print(type(x)) # int
print(type(y)) # float
print(type(z)) # str
print(type(a)) # bool
💻 Output:
<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>

1.3 Type Conversion (Casting)


Sometimes you want to change the type. Like turning the text '10' into the number 10 so you can do
math with it.
📝 Code:
# int() — convert to integer
x = int('42') # '42' (string) → 42 (int)
y = int(3.9) # 3.9 (float) → 3 (truncates!)

# float(): convert to float
a = float('3.14') # '3.14' (string) → 3.14
b = float(5) # 5 (int) → 5.0

# str(): convert to string
s = str(100) # 100 → '100'
s2 = str(3.14) # 3.14 → '3.14'

# bool(): convert to boolean
print(bool(0)) # False (0 is always False)
print(bool(1)) # True
print(bool('')) # False (empty string is False)
print(bool('hi')) # True

# list(), tuple(), set()
print(list('abc')) # ['a', 'b', 'c']
print(tuple([1,2])) # (1, 2)
print(set([1,1,2])) # {1, 2} — removes duplicates!
💻 Output:
False
True
False
True
['a', 'b', 'c']
(1, 2)
{1, 2}
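Conversion can fail when the text does not look like the target type. The sketch below shows one possible way to handle that; the helper `to_int` and its `default` parameter are assumptions for illustration, not part of Python.

```python
def to_int(text, default=None):
    """Try to convert text to an int; return `default` if it fails."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for text like 'abc' or '3.9'
        return default

print(to_int('42'))        # 42
print(to_int('3.9'))       # None, int() does not accept decimal text
print(to_int('abc', 0))    # 0
print(int(float('3.9')))   # 3, go through float first, then truncate
```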
SECTION 2: Strings

2.1 What is a String?


A string is any text — words, sentences, or even a single letter. In Python, we put text inside single
quotes '' or double quotes "".
Real world use: Storing names, addresses, messages, passwords, URLs, file names.

Creating Strings
📝 Code:
name = 'Priya'
city = "Hyderabad"
multi = '''This is
a multi-line
string.'''
empty = ''

print(name, city)
print(len(name)) # length of string
💻 Output:
Priya Hyderabad
5

2.2 String Operations


📝 Code:
a = 'Hello'
b = 'World'

# Concatenation (joining)
print(a + ' ' + b) # Hello World

# Repetition
print(a * 3) # HelloHelloHello

# Indexing (getting one character)
print(a[0]) # H (first character)
print(a[-1]) # o (last character)

# Slicing (getting part of a string)
print(a[1:4]) # ell
print(a[:3]) # Hel
print(a[2:]) # llo
print(a[::-1]) # olleH (reversed!)
💻 Output:
Hello World
HelloHelloHello
H
o
ell
Hel
llo
olleH

2.3 String Methods (Built-in Functions for Strings)


Python has tons of ready-made tools (methods) to work with strings. Here are all the important ones:

Method What it does + Example


upper() 'hello'.upper() → 'HELLO'
lower() 'HELLO'.lower() → 'hello'
title() 'hello world'.title() → 'Hello World'
strip() ' hi '.strip() → 'hi' (removes spaces)
lstrip() ' hi'.lstrip() → 'hi' (left side)
rstrip() 'hi '.rstrip() → 'hi' (right side)
replace(old,new) 'cat'.replace('c','b') → 'bat'
split(sep) 'a,b,c'.split(',') → ['a','b','c']
join(list) ','.join(['a','b']) → 'a,b'
find(x) 'hello'.find('l') → 2 (index of first match)
count(x) 'hello'.count('l') → 2
startswith(x) 'hello'.startswith('he') → True
endswith(x) 'hello'.endswith('lo') → True
isdigit() '123'.isdigit() → True
isalpha() 'abc'.isalpha() → True
isalnum() 'abc123'.isalnum() → True
isupper() 'ABC'.isupper() → True
islower() 'abc'.islower() → True
center(n) 'hi'.center(10) → '    hi    '
ljust(n) 'hi'.ljust(10) → 'hi        '
rjust(n) 'hi'.rjust(10) → '        hi'
zfill(n) '42'.zfill(5) → '00042'
format() '{} is {}'.format('sky','blue') → 'sky is blue'
encode() 'hello'.encode() → b'hello'
in keyword 'el' in 'hello' → True

2.4 f-Strings (The modern way to format)


f-strings let you put variables directly inside a string. Just add f before the quote and use {} to insert
variables.
📝 Code:
name = 'Ravi'
age = 14
score = 95.678

print(f'My name is {name} and I am {age} years old.')
print(f'Score: {score:.2f}') # 2 decimal places
print(f'2 + 2 = {2 + 2}') # math inside {}
print(f'{name.upper()} rocks!') # method call inside {}
💻 Output:
My name is Ravi and I am 14 years old.
Score: 95.68
2 + 2 = 4
RAVI rocks!
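The part after the colon inside {} is a format spec, and `.2f` is only one option. Here is a short sketch of a few other common specs; the variable names are just for this example.

```python
price = 1234567.891
ratio = 0.4567
item = 'pen'

line1 = f'{price:,.2f}'   # thousands separator + 2 decimals
line2 = f'{ratio:.1%}'    # show as a percentage with 1 decimal
line3 = f'{item:>6}|'     # right-align inside 6 characters
line4 = f'{42:05d}'       # pad with zeros to width 5

print(line1)  # 1,234,567.89
print(line2)  # 45.7%
print(line3)  #    pen|
print(line4)  # 00042
```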
SECTION 3: Operators

3.1 Arithmetic Operators


These do math — just like a calculator.
📝 Code:
a = 10
b = 3
print(a + b) # Addition: 13
print(a - b) # Subtraction: 7
print(a * b) # Multiplication: 30
print(a / b) # Division: 3.333...
print(a // b) # Floor division: 3 (no decimal)
print(a % b) # Modulus: 1 (remainder)
print(a ** b) # Power: 1000 (10^3)
💻 Output:
13
7
30
3.3333333333333335
3
1
1000
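Floor division and modulus behave in a way that surprises many people when negative numbers are involved. A small sketch, using only built-in operators and `divmod()`:

```python
print(7 // 2)     # 3
print(-7 // 2)    # -4 (rounds DOWN, toward negative infinity)
print(-7 % 2)     # 1  (result takes the sign of the divisor)

# divmod() gives the quotient and the remainder in one step
quotient, remainder = divmod(17, 5)
print(quotient, remainder)        # 3 2
print(quotient * 5 + remainder)   # 17, back to the original number
```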

3.2 Comparison Operators


These compare two values and give True or False.
📝 Code:
a = 10
b = 5
print(a == b) # Equal? False
print(a != b) # Not equal? True
print(a > b) # Greater than? True
print(a < b) # Less than? False
print(a >= b) # Greater or equal? True
print(a <= b) # Less or equal? False
💻 Output:
False
True
True
False
True
False

3.3 Logical Operators


These combine multiple conditions.
📝 Code:
x = 10
# and — BOTH must be True
print(x > 5 and x < 20) # True
print(x > 5 and x > 20) # False

# or: at LEAST ONE must be True
print(x > 5 or x > 20) # True
print(x < 5 or x > 20) # False

# not: flips True to False and vice versa
print(not (x > 5)) # False
💻 Output:
True
False
True
False
False
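Python stops checking an `and` / `or` expression as soon as the answer is known. This is called short-circuit evaluation. The sketch below shows two common uses; the function name `safe_ratio` is made up for this example.

```python
def safe_ratio(a, b):
    # b != 0 is checked first. If it is False, a / b is never run,
    # so there is no ZeroDivisionError.
    return b != 0 and a / b > 2

print(safe_ratio(10, 2))   # True  (10 / 2 = 5, and 5 > 2)
print(safe_ratio(10, 0))   # False (division is skipped)

# 'or' returns the first value that counts as True
name = ''
display = name or 'Guest'
print(display)             # Guest
```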

3.4 Assignment Operators


📝 Code:
x = 10
x += 5 # same as x = x + 5 → x is now 15
x -= 3 # same as x = x - 3 → x is now 12
x *= 2 # same as x = x * 2 → x is now 24
x //= 5 # same as x = x // 5 → x is now 4
x **= 3 # same as x = x ** 3 → x is now 64
print(x)
💻 Output:
64

3.5 Identity & Membership Operators


📝 Code:
# 'is' — checks if same object in memory
a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(a is b) # True — b is literally a
print(a is c) # False — same content, different object

# 'in': checks if value is inside
fruits = ['apple', 'mango', 'banana']
print('mango' in fruits) # True
print('grape' not in fruits) # True
💻 Output:
True
False
True
True
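It helps to compare `is` with `==` side by side: `==` asks "same content?", while `is` asks "same object?". A minimal sketch:

```python
a = [1, 2, 3]
b = a            # b is another name for the SAME list
c = [1, 2, 3]    # c is a different list with the same content

print(a == c)    # True  (same content)
print(a is c)    # False (different objects)

b.append(4)      # changing b also changes a
print(a)         # [1, 2, 3, 4]

# 'is' is the usual way to check for None
result = None
print(result is None)   # True
```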
SECTION 4: Input & Output

4.1 print() — Showing Output


print() displays anything on the screen. It is the most used function in Python.
📝 Code:
print('Hello!') # basic
print('Name:', 'Arjun') # multiple items
print(10 + 5) # expression
print('A', 'B', 'C', sep='-') # custom separator
print('Loading', end='...') # no newline
print(' Done') # continues same line

# Formatted output
name = 'Priya'
print(f'Hello {name}!')
💻 Output:
Hello!
Name: Arjun
15
A-B-C
Loading... Done
Hello Priya!

4.2 input() — Taking User Input


input() pauses the program and waits for the user to type something. It ALWAYS returns a string.
📝 Code:
name = input('What is your name? ') # user types: Ravi
age = int(input('How old are you? ')) # user types: 15

print(f'Hello {name}, you are {age} years old!')

# Taking multiple inputs in one line
x, y = input('Enter two numbers: ').split() # user types: 10 20
x, y = int(x), int(y)
print(f'Sum = {x + y}')

⚠️ Important: input() always gives you a STRING.
If you need a number, convert it: int(input(...)) or float(input(...))
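Users do not always type what you expect, so the conversion can fail. The sketch below shows one possible way to check the text before using it. The helper `parse_two_numbers` is an assumption for illustration; in a real program you would pass it the result of `input()`.

```python
def parse_two_numbers(text):
    """Turn text like '10 20' into two ints, or return None if that fails."""
    parts = text.split()
    if len(parts) != 2:
        return None               # wrong number of values
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None               # something was not a whole number

print(parse_two_numbers('10 20'))    # (10, 20)
print(parse_two_numbers('10'))       # None
print(parse_two_numbers('ten 20'))   # None
```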
SECTION 5: Conditionals (if / elif / else)

5.1 What are Conditionals?


Conditionals let your program make decisions. Just like in real life: IF it is raining, take an umbrella,
ELSE go without one.
Real world use: Login systems check IF username and password match. A shopping cart checks IF
the item is in stock.

Syntax
if condition:
    # code runs if condition is True
elif another_condition:
    # code runs if the first was False but this is True
else:
    # code runs if ALL above conditions were False

Example — Grade Checker


📝 Code:
marks = 78

if marks >= 90:
    print('Grade: A+')
elif marks >= 80:
    print('Grade: A')
elif marks >= 70:
    print('Grade: B')
elif marks >= 60:
    print('Grade: C')
else:
    print('Grade: F - Study harder!')
💻 Output:
Grade: B

5.2 Nested if
📝 Code:
age = 20
has_id = True
if age >= 18:
    if has_id:
        print('Entry allowed')
    else:
        print('Show your ID')
else:
    print('Too young!')
💻 Output:
Entry allowed

5.3 One-Line if (Ternary)


📝 Code:
# result = value_if_true if condition else value_if_false
age = 20
status = 'Adult' if age >= 18 else 'Minor'
print(status)
💻 Output:
Adult
SECTION 6: Loops

6.1 for Loop


A for loop repeats code a set number of times or for each item in a collection.
Real world use: Sending the same email to 1000 customers. Checking each product in a shopping
cart.
📝 Code:
# Loop through a list
fruits = ['apple', 'mango', 'banana']
for fruit in fruits:
    print('I like', fruit)

# Loop through a range
for i in range(5): # 0, 1, 2, 3, 4
    print(i, end=' ')
print()

# range(start, stop, step)
for i in range(1, 10, 2): # 1, 3, 5, 7, 9
    print(i, end=' ')
💻 Output:
I like apple
I like mango
I like banana
0 1 2 3 4
1 3 5 7 9

6.2 while Loop


A while loop keeps running AS LONG AS a condition is True. Use it when you don't know in advance
how many times to repeat.
📝 Code:
count = 1
while count <= 5:
    print(f'Count is {count}')
    count += 1 # IMPORTANT: must change count or it loops forever!

print('Done!')
💻 Output:
Count is 1
Count is 2
Count is 3
Count is 4
Count is 5
Done!

6.3 break, continue, pass


📝 Code:
# break: exit the loop immediately
for i in range(10):
    if i == 5:
        break
    print(i, end=' ') # prints: 0 1 2 3 4
print()

# continue: skip this one, go to next
for i in range(10):
    if i % 2 == 0: # if even, skip
        continue
    print(i, end=' ') # prints: 1 3 5 7 9
print()

# pass: do nothing (placeholder)
for i in range(3):
    pass # loop runs but does nothing
💻 Output:
0 1 2 3 4
1 3 5 7 9

6.4 enumerate() — Loop with index


📝 Code:
fruits = ['apple', 'mango', 'banana']
for index, fruit in enumerate(fruits):
    print(f'{index}: {fruit}')

# Start counting from 1
for index, fruit in enumerate(fruits, start=1):
    print(f'{index}. {fruit}')
💻 Output:
0: apple
1: mango
2: banana
1. apple
2. mango
3. banana

6.5 zip() — Loop over two lists together


📝 Code:
names = ['Ravi', 'Priya', 'Arjun']
scores = [85, 92, 78]

for name, score in zip(names, scores):
    print(f'{name} scored {score}')
💻 Output:
Ravi scored 85
Priya scored 92
Arjun scored 78
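Two details about zip() are worth knowing: it stops at the shortest list, and its pairs can be turned straight into a dictionary. A minimal sketch with made-up data:

```python
names = ['Ravi', 'Priya', 'Arjun']
scores = [85, 92]              # one score is missing on purpose

# zip() stops when the SHORTEST list runs out, so 'Arjun' is dropped
pairs = list(zip(names, scores))
print(pairs)                   # [('Ravi', 85), ('Priya', 92)]

# dict() can build a dictionary directly from the pairs
score_by_name = dict(zip(names, scores))
print(score_by_name)           # {'Ravi': 85, 'Priya': 92}
```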

6.6 List Comprehension — The Smart Loop


A shorter way to create lists using a loop in a single line.
📝 Code:
# Normal way
squares = []
for i in range(6):
    squares.append(i ** 2)

# List comprehension (same result!)
squares = [i ** 2 for i in range(6)]
print(squares)

# With condition
evens = [i for i in range(20) if i % 2 == 0]
print(evens)

# Nested comprehension
matrix = [[i * j for j in range(1, 4)] for i in range(1, 4)]
print(matrix)
💻 Output:
[0, 1, 4, 9, 16, 25]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
[[1, 2, 3], [2, 4, 6], [3, 6, 9]]
SECTION 7: Functions

7.1 What is a Function?


A function is a reusable block of code that does a specific job. Instead of writing the same code again
and again, you write it once in a function and call it whenever you need it.
Real world use: A payment function runs every time a user buys something. A login function runs
every time someone logs in.

Defining and Calling a Function


📝 Code:
def greet(name): # define the function
    print(f'Hello, {name}!')

greet('Ravi') # call the function
greet('Priya') # call again with different name
💻 Output:
Hello, Ravi!
Hello, Priya!

7.2 Return Values


A function can send back a result using 'return'.
📝 Code:
def add(a, b):
    return a + b

result = add(10, 5)
print(result)

# Returning multiple values
def min_max(numbers):
    return min(numbers), max(numbers)

lo, hi = min_max([3, 1, 7, 4, 9])
print(f'Min: {lo}, Max: {hi}')
💻 Output:
15
Min: 1, Max: 9
7.3 Default & Keyword Arguments
📝 Code:
# Default argument: used if not provided
def greet(name, greeting='Hello'):
    print(f'{greeting}, {name}!')

greet('Arjun') # uses default
greet('Arjun', 'Namaste') # overrides default

# Keyword arguments: pass by name (order doesn't matter)
def power(base, exp):
    return base ** exp

print(power(exp=3, base=2)) # 8
💻 Output:
Hello, Arjun!
Namaste, Arjun!
8

7.4 *args and **kwargs


When you don't know how many arguments will be passed, use *args (collects extra positional values
into a tuple) or **kwargs (collects key=value pairs into a dictionary).
📝 Code:
# *args: variable number of arguments
def total(*nums):
    return sum(nums)

print(total(1, 2, 3)) # 6
print(total(10, 20, 30, 40)) # 100

# **kwargs: keyword arguments
def show_info(**details):
    for key, value in details.items():
        print(f'{key}: {value}')

show_info(name='Priya', age=15, city='Hyderabad')


💻 Output:
6
100
name: Priya
age: 15
city: Hyderabad
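The same * and ** symbols also work in the other direction: they unpack a list or a dictionary when you CALL a function. The sketch below combines both uses; the function `describe` and its `sep` option are made up for this example.

```python
def describe(title, *items, **options):
    # items is a tuple of extra positional values
    # options is a dict of extra key=value pairs
    sep = options.get('sep', ', ')
    return f'{title}: {sep.join(str(i) for i in items)}'

nums = [1, 2, 3]
settings = {'sep': ' | '}

print(describe('Nums', *nums))               # Nums: 1, 2, 3
print(describe('Nums', *nums, **settings))   # Nums: 1 | 2 | 3
```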

7.5 Lambda Functions (Anonymous Functions)


A lambda is a tiny, one-line function. Use it when the function is simple and you only need it once.
📝 Code:
# Normal function
def square(x):
    return x ** 2

# Same thing as lambda
square = lambda x: x ** 2
print(square(5)) # 25

# Used as the key for sorting
students = [('Ravi', 85), ('Priya', 92), ('Arjun', 78)]
students.sort(key=lambda s: s[1]) # sort by score
print(students)
💻 Output:
25
[('Arjun', 78), ('Ravi', 85), ('Priya', 92)]

7.6 Scope — Where Variables Live


📝 Code:
x = 100 # global variable

def show():
    y = 50 # local variable
    print(x) # can read global
    print(y)

show()
# print(y) # ERROR! y doesn't exist outside

# To MODIFY a global inside a function
counter = 0
def increment():
    global counter
    counter += 1

increment()
print(counter) # 1
💻 Output:
100
50
1
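Many programmers prefer to avoid `global` where they can, because it hides which function changes which variable. One possible alternative, shown here only as a sketch, is to pass the value in and return the new value:

```python
def increment(counter):
    # No 'global' needed: the function only uses what it is given
    return counter + 1

counter = 0
counter = increment(counter)
counter = increment(counter)
print(counter)   # 2
```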
SECTION 8: Lists

8.1 What is a List?


A list is like a shopping bag — you can put many items in it, in order. Lists are the most used data
structure in Python.
Real world use: Storing a student's marks, a playlist of songs, prices of items in a cart.

Creating Lists
📝 Code:
numbers = [1, 2, 3, 4, 5]
names = ['Ravi', 'Priya', 'Arjun']
mixed = [1, 'hello', 3.14, True] # can mix types
empty = []
nested = [[1, 2], [3, 4], [5, 6]] # list inside list

print(numbers[0]) # first item: 1


print(numbers[-1]) # last item: 5
print(numbers[1:4]) # slice: [2, 3, 4]
💻 Output:
1
5
[2, 3, 4]

8.2 All List Methods

Method What it does


append(x) Adds x to the end: [1,2].append(3) → [1,2,3]
insert(i, x) Inserts x at index i: [1,3].insert(1,2) → [1,2,3]
extend(lst) Adds all items from lst: [1].extend([2,3]) → [1,2,3]
remove(x) Removes first x: [1,2,2].remove(2) → [1,2]
pop() Removes & returns last item
pop(i) Removes & returns item at index i
del list[i] Deletes item at index i
clear() Empties the list: []
index(x) Returns index of first x
count(x) How many times x appears
sort() Sorts in ascending order (changes list)
sort(reverse=True) Sorts in descending order
sorted(lst) Returns sorted copy (original unchanged)
reverse() Reverses the list in place
copy() Returns a copy of the list
len(lst) Number of items
min(lst) Smallest item
max(lst) Largest item
sum(lst) Total of all numbers
list() Create list from other iterable
in / not in Check if value exists: 3 in [1,2,3] → True

Full Example
📝 Code:
marks = [78, 92, 85, 67, 91, 88]

marks.append(95) # add new mark
marks.sort() # sort ascending
print(marks)

print('Highest:', max(marks))
print('Lowest:', min(marks))
print('Average:', sum(marks) / len(marks))

marks.remove(67) # remove the lowest
print('After removal:', marks)

# Find top 3
top3 = sorted(marks, reverse=True)[:3]
print('Top 3:', top3)
💻 Output:
[67, 78, 85, 88, 91, 92, 95]
Highest: 95
Lowest: 67
Average: 85.14285714285714
After removal: [78, 85, 88, 91, 92, 95]
Top 3: [95, 92, 91]
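Two list behaviours cause many beginner bugs: assigning a list to a new name does not copy it, and sort() changes the list but returns None. A short sketch with made-up marks:

```python
marks = [78, 92, 85]
alias = marks             # same list, second name
snapshot = marks.copy()   # a separate copy

marks.append(67)
print(alias)      # [78, 92, 85, 67]  (alias sees the change)
print(snapshot)   # [78, 92, 85]      (copy does not)

result = marks.sort()     # sorts in place...
print(result)     # None  (...and returns nothing)
print(marks)      # [67, 78, 85, 92]

# sorted() leaves the original alone and returns a new list
print(sorted(snapshot, reverse=True))   # [92, 85, 78]
print(snapshot)                         # [78, 92, 85]
```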
SECTION 9: Tuples, Sets & Dictionaries

9.1 Tuples
A tuple is like a list but it CANNOT be changed after creation. Use tuples for data that should stay fixed
— like coordinates, RGB colors, or date of birth.
📝 Code:
coords = (10.5, 17.3) # latitude, longitude
rgb = (255, 128, 0) # color
person = ('Priya', 25, 'F') # name, age, gender

print(coords[0]) # 10.5
print(len(person)) # 3

# Unpacking
name, age, gender = person
print(f'{name} is {age} years old')

# count() and index() are the only tuple methods
t = (1, 2, 2, 3, 2)
print(t.count(2)) # 3
print(t.index(3)) # 3 (index of value 3)
💻 Output:
10.5
3
Priya is 25 years old
3
3

9.2 Sets
A set stores UNIQUE items only — no duplicates allowed! Sets are great for removing duplicates and
checking membership.
📝 Code:
fruits = {'apple', 'mango', 'banana', 'mango'} # mango appears only once
print(fruits) # sets are unordered, so the printed order may differ

# Set operations (great for comparing groups!)
a = {1, 2, 3, 4, 5}
b = {4, 5, 6, 7, 8}
print(a | b) # Union (all unique): {1,2,3,4,5,6,7,8}
print(a & b) # Intersection (common): {4, 5}
print(a - b) # Difference (in a not b): {1,2,3}
print(a ^ b) # Symmetric diff (not in both): {1,2,3,6,7,8}

# Remove duplicates from a list (a set does not promise to keep the original order)
nums = [1,1,2,2,3,3,4]
unique = list(set(nums))
print(unique)
💻 Output:
{'apple', 'mango', 'banana'}
{1, 2, 3, 4, 5, 6, 7, 8}
{4, 5}
{1, 2, 3}
{1, 2, 3, 6, 7, 8}
[1, 2, 3, 4]
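When the order of the items matters, `list(set(...))` is not a safe choice because sets are unordered. Two possible alternatives, shown as a sketch with made-up numbers:

```python
nums = [3, 1, 3, 2, 1, 4]

# Option 1: keep the order of first appearance.
# Dictionary keys are unique and remember insertion order.
unique_in_order = list(dict.fromkeys(nums))
print(unique_in_order)   # [3, 1, 2, 4]

# Option 2: remove duplicates and sort the result
sorted_unique = sorted(set(nums))
print(sorted_unique)     # [1, 2, 3, 4]
```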

9.3 Dictionaries
A dictionary stores data as key-value pairs — like a real dictionary where you look up a word (key) to
find its meaning (value).
Real world use: Student records, product catalogs, user profiles, JSON data from APIs.
📝 Code:
student = {
'name': 'Ravi',
'age': 16,
'marks': [85, 92, 78],
'city': 'Hyderabad'
}

# Accessing values
print(student['name']) # Ravi
print(student.get('age')) # 16
print(student.get('grade', 'N/A')) # N/A (default if not found)

# Adding / updating
student['grade'] = 'A'
student['age'] = 17

# Deleting
del student['city']

# Looping
for key, value in student.items():
    print(f'{key}: {value}')
💻 Output:
Ravi
16
N/A
name: Ravi
age: 17
marks: [85, 92, 78]
grade: A

All Dictionary Methods

Method What it does


dict[key] Get value, raises KeyError if missing
dict.get(key, default) Get value, returns default if missing
dict[key] = val Add or update a key
dict.update({...}) Merge another dict in
del dict[key] Delete a key
dict.pop(key) Remove and return value
dict.popitem() Remove and return last item as (key, value)
dict.keys() All keys
dict.values() All values
dict.items() All (key, value) pairs
key in dict Check if key exists
dict.copy() Shallow copy
dict.clear() Empty the dictionary
len(dict) Number of key-value pairs
dict.setdefault(k, v) Set key only if it doesn't exist
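A very common use of these methods is counting things. The sketch below counts words with get(), which returns a default instead of raising KeyError. The function name `count_words` is made up for this example.

```python
def count_words(text):
    """Return a dict of word -> how many times it appears."""
    counts = {}
    for word in text.lower().split():
        # get() returns 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    return counts

counts = count_words('The cat and the hat')
print(counts)   # {'the': 2, 'cat': 1, 'and': 1, 'hat': 1}
```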

Dictionary Comprehension
📝 Code:
# Square of each number
squares = {x: x**2 for x in range(6)}
print(squares)

# Filter: keep only the even numbers
even_sq = {x: x**2 for x in range(6) if x % 2 == 0}
print(even_sq)
💻 Output:
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
{0: 0, 2: 4, 4: 16}
SECTION 10: File Handling

10.1 Opening & Reading Files


Python can read from and write to files on your computer. Always use 'with open()' — it automatically
closes the file even if an error occurs.

Mode What it does


'r' Read only (default). File must exist.
'w' Write — creates new or OVERWRITES existing file
'a' Append — adds to end of existing file
'x' Create — creates new file; error if exists
'rb' Read binary (images, PDFs)
'wb' Write binary

📝 Code:
# Writing to a file
with open('[Link]', 'w') as f:
    f.write('Ravi,85\n')
    f.write('Priya,92\n')
    f.write('Arjun,78\n')

# Reading entire file
with open('[Link]', 'r') as f:
    content = f.read()
    print(content)

# Reading line by line
with open('[Link]', 'r') as f:
    for line in f:
        name, score = line.strip().split(',')
        print(f'{name} scored {score}')
💻 Output:
Ravi,85
Priya,92
Arjun,78
Ravi scored 85
Priya scored 92
Arjun scored 78
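The difference between 'w' and 'a' is easiest to see by using both on the same file. This sketch writes inside a temporary folder so it leaves nothing behind; the file name `scores.txt` is a made-up example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'scores.txt')   # hypothetical file name

    with open(path, 'w') as f:      # 'w' creates (or overwrites) the file
        f.write('Ravi,85\n')

    with open(path, 'a') as f:      # 'a' adds to the end, keeps old content
        f.write('Priya,92\n')

    with open(path, 'r') as f:
        lines = f.read().splitlines()   # list of lines without '\n'

print(lines)   # ['Ravi,85', 'Priya,92']
```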
SECTION 11: Exception Handling (try / except)

11.1 What is an Exception?


An exception is an error that happens while your program is running. Without exception handling, your
program would crash. With it, you can handle the error gracefully.
Real world use: Banking apps catch errors when you enter a wrong PIN. Web apps catch errors when a
server is down.

📝 Code:
try:
    num = int(input('Enter a number: ')) # user types: abc
    print(10 / num)
except ValueError:
    print('That is not a valid number!')
except ZeroDivisionError:
    print('Cannot divide by zero!')
except Exception as e:
    print(f'Something went wrong: {e}')
else:
    print('Everything worked fine!')
finally:
    print('This always runs!')
💻 Output:
That is not a valid number!
This always runs!

11.2 Common Exception Types

Exception When it happens


ValueError Right type but invalid value: int('abc')
TypeError Operation on incompatible types: 'a' + 1
ZeroDivisionError Dividing by zero: 10 / 0
IndexError List index out of range: [1,2][5]
KeyError Dict key doesn't exist: d['xyz']
FileNotFoundError File doesn't exist
AttributeError Object has no such attribute
NameError Variable not defined
ImportError Module not found
OverflowError Number too large
RecursionError Too many recursive calls

11.3 Raising Your Own Exceptions


📝 Code:
def check_age(age):
    if age < 0:
        raise ValueError('Age cannot be negative!')
    if age > 150:
        raise ValueError('That age is not realistic!')
    return f'Age {age} is valid.'

try:
    print(check_age(-5))
except ValueError as e:
    print(f'Error: {e}')
💻 Output:
Error: Age cannot be negative!
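You can also define your own exception type by creating a class that inherits from a built-in exception. This is only a sketch; the class name `InvalidAgeError` is hypothetical, and this `check_age` is a simplified variant of the one above.

```python
class InvalidAgeError(ValueError):
    """Raised when an age is outside the accepted range (hypothetical class)."""

def check_age(age):
    if not 0 <= age <= 150:
        raise InvalidAgeError(f'Age {age} is out of range')
    return age

try:
    check_age(200)
except InvalidAgeError as e:
    message = str(e)

print(message)                                   # Age 200 is out of range
print(issubclass(InvalidAgeError, ValueError))   # True
```

Because the new class inherits from ValueError, existing code that catches ValueError will still catch it.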
SECTION 12: Object-Oriented Programming (OOP)

OOP is a way of writing code that groups related data and functions together into 'objects'.
Think of it like building blocks. Each block (object) knows what it is and what it can do.
OOP has 4 main pillars: Encapsulation, Inheritance, Polymorphism, Abstraction.

12.1 Classes and Objects


A CLASS is like a blueprint. An OBJECT is the actual thing built from that blueprint.
Real world: Class = Car blueprint. Objects = your actual car, your friend's car.

📝 Code:
class Dog: # define the class
    # __init__ is the constructor: runs when object is created
    def __init__(self, name, breed, age):
        self.name = name # self refers to THIS dog
        self.breed = breed
        self.age = age

    def bark(self):
        print(f'{self.name} says: Woof!')

    def info(self):
        print(f'{self.name} is a {self.breed}, {self.age} years old.')

# Creating objects
dog1 = Dog('Bruno', 'Labrador', 3)
dog2 = Dog('Max', 'Pug', 5)

dog1.bark()
dog1.info()
dog2.bark()
print(dog2.name) # access attribute
💻 Output:
Bruno says: Woof!
Bruno is a Labrador, 3 years old.
Max says: Woof!
Max
12.2 Encapsulation — Keeping Data Safe
Encapsulation means hiding the internal details and only showing what is necessary. In Python, we use
underscores to signal privacy.
Real world: ATM — you press a button and get money. You cannot see the internal code that
processes it.
📝 Code:
class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.__balance = balance # __ makes it private (Python renames it internally)

    def deposit(self, amount):
        if amount > 0:
            self.__balance += amount
            print(f'Deposited ₹{amount}')

    def withdraw(self, amount):
        if amount > self.__balance:
            print('Insufficient funds!')
        else:
            self.__balance -= amount
            print(f'Withdrawn ₹{amount}')

    def get_balance(self):
        return self.__balance # controlled access

acc = BankAccount('Ravi', 5000)
acc.deposit(2000)
acc.withdraw(8000)
print(f'Balance: ₹{acc.get_balance()}')
# acc.__balance # AttributeError! Cannot access directly
💻 Output:
Deposited ₹2000
Insufficient funds!
Balance: ₹7000
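Instead of a get_balance() method, Python code often uses the built-in `@property` decorator for read-only access. The sketch below is a simplified variant of the class above, not a replacement for it. It also shows what the double underscore really does: Python renames the attribute (name mangling), so it is hidden rather than truly locked.

```python
class BankAccount:
    def __init__(self, owner, balance):
        self.owner = owner
        self.__balance = balance      # stored as _BankAccount__balance

    @property
    def balance(self):                # read it like an attribute: acc.balance
        return self.__balance

acc = BankAccount('Ravi', 5000)
print(acc.balance)                    # 5000
print(hasattr(acc, '__balance'))      # False (the name was changed)
print(acc._BankAccount__balance)      # 5000 (possible, but not recommended)

try:
    acc.balance = 0                   # no setter was defined
except AttributeError:
    print('balance is read-only')
```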

12.3 Inheritance — Reusing Code


Inheritance lets a child class automatically get all the features of a parent class. No need to rewrite
everything!
Real world: Animal → Dog, Cat, Bird. Vehicle → Car, Bus, Truck.
📝 Code:
class Animal: # parent class
    def __init__(self, name):
        self.name = name

    def eat(self):
        print(f'{self.name} is eating.')

    def speak(self):
        print('...')

class Dog(Animal): # child class inherits Animal
    def speak(self): # overrides parent method
        print(f'{self.name} says: Woof!')

class Cat(Animal): # another child class
    def speak(self):
        print(f'{self.name} says: Meow!')

    def purr(self): # new method only for Cat
        print(f'{self.name} purrs...')

d = Dog('Bruno')
c = Cat('Whiskers')

d.eat() # inherited from Animal
d.speak() # Dog's own version
c.eat() # inherited from Animal
c.speak() # Cat's own version
c.purr() # only for cats
💻 Output:
Bruno is eating.
Bruno says: Woof!
Whiskers is eating.
Whiskers says: Meow!
Whiskers purrs...

super() — Calling the Parent's __init__


📝 Code:
class Animal:
    def __init__(self, name, sound):
        self.name = name
        self.sound = sound

class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name, 'Woof') # calls Animal's __init__
        self.breed = breed

    def info(self):
        print(f'{self.name} ({self.breed}) says {self.sound}')

d = Dog('Bruno', 'Labrador')
d.info()
💻 Output:
Bruno (Labrador) says Woof

12.4 Polymorphism — Same Name, Different Behaviour


Polymorphism means 'many forms'. The same method name does different things in different classes.
Real world: A 'draw()' method on a Circle draws a circle; on a Square it draws a square. Same name,
different result.
📝 Code:
class Shape:
    def area(self):
        return 0

class Circle(Shape):
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3.14 * self.r ** 2

class Rectangle(Shape):
    def __init__(self, w, h):
        self.w = w
        self.h = h

    def area(self):
        return self.w * self.h

shapes = [Circle(7), Rectangle(4, 5), Circle(3)]

for shape in shapes:
    print(f'{shape.__class__.__name__}: area = {shape.area()}')
💻 Output:
Circle: area = 153.86
Rectangle: area = 20
Circle: area = 28.26

12.5 Abstraction — Hiding Complexity


Abstraction means showing only what is necessary and hiding the inner details. We use abstract
classes to force child classes to implement certain methods.
📝 Code:
from abc import ABC, abstractmethod

class Vehicle(ABC): # abstract class
    @abstractmethod
    def start(self): # must be implemented by child
        pass

    def stop(self):
        print('Stopping...')

class Car(Vehicle):
    def start(self):
        print('Car started with key')

class Bike(Vehicle):
    def start(self):
        print('Bike started with kick')

c = Car()
c.start()
c.stop()

b = Bike()
b.start()

# v = Vehicle() # TypeError! Cannot create abstract class object


💻 Output:
Car started with key
Stopping...
Bike started with kick
12.6 Special (Dunder) Methods
These are special methods with double underscores. They let your objects work with Python's built-in
operations like print(), len(), +, etc.
📝 Code:
class Student:
    def __init__(self, name, marks):
        self.name = name
        self.marks = marks

    def __str__(self): # called by print()
        return f'Student: {self.name}'

    def __repr__(self): # developer representation
        return f'Student({self.name!r}, {self.marks})'

    def __len__(self): # called by len()
        return len(self.marks)

    def __add__(self, other): # called by +
        combined = self.marks + other.marks
        return Student(self.name + '+' + other.name, combined)

    def __lt__(self, other): # called by <
        return sum(self.marks) < sum(other.marks)

s1 = Student('Ravi', [85, 92, 78])
s2 = Student('Priya', [90, 88, 95])

print(s1) # calls __str__
print(len(s1)) # calls __len__
print(s1 < s2) # calls __lt__
merged = s1 + s2 # calls __add__
print(merged.name)
💻 Output:
Student: Ravi
3
True
Ravi+Priya

12.7 Class Methods & Static Methods


📝 Code:
class MathHelper:
    pi = 3.14159 # class variable (shared)

    def __init__(self, value):
        self.value = value

    # classmethod: works with class, not instance
    @classmethod
    def circle_area(cls, r):
        return cls.pi * r ** 2

    # staticmethod: no self or cls
    @staticmethod
    def is_even(n):
        return n % 2 == 0

print(MathHelper.circle_area(5))
print(MathHelper.is_even(10))
print(MathHelper.is_even(7))
💻 Output:
78.53975
True
False
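A common use of @classmethod is an "alternative constructor": a second way to create an object from a different kind of input. A minimal sketch; the method name `from_csv_line` and the text format are assumptions for this example.

```python
class Student:
    def __init__(self, name, marks):
        self.name = name
        self.marks = marks

    @classmethod
    def from_csv_line(cls, line):
        # 'Ravi,85,92,78' -> name = 'Ravi', scores = ['85', '92', '78']
        name, *scores = line.strip().split(',')
        # cls is the class itself, so this also works for child classes
        return cls(name, [int(s) for s in scores])

s = Student.from_csv_line('Ravi,85,92,78\n')
print(s.name, s.marks)   # Ravi [85, 92, 78]
```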
SECTION 13: Built-in Functions & Important Modules

13.1 All Important Built-in Functions

Function What it does + Example


print(x) Displays x on screen
input(prompt) Takes text input from user
len(x) Length of string/list/dict: len([1,2,3]) → 3
type(x) Type of x: type(3.14) → float
int(x) Convert to integer: int('5') → 5
float(x) Convert to float: float(3) → 3.0
str(x) Convert to string: str(42) → '42'
bool(x) Convert to bool: bool(0) → False
list(x) Convert to list: list('abc') → ['a','b','c']
tuple(x) Convert to tuple: tuple([1,2]) → (1,2)
set(x) Convert to set (removes duplicates)
dict() Create empty dict or from pairs
range(n) Creates range 0 to n-1
range(a,b,step) Range from a up to (not including) b, moving by step
enumerate(lst) Returns (index, value) pairs
zip(a, b) Pairs items from two iterables
map(fn, lst) Apply function to each item
filter(fn, lst) Keep items where function returns True
sorted(x) Returns sorted copy
reversed(x) Returns reversed iterator
sum(lst) Sum of numbers: sum([1,2,3]) → 6
min(lst) Smallest item: min([3,1,4]) → 1
max(lst) Largest item: max([3,1,4]) → 4
abs(x) Absolute value: abs(-5) → 5
round(x, n) Round to n decimal places: round(3.14159, 2) → 3.14
pow(x, y) Power: pow(2, 10) → 1024
divmod(a, b) Returns (quotient, remainder): divmod(10,3) → (3,1)
hash(x) Hash value of object
id(x) Unique identity of object (its memory address in CPython)
isinstance(x, T) True if x is of type T
issubclass(A, B) True if A is subclass of B
hasattr(obj, 'name') True if obj has attribute 'name'
getattr(obj,'name') Get attribute by name string
setattr(obj,'n',v) Set attribute by name string
delattr(obj,'name') Delete attribute
dir(x) List all attributes/methods of x
vars(obj) Returns object's __dict__
help(x) Show documentation for x
callable(x) True if x can be called as function
open(file, mode) Open a file
chr(n) Character for a Unicode code point: chr(65) → 'A'
ord(c) Unicode code point of a character: ord('A') → 65
hex(n) Hex string: hex(255) → '0xff'
oct(n) Octal string: oct(8) → '0o10'
bin(n) Binary string: bin(5) → '0b101'
format(v, spec) Format a value: format(3.14,'f')
eval(str) Evaluate a string as a Python expression (careful!)
exec(str) Execute string as Python code
globals() Returns global variable dict
locals() Returns local variable dict
any(lst) True if at least one item is True
all(lst) True if ALL items are True
iter(x) Creates an iterator from x
next(it) Gets the next item from iterator
slice(a,b) Creates a slice object
object() Base class of all Python classes
super() Call parent class methods
staticmethod(fn) Decorator for static method
classmethod(fn) Decorator for class method
property(fn) Makes a method act like an attribute
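A few of the functions in this table are easier to understand with a small example. The sketch below uses map(), filter(), any() and all() on a made-up list of marks:

```python
marks = [78, 92, 45, 67]

# map() applies a function to every item
doubled = list(map(lambda m: m * 2, marks))
print(doubled)   # [156, 184, 90, 134]

# filter() keeps only the items where the function returns True
passed = list(filter(lambda m: m >= 50, marks))
print(passed)    # [78, 92, 67]

# any() / all() answer yes-or-no questions about the whole list
print(any(m < 50 for m in marks))    # True  (45 is below 50)
print(all(m >= 40 for m in marks))   # True  (every mark is 40 or more)
```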

13.2 math Module


The math module gives you advanced mathematical functions.
📝 Code:
import math

print(math.sqrt(144)) # 12.0, square root
print(math.pi) # 3.14159...
print(math.ceil(4.1)) # 5, round up
print(math.floor(4.9)) # 4, round down
print(math.factorial(5)) # 120
print(math.log(100, 10)) # 2.0, log base 10
print(math.log(math.e)) # 1.0, natural log
print(math.sin(math.pi/2)) # 1.0
print(math.cos(0)) # 1.0
print(math.gcd(12, 18)) # 6
print(math.inf) # infinity
print(math.isnan(float('nan'))) # True
💻 Output:
12.0
3.141592653589793
5
4
120
2.0
1.0
1.0
1.0
6
inf
True

13.3 random Module


📝 Code:
import random

print(random.random()) # float from 0 up to (not including) 1
print(random.randint(1, 100)) # random integer 1-100, both ends included
print(random.choice(['a','b','c'])) # random item from list
print(random.choices([1,2,3], k=5)) # 5 random picks (with replacement)

cards = ['A', 'K', 'Q', 'J', '10']
random.shuffle(cards) # shuffle in place
print(cards)

print(random.sample(range(100), 5)) # 5 unique random numbers
print(random.uniform(1.5, 10.5)) # random float in range
💻 Output:
0.7234... (random)
42 (random)
b (random)
[1,1,3,2,1] (random)
shuffled...
[23,67,4,89,31] (random)
6.23 (random)

13.4 datetime Module


📝 Code:
from datetime import datetime, date, timedelta

now = datetime.now()
print(now) # current date and time
print(now.year, now.month, now.day)
print(now.strftime('%d-%m-%Y')) # format: 05-06-2025

# Date arithmetic
today = date.today()
future = today + timedelta(days=30)
print(f'30 days from now: {future}')

# Parse a date string
birthday = datetime.strptime('15-08-2000', '%d-%m-%Y')
print(birthday)
💻 Output:
2025-06-05 14:23:11.123456
2025 6 5
05-06-2025
30 days from now: 2025-07-05
2000-08-15 00:00:00

13.5 os & sys Modules


📝 Code:
import os
import sys

print(os.getcwd()) # current directory
print(os.listdir('.')) # list files in directory
os.mkdir('new_folder') # create a folder
os.rename('[Link]', '[Link]') # rename file
os.remove('[Link]') # delete file
print(os.path.exists('[Link]')) # check if file exists
print(os.path.join('data', '[Link]')) # safe path joining

print(sys.version) # Python version
print(sys.argv) # command line arguments
💻 Output:
/home/user/project
['[Link]', 'data', ...]
True
data/[Link] (or data\[Link] on Windows)
3.11.0 ...

13.6 collections Module


📝 Code:
from collections import Counter, defaultdict, OrderedDict, deque

# Counter — count occurrences


words = ['apple', 'mango', 'apple', 'banana', 'mango', 'apple']
c = Counter(words)
print(c) # Counter({'apple': 3, ...})
print(c.most_common(2)) # top 2 items

# defaultdict — no KeyError for missing keys


dd = defaultdict(list)
dd['fruits'].append('mango') # works without dd['fruits'] = []
print(dd)

# deque: fast append/remove from both ends
dq = deque([1, 2, 3])
dq.appendleft(0) # deque([0, 1, 2, 3])
dq.append(4) # deque([0, 1, 2, 3, 4])
dq.popleft() # removes the 0 from the left
print(dq)
💻 Output:
Counter({'apple': 3, 'mango': 2, 'banana': 1})
[('apple', 3), ('mango', 2)]
defaultdict(<class 'list'>, {'fruits': ['mango']})
deque([1, 2, 3, 4])
13.7 itertools Module
📝 Code:
import itertools

# chain: combine iterables
print(list(itertools.chain([1,2], [3,4], [5])))

# combinations
print(list(itertools.combinations('ABC', 2)))

# permutations
print(list(itertools.permutations([1,2,3], 2)))

# product: cartesian product
print(list(itertools.product([1,2], ['a','b'])))

# groupby: groups consecutive items, so sort by the key first if needed
data = [('A',1),('A',2),('B',3),('B',4)]
for key, group in itertools.groupby(data, key=lambda x: x[0]):
    print(key, list(group))
💻 Output:
[1, 2, 3, 4, 5]
[('A','B'),('A','C'),('B','C')]
[(1,2),(1,3),(2,1),(2,3),(3,1),(3,2)]
[(1,'a'),(1,'b'),(2,'a'),(2,'b')]
A [('A',1),('A',2)]
B [('B',3),('B',4)]

13.8 functools Module


📝 Code:
from functools import reduce, partial, lru_cache

# reduce — apply function cumulatively


nums = [1, 2, 3, 4, 5]
product = reduce(lambda a, b: a * b, nums)
print(product) # 120 (1*2*3*4*5)

# partial — fix some arguments of a function


def multiply(x, y):
    return x * y

double = partial(multiply, 2) # x is fixed as 2
print(double(5)) # 10

# lru_cache: cache results to speed up repetitive calls
@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(40)) # fast, cached!


💻 Output:
120
10
102334155
SECTION 14: NumPy — Numbers at Speed

NumPy = Numerical Python. It is the foundation of most data science work in Python.

It works with arrays (like lists, but much faster for math). Instead of looping in Python, NumPy applies an operation to the whole array at once. Many data and ML libraries, including pandas, scikit-learn and TensorFlow, build on or accept NumPy arrays.
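
To make "math on the whole array at once" concrete, here is a small side-by-side sketch. The values and variable names are invented for illustration.

```python
import numpy as np

prices = [100, 250, 80, 40]        # sample values, invented for this sketch

# Plain Python: visit every item yourself
scaled_loop = [p * 1.5 for p in prices]

# NumPy: one expression applies to every element of the array
prices_arr = np.array(prices)
scaled_np = prices_arr * 1.5

print(scaled_loop)
print(scaled_np)
```

Both versions produce the same numbers. The NumPy version runs its loop in compiled code, which is where the speed advantage on large arrays comes from.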

14.1 Creating Arrays


📝 Code:
import numpy as np

# From list
a = np.array([1, 2, 3, 4, 5])
print(a) # [1 2 3 4 5]
print(a.shape) # (5,), the shape
print(a.dtype) # int64 (the default integer type can differ by platform)

# Special arrays
print(np.zeros((2, 3))) # 2x3 of zeros
print(np.ones((2, 3))) # 2x3 of ones
print(np.eye(3)) # 3x3 identity matrix
print(np.arange(0, 10, 2)) # [0 2 4 6 8]
print(np.linspace(0, 1, 5)) # 5 evenly spaced from 0 to 1
print(np.random.rand(3, 2)) # 3x2 random floats 0-1
print(np.full((2, 2), 7)) # 2x2 filled with 7
💻 Output:
[1 2 3 4 5]
(5,)
int64
[[0. 0. 0.] [0. 0. 0.]]
[[1. 1. 1.] [1. 1. 1.]]
...
[0 2 4 6 8]
[0. 0.25 0.5 0.75 1. ]
...
[[7 7] [7 7]]
14.2 Array Operations (Vectorized — No Loop Needed!)
📝 Code:
a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

print(a + b) # [11 22 33 44 55]
print(a * 2) # [ 2 4 6 8 10]
print(b / a) # [10. 10. 10. 10. 10.]
print(a ** 2) # [ 1 4 9 16 25]
print(np.sqrt(a)) # [1. 1.41 1.73 2. 2.24]

# Boolean operations
print(a > 2) # [False False True True True]
print(a[a > 2]) # [3 4 5] — filter!
💻 Output:
[11 22 33 44 55]
[ 2 4 6 8 10]
[10. 10. 10. 10. 10.]
[ 1 4 9 16 25]
[1. 1.414 1.732 2. 2.236]
[False False True True True]
[3 4 5]

14.3 2D Arrays (Matrices)


📝 Code:
m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(m.shape) # (3, 3)
print(m[0]) # [1 2 3], first row
print(m[:, 1]) # [2 5 8], second column
print(m[1, 2]) # 6, row 1, col 2
print(m[0:2, 0:2]) # top-left 2x2 submatrix

# Matrix math
print(m.T) # transpose
print(m.reshape(1, 9)) # reshape to 1x9
print(np.dot(m, m)) # matrix multiplication (same as m @ m)
💻 Output:
(3, 3)
[1 2 3]
[2 5 8]
6
[[1 2] [4 5]]
transpose...
[[1 2 3 4 5 6 7 8 9]]
matrix product...

14.4 NumPy Statistics


📝 Code:
data = np.array([15, 22, 8, 35, 42, 18, 27, 31])

print(np.mean(data)) # average
print(np.median(data)) # middle value
print(np.std(data)) # standard deviation (population, ddof=0)
print(np.var(data)) # variance (population, ddof=0)
print(np.min(data))
print(np.max(data))
print(np.sum(data))
print(np.argmin(data)) # index of min
print(np.argmax(data)) # index of max
print(np.percentile(data, 75)) # 75th percentile
print(np.cumsum(data)) # cumulative sum
💻 Output:
24.75
24.5
10.46...
109.4375
8
42
198
2
4
32.0
[ 15 37 45 80 122 140 167 198]

14.5 Useful NumPy Functions for Data Science

Function What it does


np.unique(a) Unique values
np.sort(a) Sorted array
np.argsort(a) Indices that would sort the array
np.clip(a, lo, hi) Cap values between lo and hi
np.where(cond, x, y) Like if-else for arrays
np.concatenate([a,b]) Join arrays
np.vstack([a,b]) Stack arrays vertically (row-wise)
np.hstack([a,b]) Stack arrays horizontally (col-wise)
np.split(a, n) Split into n equal parts
a.flatten() Convert any-D to 1D (always returns a copy)
np.ravel(a) Flatten, avoiding a copy when possible
np.nan Represents missing/Not a Number
np.isnan(a) True where values are NaN
np.nanmean(a) Mean ignoring NaN values
np.log(a) Natural logarithm
np.exp(a) e raised to power
np.abs(a) Absolute values
np.round(a, n) Round to n decimals
np.random.seed(n) Set random seed (reproducible)
np.random.randn(r,c) Standard normal distribution
np.random.randint(a,b,n) n random integers from a up to (not including) b
np.linalg.norm(a) Magnitude of vector
np.linalg.inv(m) Inverse of matrix
np.linalg.det(m) Determinant
np.linalg.eig(m) Eigenvalues & eigenvectors
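
A few of these functions are often combined when cleaning a numeric column. The sketch below uses invented sensor-style readings, and the 0 to 40 limits are an arbitrary assumption chosen for the example.

```python
import numpy as np

readings = np.array([12.0, np.nan, 47.0, -3.0, 30.0])   # invented sample data

missing = np.isnan(readings)                      # True where a value is NaN
mean_known = np.nanmean(readings)                 # mean of the non-NaN values
filled = np.where(missing, mean_known, readings)  # NaN -> mean, else keep value
capped = np.clip(filled, 0, 40)                   # force every value into [0, 40]

print(mean_known)
print(filled)
print(capped)
```

np.where picks from its second argument where the condition is True and from its third argument elsewhere, which is why it reads like an element-by-element if-else.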
SECTION 15: Pandas — Working with Tables of Data

Pandas is the most widely used tool for analysing tabular data in Python. It works like a super-powered Excel in code.
The two main structures are Series (one column) and DataFrame (a whole table with rows and columns).
It is used throughout data science workflows: cleaning data, exploring data, and preparing data for machine learning.

15.1 Series — One Column of Data


📝 Code:
import pandas as pd

# Create a Series
marks = pd.Series([85, 92, 78, 95, 88],
                  index=['Ravi','Priya','Arjun','Sita','Ram'])
print(marks)
print('Mean:', marks.mean())
print('Ravi:', marks['Ravi'])
print('Top scorers:')
print(marks[marks > 88])
💻 Output:
Ravi 85
Priya 92
Arjun 78
Sita 95
Ram 88
dtype: int64
Mean: 87.6
Ravi: 85
Top scorers:
Priya 92
Sita 95

15.2 DataFrame — The Full Table


📝 Code:
data = {
    'Name': ['Ravi', 'Priya', 'Arjun', 'Sita', 'Ram'],
    'Age': [16, 15, 17, 16, 15],
    'Score': [85, 92, 78, 95, 88],
    'Grade': ['B', 'A', 'C', 'A+', 'A']
}
df = pd.DataFrame(data)
print(df)
💻 Output:
Name Age Score Grade
0 Ravi 16 85 B
1 Priya 15 92 A
2 Arjun 17 78 C
3 Sita 16 95 A+
4 Ram 15 88 A

15.3 Exploring a DataFrame


📝 Code:
print(df.shape) # (5, 4): 5 rows, 4 columns
print(df.dtypes) # data types of each column
print(df.head(3)) # first 3 rows
print(df.tail(2)) # last 2 rows
df.info() # summary info (prints directly and returns None)
print(df.describe()) # stats for numeric columns
print(df.columns) # column names
print(df.index) # row indices
print(df.isnull().sum()) # count missing values per column
print(df.nunique()) # unique values per column
print(df.sample(2)) # 2 random rows
💻 Output:
(5, 4)
Name: object, Age: int64 ...
first 3 rows...
last 2 rows...
info summary...
count mean std min 25% ...
Index(['Name','Age','Score','Grade'])
RangeIndex(start=0, stop=5)
0 missing values ...

15.4 Selecting Data


📝 Code:
# Select a column
print(df['Name'])
print(df[['Name', 'Score']]) # multiple columns

# Select rows by position: iloc
print(df.iloc[0]) # first row
print(df.iloc[1:3]) # rows 1 and 2 (end position excluded)
print(df.iloc[0, 2]) # row 0, column 2

# Select rows by label: loc
print(df.loc[0, 'Name']) # row 0, Name column
print(df.loc[0:2, ['Name','Score']]) # rows 0 to 2 (end label included), specific cols

# Filtering rows
print(df[df['Score'] > 85]) # students with score > 85
print(df[df['Age'] == 16]) # age 16 only
print(df[(df['Score'] > 80) & (df['Age'] == 15)]) # multiple conditions
💻 Output:
0 Ravi ...
(Name, Score columns)
Name Ravi Age 16 ...
rows 1-2...
85
Ravi
(rows 0-2, Name+Score)
rows with score > 85...
rows where age=16...
filtered rows...
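
The difference between iloc and loc is easiest to see when the row labels are not 0, 1, 2, and so on. This sketch uses a separate, invented DataFrame (df_demo) with labels 10 to 40 so that positions and labels cannot be confused.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {'Name': ['Ravi', 'Priya', 'Arjun', 'Sita'], 'Score': [85, 92, 78, 95]},
    index=[10, 20, 30, 40],   # labels chosen to differ from positions
)

by_position = df_demo.iloc[0:2]   # positions 0 and 1; the end is excluded
by_label = df_demo.loc[10:30]     # labels 10, 20 and 30; the end is included

print(by_position['Name'].tolist())
print(by_label['Name'].tolist())
```

iloc slices behave like ordinary Python list slices, while loc slices include the end label. With a default 0-based index the two look alike, which is why this difference is easy to miss.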

15.5 Modifying DataFrames


📝 Code:
# Add new column
df['Passed'] = df['Score'] >= 80
df['Score_x2'] = df['Score'] * 2

# Update values
df.loc[2, 'Score'] = 82 # fix Arjun's score

# Rename columns
df.rename(columns={'Score': 'Marks'}, inplace=True)

# Drop column
df.drop(columns=['Score_x2'], inplace=True)

# Drop row
df.drop(index=4, inplace=True)

# Reset index
df.reset_index(drop=True, inplace=True)

print(df)
💻 Output:
updated DataFrame printed...

15.6 Handling Missing Data


📝 Code:
import numpy as np

df2 = pd.DataFrame({
    'Name': ['Ravi', 'Priya', None, 'Sita'],
    'Score': [85, None, 78, 95],
    'City': ['Hyd', 'Hyd', None, 'Pune']
})

print(df2.isnull()) # True where value is missing
print(df2.isnull().sum()) # count missing per column

# Fill missing values (assign the result back; inplace=True on a selected
# column is unreliable in recent pandas versions)
df2['Score'] = df2['Score'].fillna(df2['Score'].mean()) # fill with mean
df2['City'] = df2['City'].fillna('Unknown') # fill with text

# Drop rows with ANY missing value
df3 = df2.dropna()

# Drop rows where ALL values are missing
df4 = df2.dropna(how='all')

print(df2)
💻 Output:
NaN → filled or dropped...

15.7 Groupby & Aggregation


GroupBy is like putting students in groups by grade and then calculating stats for each group.
📝 Code:
data = {
    'City': ['Hyd','Hyd','Pune','Pune','Delhi'],
    'Sales': [1000, 1500, 800, 1200, 900],
    'Year': [2023, 2024, 2023, 2024, 2023]
}
df = pd.DataFrame(data)

# Total sales by city
print(df.groupby('City')['Sales'].sum())

# Multiple aggregations
print(df.groupby('City')['Sales'].agg(['mean','sum','min','max']))

# Group by multiple columns
print(df.groupby(['City','Year'])['Sales'].sum())
💻 Output:
City
Delhi 900
Hyd 2500
Pune 2000
(mean, sum, min, max per city)
(city+year grouped sums)
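
When several statistics are needed at once, named aggregation gives each result column a readable name. This is a sketch on the same kind of sales data; the output column names (total_sales, average_sale, orders) are invented for the example.

```python
import pandas as pd

sales = pd.DataFrame({
    'City': ['Hyd', 'Hyd', 'Pune', 'Pune', 'Delhi'],
    'Sales': [1000, 1500, 800, 1200, 900],
})

# Each keyword is a new column: name=(source column, aggregation)
summary = sales.groupby('City').agg(
    total_sales=('Sales', 'sum'),
    average_sale=('Sales', 'mean'),
    orders=('Sales', 'count'),
)

print(summary)
```

The result is an ordinary DataFrame indexed by City, so it can be sorted, filtered or merged like any other table.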

15.8 Sorting & Ranking


📝 Code:
df = pd.DataFrame({'Name':['C','A','B'],'Score':[85,92,78]})

# Sort by column
print(df.sort_values('Score')) # ascending
print(df.sort_values('Score', ascending=False)) # descending
print(df.sort_values(['Name','Score'])) # multiple columns

# Rank
df['Rank'] = df['Score'].rank(ascending=False).astype(int)
print(df)
💻 Output:
sorted by score...
descending...
multi-sorted...
with rank column...
15.9 Merging & Joining DataFrames
📝 Code:
students = pd.DataFrame({'ID':[1,2,3], 'Name':['Ravi','Priya','Arjun']})
scores = pd.DataFrame({'ID':[1,2,4], 'Score':[85,92,78]})

# Inner join: only matching IDs
print(pd.merge(students, scores, on='ID', how='inner'))

# Left join: all students, NaN where no score
print(pd.merge(students, scores, on='ID', how='left'))

# Outer join: everything
print(pd.merge(students, scores, on='ID', how='outer'))

# Concat: stack DataFrames
df1 = pd.DataFrame({'A':[1,2],'B':[3,4]})
df2 = pd.DataFrame({'A':[5,6],'B':[7,8]})
print(pd.concat([df1, df2], ignore_index=True))
💻 Output:
inner...
left...
outer...
concatenated...

15.10 Apply & Map — Custom Transformations


📝 Code:
df = pd.DataFrame({'Name':['Ravi','Priya'],'Score':[85,92]})

# apply: run function on each item
df['Grade'] = df['Score'].apply(lambda x: 'A' if x >= 90 else 'B')
print(df)

# map: replace values using dict
grade_map = {'A': 'Excellent', 'B': 'Good'}
df['Level'] = df['Grade'].map(grade_map)
print(df)

# DataFrame.map: apply to every cell (named applymap before pandas 2.1)
nums = pd.DataFrame({'a':[1,2],'b':[3,4]})
print(nums.map(lambda x: x*2))
💻 Output:
with Grade column...
with Level column...
doubled values...

15.11 Reading & Writing Files


📝 Code:
# CSV
df = pd.read_csv('[Link]')
df = pd.read_csv('[Link]', index_col=0, parse_dates=['date'])
df.to_csv('[Link]', index=False)

# Excel
df = pd.read_excel('[Link]', sheet_name='Sheet1')
df.to_excel('[Link]', index=False)

# JSON
df = pd.read_json('[Link]')
df.to_json('[Link]', orient='records')

# Useful read_csv options


df = pd.read_csv('[Link]',
    sep=',', # separator
    header=0, # row to use as header
    usecols=['A','B'], # load only these columns
    nrows=100, # load only first 100 rows
    na_values=['NA','?'] # treat these as NaN
)
💻 Output:
DataFrame loaded...

15.12 All Important Pandas Functions Reference

Function/Method What it does


pd.read_csv() Load CSV file into DataFrame
pd.read_excel() Load Excel file
df.to_csv() Save to CSV
df.head(n) First n rows (default 5)
df.tail(n) Last n rows
df.info() Column types and non-null counts
df.describe() Statistics for numeric cols
df.shape Rows x Columns
df.dtypes Data type of each column
df.columns Column names
df.index Row indices
df.isnull() True/False for missing values
df.dropna() Remove rows with missing values
df.fillna(v) Fill missing with value v
df.duplicated() Mark duplicate rows
df.drop_duplicates() Remove duplicate rows
df.rename() Rename columns or index
df.astype() Convert column to type
df.sort_values(col) Sort by column
df.groupby(col) Group data by column
df.agg(fn) Aggregate with function
df.pivot_table() Create pivot table
df.melt() Wide to long format
df.merge(df2) SQL-style join
pd.concat([a,b]) Stack DataFrames
df.apply(fn) Apply function to col/row
df.map(fn) Apply function elementwise (applymap before pandas 2.1)
df.value_counts() Count unique values
df.corr() Correlation matrix
df.cov() Covariance matrix
df.sample(n) Random n rows
df.reset_index() Reset to integer index
df.set_index(col) Set column as index
df.copy() Deep copy of DataFrame
df.T Transpose (swap rows/cols)
df.query('col>5') Filter using string query
df.loc[row, col] Access by label
df.iloc[row, col] Access by integer position
df.at[row, col] Single value by label
df.iat[row, col] Single value by integer position
df.clip(lo, hi) Limit values to range
df.round(n) Round to n decimal places
df.sum() Column sums
df.mean() Column means
df.std() Standard deviation
df.var() Variance
df.min()/max() Min/max of each column
df.cumsum() Cumulative sum
df.diff() Difference between rows
df.shift(n) Shift values by n rows
df.pct_change() Percentage change from previous
df.rolling(n) Rolling window operations
df.expanding() Expanding window operations
df['col'].str.contains() String matching in column
df['col'].str.split() Split string column
df['col'].str.strip() Strip whitespace
df['col'].str.lower()/upper() Case conversion
pd.get_dummies(df) One-hot encoding
df.rank() Rank values in column
SECTION 16: Matplotlib & Seaborn — Data Visualization

16.1 Matplotlib Basics


Matplotlib is the base plotting library. Think of it as the pencil and paper for drawing graphs.
📝 Code:
import matplotlib.pyplot as plt
import numpy as np

x = [1, 2, 3, 4, 5]
y = [10, 25, 15, 30, 20]

plt.figure(figsize=(8, 5)) # set size
plt.plot(x, y, color='blue', marker='o', linewidth=2, label='Sales')
plt.title('Monthly Sales')
plt.xlabel('Month')
plt.ylabel('Sales (₹ thousands)')
plt.legend()
plt.grid(True)
plt.savefig('[Link]', dpi=150) # save to file
plt.show()
💻 Output:
[chart saved as [Link]]

16.2 Types of Charts

Chart Type Code + When to use


Line chart plt.plot(x,y): trends over time
Bar chart plt.bar(x,y): comparing categories
Horizontal bar plt.barh(x,y): long category names
Histogram plt.hist(data,bins=20): distribution
Scatter plot plt.scatter(x,y): relationship between 2 vars
Pie chart plt.pie(sizes, labels=...): parts of whole
Box plot plt.boxplot(data): spread & outliers
Multiple plots plt.subplot(1,2,1) then plt.subplot(1,2,2)
Heatmap (seaborn) sns.heatmap(data, annot=True): matrix viz
Pair plot sns.pairplot(df): all variable pairs
Distribution sns.histplot(data, kde=True)

16.3 Seaborn — Beautiful Plots Easily


📝 Code:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load sample dataset (fetched online on first use)
tips = sns.load_dataset('tips')

# Distribution plot
sns.histplot(tips['total_bill'], kde=True)
plt.show()

# Scatter with color grouping
sns.scatterplot(data=tips, x='total_bill', y='tip', hue='sex')
plt.show()

# Box plot
sns.boxplot(data=tips, x='day', y='total_bill', hue='sex')
plt.show()

# Correlation heatmap
corr = tips.corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap='coolwarm', fmt='.2f')
plt.show()
💻 Output:
[charts displayed]
SECTION 17: Scikit-Learn — Machine Learning Basics

Scikit-learn is the standard general-purpose machine learning library for Python. It has ready-made algorithms for:
• Classification (predict which category: spam/not spam, disease/healthy)
• Regression (predict a number: house price, salary)
• Clustering (group similar data: customer segments)
Everything follows the same pattern: Import → Create → Fit → Predict → Evaluate

17.1 The Standard ML Workflow


📝 Code:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Step 1: Prepare data
# X = features (input), y = target (output)
X = df[['feature1', 'feature2', 'feature3']]
y = df['target']

# Step 2: Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
) # 80% train, 20% test

# Step 3: Scale features (important for many algorithms)
# Fit the scaler on the training data only, then reuse it on the test data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Step 4: Create and train model
model = LogisticRegression()
model.fit(X_train, y_train)

# Step 5: Predict
y_pred = model.predict(X_test)

# Step 6: Evaluate
print('Accuracy:', accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
💻 Output:
Accuracy: 0.95...
classification report...
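
The workflow above assumes a DataFrame df with columns feature1, feature2, feature3 and target. To try the same six steps without a dataset, one option is to generate synthetic data with make_classification. This is only a sketch under that assumption; the accuracy it prints says nothing about real data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: synthetic data stands in for df (200 rows, 3 features, 2 classes)
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0,
    random_state=42,
)

# Step 2: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 3: learn the scaling from the training set, apply it to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Steps 4-6: train, predict, evaluate
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)

print(f'Accuracy: {accuracy:.2f}')
```

Calling fit_transform on the training set and only transform on the test set keeps information from the test rows out of the scaler, which is the reason for the two different calls in Step 3.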

17.2 Preprocessing Functions

Function What it does


StandardScaler Scale to mean=0, std=1 (z-score)
MinMaxScaler Scale to 0-1 range
RobustScaler Scales using median (robust to outliers)
LabelEncoder Convert text labels to numbers: cat→0, dog→1
OneHotEncoder Convert categories to binary columns
pd.get_dummies() Same as OneHotEncoder but simpler
SimpleImputer Fill missing values (mean/median/constant)
PolynomialFeatures Add polynomial features (x², x*y)
train_test_split Split data into training and test sets
KFold K-fold cross validation splits
cross_val_score Run cross validation and return scores
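
The two most common scalers in the table are simple formulas. The sketch below reproduces them by hand with the standard library so the arithmetic is visible; the ages are invented sample values.

```python
from statistics import mean, pstdev

ages = [15, 16, 17, 16, 16]   # invented sample values

# Z-score (what StandardScaler computes per column):
# subtract the mean, divide by the population standard deviation
mu = mean(ages)
sigma = pstdev(ages)
z_scores = [(a - mu) / sigma for a in ages]

# Min-max (what MinMaxScaler computes per column):
# shift so the smallest value is 0, then divide by the range
lo, hi = min(ages), max(ages)
min_max = [(a - lo) / (hi - lo) for a in ages]

print([round(z, 3) for z in z_scores])
print(min_max)
```

After z-scoring, the values have mean 0 and standard deviation 1. After min-max scaling, the smallest value is 0 and the largest is 1.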

17.3 Common ML Algorithms

Algorithm When to use / Import


LinearRegression Predict continuous number. sklearn.linear_model
LogisticRegression Binary classification (yes/no). sklearn.linear_model
DecisionTreeClassifier Easy to understand tree-based decisions. sklearn.tree
RandomForestClassifier Many trees, often very accurate. sklearn.ensemble
GradientBoostingClassifier Boosted trees, often top accuracy on tabular data. sklearn.ensemble
KNeighborsClassifier Classify by nearest neighbors. sklearn.neighbors
SVC Support Vector Machine for classification. sklearn.svm
KMeans Unsupervised clustering. sklearn.cluster
PCA Reduce dimensions. sklearn.decomposition
Ridge / Lasso Regularized regression (prevent overfitting). sklearn.linear_model
XGBClassifier Extreme Gradient Boosting (xgboost library)
MultinomialNB / GaussianNB Naive Bayes, fast text classification. sklearn.naive_bayes

17.4 Evaluation Metrics

Metric Use case + Function


accuracy_score % of correct predictions
confusion_matrix Table of TP, TN, FP, FN
classification_report Precision, Recall, F1 per class
roc_auc_score Area under ROC curve (binary classification)
mean_squared_error For regression: average squared error
mean_absolute_error For regression: average absolute error
r2_score Regression: how much variance explained
silhouette_score For clustering quality
cross_val_score Evaluate with cross-validation
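
Most classification metrics in the table come from four counts: true positives, true negatives, false positives and false negatives. The sketch below computes them by hand in plain Python on invented labels, to show what accuracy_score and classification_report are summarising.

```python
y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # invented labels: 1 = spam, 0 = not spam
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # invented model predictions

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # spam caught
tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # normal mail left alone
fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # normal mail flagged
fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # spam missed

accuracy = (tp + tn) / len(pairs)    # share of all predictions that are right
precision = tp / (tp + fp)           # of everything flagged, how much was spam
recall = tp / (tp + fn)              # of all spam, how much was flagged
f1 = 2 * precision * recall / (precision + recall)

print(tp, tn, fp, fn)
print(accuracy, precision, recall, f1)
```

Accuracy alone can mislead when one class is rare, which is why precision and recall are usually reported alongside it.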
SECTION 18: Advanced Python Concepts

18.1 Generators — Lazy Data Processing


A generator is like a list, but it creates items ONE AT A TIME instead of all at once. This saves memory
when working with huge datasets.
📝 Code:
# Generator function — uses 'yield' instead of 'return'
def count_up(n):
    for i in range(n):
        yield i # pause here, give value, resume later

gen = count_up(5)
print(next(gen)) # 0
print(next(gen)) # 1

for val in count_up(5):
    print(val, end=' ')
print()

# Generator expression (like list comprehension but lazy)
squares = (x**2 for x in range(1_000_000)) # barely uses memory!
print(next(squares)) # 0
print(next(squares)) # 1
💻 Output:
0
1
0 1 2 3 4
0
1
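
Generators are often chained so that each value flows through several steps without any intermediate list being built. This sketch uses a hypothetical helper, read_scores, and a short list standing in for the lines of a large file.

```python
import sys

def read_scores(lines):
    """Yield one integer score at a time, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

raw_lines = ['85\n', '\n', '92\n', '78\n']   # stands in for lines of a large file

# The stages are chained; nothing runs until sum() asks for values
passing = (s for s in read_scores(raw_lines) if s >= 80)
total = sum(passing)

# A list stores every item; a generator stores only its current state
as_list = [x * x for x in range(100_000)]
as_gen = (x * x for x in range(100_000))

print(total)
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))
```

A generator can be consumed only once. After sum() has run, passing is empty, so create a new generator if the values are needed again.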

18.2 Decorators — Wrapping Functions


A decorator adds extra behaviour to a function without changing its code. Like wrapping a gift — the gift
is the same, but it has a wrapper.
📝 Code:
import time

def timer(func): # decorator function
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f'{func.__name__} took {end-start:.4f}s')
        return result
    return wrapper

@timer # apply decorator
def slow_function():
    time.sleep(0.1)
    return 'done'

result = slow_function() # automatically timed!
print(result)
💻 Output:
slow_function took 0.1003s
done
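
One detail worth knowing: a plain wrapper hides the original function's name and docstring. functools.wraps copies them across. The decorator below, log_calls, is a hypothetical example written for this sketch.

```python
import functools

def log_calls(func):
    @functools.wraps(func)          # copy func's name and docstring onto wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1          # count how many times func was called
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@log_calls
def add(a, b):
    """Return a + b."""
    return a + b

first = add(1, 2)
second = add(3, 4)

print(first, second)
print(add.__name__, add.calls)
```

Without functools.wraps, add.__name__ would be 'wrapper', which makes logs and help() output harder to read.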

18.3 Context Managers — with Statement


Context managers make sure resources (files, database connections) are properly cleaned up, even if
an error occurs.
📝 Code:
import time

# Custom context manager using class
class Timer:
    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        elapsed = time.time() - self.start
        print(f'Time: {elapsed:.4f}s')

with Timer():
    total = sum(range(1_000_000))
    print(total)
💻 Output:
499999500000
Time: 0.0341s
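
The same idea can be written as a function with contextlib.contextmanager. The sketch below uses a hypothetical resource called managed and records events in a list, to show that the cleanup step runs even when the body raises an error.

```python
from contextlib import contextmanager

events = []   # records the order in which things happen

@contextmanager
def managed(name):
    events.append(f'open {name}')        # setup: runs on entering the with block
    try:
        yield name                       # the with block's body runs here
    finally:
        events.append(f'close {name}')   # cleanup: runs even after an error

try:
    with managed('report') as r:
        events.append(f'use {r}')
        raise ValueError('something went wrong')
except ValueError:
    events.append('error handled')

print(events)
```

The close step is recorded before the error reaches the except block, which is the guarantee that makes with statements safe for files and connections.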
18.4 Regular Expressions
Regular expressions (regex) are patterns for searching and manipulating text. Very useful for data
cleaning.
📝 Code:
import re

text = 'Call us: 9876543210 or 040-23456789 for info@[Link]'

# Find phone numbers
phones = re.findall(r'\d{10}', text)
print(phones)

# Check if email is valid
email = 'test@example.com' # placeholder address
pattern = r'^[\w.-]+@[\w.-]+\.\w+$'
print(bool(re.match(pattern, email)))

# Replace
cleaned = re.sub(r'\s+', ' ', 'too    many   spaces')
print(cleaned)

# Split by multiple delimiters
parts = re.split(r'[,;|]', 'a,b;c|d')
print(parts)
💻 Output:
['9876543210']
True
too many spaces
['a', 'b', 'c', 'd']
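
For data cleaning it often helps to name the parts of a pattern. The sketch below uses named groups to pull dates apart and rewrite them; the sample sentence and the DD-MM-YYYY layout are assumptions made for the example.

```python
import re

# (?P<name>...) gives each captured part a name
date_pattern = re.compile(r'(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})')

line = 'Joined on 15-08-2000, left on 01-02-2010'   # invented sample text

first = date_pattern.search(line)                   # first match only
parts = first.groupdict()                           # named groups as a dict
all_years = [m.group('year') for m in date_pattern.finditer(line)]

# \g<name> refers back to a named group inside the replacement text
iso = date_pattern.sub(r'\g<year>-\g<month>-\g<day>', line)

print(parts)
print(all_years)
print(iso)
```

Compiling the pattern once with re.compile keeps the code tidy when the same pattern is reused for searching, iterating and replacing.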

18.5 map(), filter(), zip() in Depth


📝 Code:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# map — transform each element


squared = list(map(lambda x: x**2, nums))
print(squared)

# filter — keep only matching elements


evens = list(filter(lambda x: x % 2 == 0, nums))
print(evens)
# Combine map and filter
even_squares = list(map(lambda x: x**2, filter(lambda x: x%2==0, nums)))
print(even_squares)

# zip — combine multiple lists


names = ['Ravi', 'Priya', 'Arjun']
scores = [85, 92, 78]
cities = ['Hyd', 'Pune', 'Delhi']

combined = list(zip(names, scores, cities))


print(combined)
💻 Output:
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[2, 4, 6, 8, 10]
[4, 16, 36, 64, 100]
[('Ravi', 85, 'Hyd'), ('Priya', 92, 'Pune'), ('Arjun', 78, 'Delhi')]

18.6 Useful Python Tricks for Data Science


📝 Code:
# Unpacking
a, *middle, z = [1, 2, 3, 4, 5]
print(a, middle, z) # 1 [2, 3, 4] 5

# Walrus operator := (Python 3.8+)


data = [1, 2, 3, 4, 5]
if (n := len(data)) > 3:
    print(f'Long list with {n} items')

# Dictionary unpacking
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3}
merged = {**d1, **d2} # merge dicts
print(merged)

# Conditional expression in list


data = [1, -2, 3, -4, 5]
positive = [x if x > 0 else 0 for x in data]
print(positive)
💻 Output:
1 [2, 3, 4] 5
Long list with 5 items
{'a': 1, 'b': 2, 'c': 3}
[1, 0, 3, 0, 5]
SECTION 19: Quick Reference Cheatsheet

String Formatting Specifiers

Specifier Example
{:.2f} f'{3.14159:.2f}' → '3.14' (2 decimal places)
{:d} f'{42:d}' → '42' (integer)
{:05d} f'{42:05d}' → '00042' (zero padded)
{:>10} f'{"hi":>10}' → '        hi' (right align)
{:<10} f'{"hi":<10}' → 'hi        ' (left align)
{:^10} f'{"hi":^10}' → '    hi    ' (center)
{:,} f'{1234567:,}' → '1,234,567' (commas)
{:.2%} f'{0.856:.2%}' → '85.60%'
{:.2e} f'{1234567:.2e}' → '1.23e+06' (scientific)
{:b} f'{10:b}' → '1010' (binary)
{:x} f'{255:x}' → 'ff' (hexadecimal)
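
Specifiers can be combined in one f-string to line up a small text table. This is a sketch with invented rows; the column widths (8 characters each) are arbitrary choices.

```python
rows = [('Ravi', 85.456, 0.856), ('Priya', 92.1, 0.921)]   # invented sample data

lines = []
for name, score, ratio in rows:
    # name: left aligned in 8 chars; score: right aligned, 2 decimals;
    # ratio: right aligned, shown as a percentage with 1 decimal
    lines.append(f'{name:<8}|{score:>8.2f}|{ratio:>8.1%}')

for line in lines:
    print(line)
```

Each printed line has the same length, so the columns stay aligned regardless of how long the individual values are (up to the chosen width).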

Common Pandas One-Liners for Data Science


📝 Code:
df['col'].value_counts() # count unique values
df['col'].value_counts(normalize=True) # as percentages
df.select_dtypes(include='number') # numeric columns only
df.select_dtypes(include='object') # text columns only
df.corr(numeric_only=True)['target'].sort_values() # correlation with target
df.nlargest(5, 'col') # top 5 rows by column
df.nsmallest(5, 'col') # bottom 5 rows
df['col'].str.contains('keyword') # string search
pd.cut(df['col'], bins=5) # bin into 5 groups
pd.qcut(df['col'], q=4) # quartile bins
df.to_dict('records') # DataFrame to list of dicts
pd.DataFrame(list_of_dicts) # list of dicts to DataFrame
💻 Output:
useful outputs...
The Complete Python Data Science Import Block
# Data manipulation
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
# for Jupyter notebooks only (remove the next line in plain .py scripts)
%matplotlib inline

# Machine Learning
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import (StandardScaler, LabelEncoder,
                                   OneHotEncoder)
from sklearn.impute import SimpleImputer
from sklearn.linear_model import (LinearRegression, LogisticRegression,
                                  Ridge, Lasso)
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             classification_report)
from sklearn.metrics import mean_squared_error, r2_score

# Utilities
import os, sys, re
from datetime import datetime
from collections import Counter, defaultdict
import warnings
warnings.filterwarnings('ignore')

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
np.set_printoptions(precision=4, suppress=True)
🎉 You now have a complete Python sprint for Data Science & AI!
Variables → Strings → Loops → Functions → OOP → NumPy → Pandas → ML
