Python Harvard Notes

The document is a comprehensive set of lecture notes for Harvard's CS 50P course on Python, covering fundamentals to advanced topics across 15 modules. It includes over 900 pages of content, with more than 300 code examples and real-world applications. Key topics include Python's history, installation, data types, operators, and core data structures like lists.

Uploaded by

xlucifer585
Copyright
© All Rights Reserved

Python Comprehensive Notes — Harvard CS 50P Page

PYTHON
COMPREHENSIVE LECTURE NOTES
From Fundamentals to Advanced Mastery

Department of Computer Science


Harvard University
CS 50P — Complete Python Mastery

Covering 15 Modules
900+ Pages of Detailed Content • 300+ Code Examples • Real-World Applications

MODULE 1

Introduction to Python

1. Introduction to Python

1.1 History and Philosophy


Python was created by Guido van Rossum and first released in 1991. It is named after Monty Python's Flying
Circus, not the snake. Van Rossum designed Python around the idea that code should be readable, beautiful, and
explicit rather than implicit. This philosophy is codified in PEP 20, "The Zen of Python."

The Zen of Python (PEP 20)


Tim Peters distilled the guiding principles of Python's design. To read it, type import this in any Python
interpreter:
import this
# Beautiful is better than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than complicated.
# Flat is better than nested.
# Sparse is better than dense.
# Readability counts.
# Special cases aren't special enough to break the rules.
# Although practicality beats purity.
# Errors should never pass silently.
# Unless explicitly silenced.
# In the face of ambiguity, refuse the temptation to guess.
# There should be one-- and preferably only one --obvious way to do it.
# Now is better than never.

1.2 Python Versions


Python 2 reached end-of-life on January 1, 2020. All modern development uses Python 3. Key differences
include print() as a function (not a statement), / performing true division (with // for floor division),
Unicode strings by default, and range() returning a lazy sequence rather than a list. Use Python 3.10+ for new projects.
# Check Python version
import sys
print(sys.version) # e.g., a string starting with 3.12.0
print(sys.version_info) # sys.version_info(major=3, minor=12, ...)
print(sys.version_info >= (3, 10)) # True

1.3 Installation and Setup


Installing Python
• Download from python.org (official) or use pyenv for version management
• On macOS: brew install python3
• On Ubuntu/Debian: sudo apt-get install python3
• Verify: python3 --version

Virtual Environments (CRITICAL PRACTICE)


A virtual environment is an isolated Python environment allowing separate dependencies per project. Always
use virtual environments — never install packages globally.
# Create a virtual environment
python3 -m venv myenv

# Activate (macOS/Linux)
source myenv/bin/activate

# Activate (Windows)
myenv\Scripts\activate.bat

# Install a package inside the venv
pip install requests

# Freeze requirements
pip freeze > requirements.txt

# Install from requirements
pip install -r requirements.txt

# Deactivate
deactivate

💡 TIP: Use pyenv + pyenv-virtualenv for managing multiple Python versions alongside multiple virtual
environments.

1.4 Python Execution Model


Python is an interpreted, dynamically typed language. When you run a .py file, CPython (the reference
implementation) compiles it to bytecode, then the Python Virtual Machine (PVM) executes that bytecode.
Bytecode for imported modules is cached in .pyc files. This happens automatically and transparently.
# [Link]
print("Hello, Harvard!")

# Run it:
# $ python3 [Link]
# Hello, Harvard!

# Imported modules have their bytecode cached as .pyc files in __pycache__/
# These cache files are safe to ignore or delete

📘 NOTE: Python also supports interactive mode (REPL — Read-Eval-Print Loop) via python3. Use ipython for
an enhanced interactive experience with syntax highlighting and auto-completion.
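
The compile-to-bytecode step described in this section can be observed with the standard dis module. This is a minimal sketch; the function add is a made-up example, and the exact instruction names differ between CPython versions.

```python
import dis

def add(a, b):
    return a + b

# The raw bytecode is stored on the function's code object as a bytes string.
raw_bytecode = add.__code__.co_code

# dis decodes it into named instructions; exact opcodes vary by CPython version.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

Calling dis.dis(add) instead prints a formatted listing, which is handy in the REPL.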

MODULE 2

Variables, Data Types & Operators

2. Variables, Data Types & Operators

2.1 Variables and Assignment


Python variables are dynamic references to objects. A variable is not a box containing a value — it is a label
pointing to an object in memory. Multiple variables can point to the same object.
# Simple assignment
x = 42
name = "Alice"
pi = 3.14159

# Multiple assignment
a = b = c = 0 # All point to same object 0

# Tuple unpacking (very Pythonic)


x, y, z = 1, 2, 3
first, *rest = [1, 2, 3, 4, 5] # first=1, rest=[2,3,4,5]
a, _, b = (10, 20, 30) # _ is convention for "don't care"

# Swap without temp variable (Python idiom)


x, y = y, x

# Augmented assignment
x += 5 # x = x + 5
x -= 2 # x = x - 2
x *= 3 # x = x * 3
x //= 2 # x = x // 2 (floor division)
x **= 2 # x = x ** 2 (power)

📘 NOTE: Python uses dynamic typing — you don't declare types. The variable's type is determined at
runtime by the object it references. Use type() or isinstance() to check types.
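
The "label, not box" model and dynamic typing can be checked directly. The sketch below uses only built-ins; the variable names are illustrative.

```python
a = [1, 2, 3]
b = a                    # b is a second label for the same list object
b.append(4)              # the change is visible through both names
same_object = a is b     # True

c = list(a)              # shallow copy: equal value, different object
equal_but_distinct = (c == a) and (c is not a)

x = 42
first_type = type(x).__name__     # 'int'
x = "forty-two"                   # the same name now refers to a str
second_type = type(x).__name__    # 'str'

bool_is_int = isinstance(True, int)   # isinstance respects subclassing
```

Prefer isinstance() over comparing type() results when subclasses should be accepted.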

2.2 Built-in Data Types


Numeric Types
Python has three distinct numeric types: integers (int), floating-point numbers (float), and complex numbers
(complex). Python integers have arbitrary precision — they can be as large as memory allows.
# int — arbitrary precision integers
x = 42
big = 99999999999999999999999999999 # No overflow!
binary = 0b1010 # Binary literal = 10
octal = 0o17 # Octal literal = 15
hexa = 0xFF # Hex literal = 255

# float — 64-bit IEEE 754 double precision


pi = 3.14159265358979
sci = 1.5e-10 # Scientific notation
inf = float('inf') # Positive infinity
nan = float('nan') # Not a Number

# complex: real + imaginary parts
z = 3 + 4j
print(z.real) # 3.0
print(z.imag) # 4.0
print(abs(z)) # 5.0 (magnitude = sqrt(3²+4²))

# Type conversions
int(3.9) # 3 (truncates, does NOT round)
float(7) # 7.0
round(3.567, 2) # 3.57

⚠️WARNING: Floating-point arithmetic is not exact due to IEEE 754 binary representation. 0.1 + 0.2 != 0.3
in Python. Use the decimal module for exact decimal arithmetic (finance, science).
from decimal import Decimal, getcontext
getcontext().prec = 50 # 50 significant digits

a = Decimal("0.1")
b = Decimal("0.2")
print(a + b) # 0.3 (exact!)

# Also: fractions for exact rational arithmetic


from fractions import Fraction
f = Fraction(1, 3) # Exactly 1/3
print(f + Fraction(1, 6)) # 1/2
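
When plain floats are good enough, a tolerance-based comparison is a common alternative to switching numeric types. A small sketch using math.isclose from the standard library:

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
exact_equal = (total == 0.3)            # False: binary rounding error
close_enough = math.isclose(total, 0.3) # True: compares within a relative tolerance

# Build Decimals from strings; a float argument carries its rounding error along.
from_float = Decimal(0.1)
from_string = Decimal("0.1")
differs = from_float != from_string     # True
```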

Strings (str)
Strings in Python 3 are immutable sequences of Unicode code points. (UTF-8 is the default encoding for source
files and for str.encode(), not the in-memory representation.) They support a rich API for text manipulation.
# String literals — 4 ways
s1 = 'single quotes'
s2 = "double quotes"
s3 = '''triple single — spans
multiple lines'''
s4 = """triple double — same idea"""

# Raw strings (backslash not treated as escape)


path = r"C:\Users\Alice\Documents" # Useful for regex, Windows paths

# f-strings (Python 3.6+) — preferred for formatting


name = "Alice"
gpa = 3.9876
msg = f"Student {name} has GPA {gpa:.2f}" # "Student Alice has GPA 3.99"

# f-string debug mode (Python 3.8+)


x = 42
print(f"{x=}") # x=42 (prints name AND value)

# Old-style formatting (still common in legacy code)


"%s has %.2f" % (name, gpa)

# str.format()
"{} has {:.2f}".format(name, gpa)

# String operations
s = "Hello, World!"
len(s) # 13
s.upper() # "HELLO, WORLD!"
s.lower() # "hello, world!"
s.strip() # removes leading/trailing whitespace
s.lstrip("H") # "ello, World!"
s.replace("World", "Python") # "Hello, Python!"
s.split(", ") # ["Hello", "World!"]
", ".join(["a","b","c"]) # "a, b, c"
s.startswith("He") # True
s.endswith("!") # True
s.find("World") # 7 (index, -1 if not found)
s.count("l") # 3
s.isdigit() # False
" ".isspace() # True

String Slicing — Deep Dive


String slicing uses the syntax s[start:stop:step]. Start is inclusive, stop is exclusive. Negative indices count from
the end. This is one of Python's most powerful features.
s = "Python Programming"
# 0123456789...

s[0] # 'P'
s[-1] # 'g' (last character)
s[0:6] # 'Python'
s[7:] # 'Programming'
s[:6] # 'Python'
s[::2] # every 2nd char: 'Pto rgamn'
s[::-1] # reverse: 'gnimmargorP nohtyP'
s[7:11] # 'Prog'

# Slicing never raises IndexError — it clips silently


s[100:] # '' (empty string, no error)

Booleans (bool)
bool is a subclass of int. True equals 1 and False equals 0. This means True + True == 2, which is valid but
usually a code smell.
True == 1 # True
False == 0 # True
bool(0) # False
bool(1) # True
bool("") # False (empty string is falsy)
bool("hi") # True
bool([]) # False (empty list is falsy)
bool([0]) # True (list with one element, even 0)
bool(None) # False

# Falsy values in Python:


# False, 0, 0.0, 0j, "", [], (), {}, set(), None, range(0)
# Everything else is truthy

None
None is Python's null value. It is the sole instance of NoneType. Used to represent absence of a value,
uninitialized variables, or default function returns. Always compare with 'is None', not '== None'.
x = None
print(x is None) # True (CORRECT)
print(x == None) # True (works but discouraged)

# Functions return None implicitly


def greet(name):
    print(f"Hello, {name}")
    # No explicit return, so the function returns None

2.3 Operators
Arithmetic Operators
10 + 3 # 13 (addition)
10 - 3 # 7 (subtraction)
10 * 3 # 30 (multiplication)
10 / 3 # 3.333... (true division — always float)
10 // 3 # 3 (floor division — rounds toward -infinity)
10 % 3 # 1 (modulo — remainder)
10 ** 3 # 1000 (exponentiation)

# Floor division with negatives (important!)


-7 // 2 # -4 (rounds DOWN, not toward zero)
-7 % 2 # 1 (consistent with floor division)
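
Floor division and modulo are tied together by the identity a == b * (a // b) + (a % b). The sketch below checks it for every sign combination with the built-in divmod(); the sample pairs are arbitrary.

```python
import math

pairs = [(7, 2), (-7, 2), (7, -2), (-7, -2)]
results = {}
for a, b in pairs:
    q, r = divmod(a, b)        # same as (a // b, a % b)
    assert a == b * q + r      # the identity holds for every sign combination
    results[(a, b)] = (q, r)

# Truncation toward zero (as in C) is available explicitly.
truncated = math.trunc(-7 / 2)   # -3, whereas -7 // 2 is -4
```

The remainder always takes the sign of the divisor, which is why -7 % 2 is 1 and 7 % -2 is -1.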

Comparison and Logical Operators


# Comparison (return bool)
5 == 5 # True (equality)
5 != 4 # True (not equal)
5 > 4 # True
5 >= 5 # True
5 < 6 # True

# IMPORTANT: Chained comparisons (uncommon in other mainstream languages)


0 < x < 10 # True if x is between 0 and 10, exclusive
1 <= age <= 120

# Identity vs Equality
a = [1, 2, 3]
b = [1, 2, 3]
a == b # True (equal values)
a is b # False (different objects)
a is not b # True

# Logical operators (short-circuit)


True and False # False (evaluates right only if left is True)
True or False # True (evaluates right only if left is False)
not True # False

# Short-circuit evaluation with side effects


def expensive(): print("called!"); return True
False and expensive() # "called!" is NOT printed
True or expensive() # "called!" is NOT printed

Bitwise Operators
a = 0b1100 # 12
b = 0b1010 # 10

a & b # 0b1000 = 8 (AND)


a | b # 0b1110 = 14 (OR)
a ^ b # 0b0110 = 6 (XOR)
~a # -13 (NOT — inverts all bits)
a << 2 # 0b110000=48 (left shift by 2)
a >> 1 # 0b0110 = 6 (right shift by 1)

# Practical use: checking if a number is even/odd


(n & 1) == 0 # True if even (equivalent to n % 2 == 0; parentheses added for readability)

MODULE 3

Data Structures

3. Core Data Structures

3.1 Lists
A list is a mutable, ordered sequence of objects. It can contain elements of mixed types. Lists are backed by a
dynamic array and support O(1) indexing, O(1) amortized append, but O(n) insert/delete in the middle.
# Creation
empty = []
nums = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True, None]
nested = [[1,2],[3,4],[5,6]]

# Indexing and slicing (same as strings)


nums[0] # 1
nums[-1] # 5
nums[1:4] # [2, 3, 4]
nums[::2] # [1, 3, 5]

# Mutability — lists can be changed


nums[0] = 99
nums.append(6) # Add to end: [99,2,3,4,5,6]
nums.insert(0, 0) # Insert at index 0
nums.extend([7, 8]) # Extend with iterable
nums.pop() # Remove and return last: 8
nums.pop(0) # Remove and return index 0: 0
nums.remove(99) # Remove first occurrence of 99
del nums[0] # Delete by index
nums.clear() # Empty the list

# Searching and sorting
nums = [3, 1, 4, 1, 5, 9, 2, 6]
nums.index(4) # 2 (index of first 4)
nums.count(1) # 2 (count of 1s)
nums.sort() # In-place sort: [1,1,2,3,4,5,6,9]
nums.sort(reverse=True) # Descending
nums.reverse() # Reverse in place

sorted_copy = sorted(nums) # Returns new sorted list


sorted_custom = sorted(nums, key=abs) # Sort by absolute value

# List concatenation and repetition


a = [1, 2] + [3, 4] # [1, 2, 3, 4]
b = [0] * 5 # [0, 0, 0, 0, 0]

# Membership test
4 in nums # True
10 in nums # False

# Unpacking
first, *middle, last = [1, 2, 3, 4, 5]
# first=1, middle=[2,3,4], last=5

💡 TIP: Use list.sort() for in-place sorting (mutates the list). Use sorted() to get a new sorted list without
modifying the original. Both accept a key= function.
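
A short sketch of key= in practice, using made-up (name, GPA) records. It also shows that Python's sort is stable (ties keep their original order) and that list.sort() returns None.

```python
records = [("Carol", 3.7), ("Alice", 3.9), ("Bob", 3.7)]

# Sort by GPA only; Carol stays ahead of Bob because the sort is stable.
by_gpa = sorted(records, key=lambda r: r[1])

# Tuple keys give multi-level ordering: GPA descending, then name ascending.
by_gpa_then_name = sorted(records, key=lambda r: (-r[1], r[0]))

# list.sort() mutates in place and returns None, so do not assign its result.
returned = records.sort()
```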

3.2 Tuples
A tuple is an immutable, ordered sequence. Once created, elements cannot be added, removed, or changed.
Tuples are hashable (if all elements are hashable) and can be used as dictionary keys. They are slightly faster
than lists and signal immutability of data.
# Creation
empty = ()
single = (42,) # Trailing comma is REQUIRED for single element!
coords = (3, 4)
triple = ("Alice", 30, "Harvard")

# Parentheses are actually optional — commas make a tuple


x = 1, 2, 3 # Same as (1, 2, 3)

# Indexing (same as list)


coords[0] # 3
coords[-1] # 4

# Unpacking (elegant!)
name, age, school = triple
lat, lng = 42.3601, -71.0589 # Boston coords

# Named tuples — give fields names


from collections import namedtuple
Point = namedtuple('Point', ['x', 'y', 'z'])
p = Point(1, 2, 3)
print(p.x, p.y, p.z) # Access by name
print(p[0], p[1], p[2]) # Also by index

# Python 3.6+ typing.NamedTuple (cleaner)
from typing import NamedTuple

class Student(NamedTuple):
    name: str
    gpa: float
    year: int = 1 # Default value

alice = Student("Alice", 3.9)
print(alice.name) # "Alice"
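
Because tuples of hashable elements are themselves hashable, they work as dictionary keys where lists do not. A minimal sketch with a hypothetical grid of coordinates:

```python
grid = {}
grid[(0, 0)] = "origin"
grid[(1, 2)] = "treasure"

# A list is mutable and therefore unhashable, so it cannot be a key.
try:
    grid[[1, 2]] = "oops"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# A tuple that contains a list is not hashable either.
try:
    hash((1, [2]))
    nested_hashable = True
except TypeError:
    nested_hashable = False
```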

3.3 Dictionaries
A dictionary (dict) is a mutable, ordered (Python 3.7+) mapping of key-value pairs. Implemented as a hash
table, it provides O(1) average-case lookup, insertion, and deletion. Keys must be hashable, which in practice
means immutable values such as strings, numbers, or tuples of hashable items.
# Creation
empty = {}
student = {"name": "Alice", "gpa": 3.9, "year": 2}
d = dict(name="Bob", gpa=3.7)
d = dict([("a", 1), ("b", 2)]) # From iterable of pairs

# Access
student["name"] # "Alice"
student.get("major") # None (safe, no KeyError)
student.get("major", "Undeclared") # "Undeclared"

# Modification
student["year"] = 3 # Update
student["major"] = "CS" # Add new key
del student["year"] # Delete
popped = student.pop("gpa") # Remove and return

# Iteration
for key in student: # Keys
    print(key)
for value in student.values(): # Values
    print(value)
for key, value in student.items(): # Key-value pairs
    print(f"{key}: {value}")

# Merging dicts (Python 3.9+)


d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}
merged = d1 | d2 # {"a":1, "b":3, "c":4} (d2 wins)
d1 |= d2 # Update d1 in-place

# dict.update() (works in all versions)
d1.update(d2)

# Dictionary comprehension
squares = {x: x**2 for x in range(1, 6)}
# {1:1, 2:4, 3:9, 4:16, 5:25}

# Nested dict
university = {
    "CS": {"students": 500, "faculty": 40},
    "Math": {"students": 300, "faculty": 25},
}
university["CS"]["students"] # 500

# Counting with get(): read the current count (default 0), then add 1
# (setdefault(key, default) is related: it inserts the default if the key is missing, then returns the value)
counts = {}
for char in "mississippi":
    counts[char] = counts.get(char, 0) + 1
# Better: collections.Counter

📘 NOTE: In Python 3.7+, dicts maintain insertion order. This is part of the language specification, not just
CPython implementation detail.
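
A small sketch of setdefault() together with the insertion-order guarantee, grouping a made-up word list by first letter:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for word in words:
    # setdefault inserts an empty list the first time a letter is seen,
    # and always returns the list stored under that key.
    groups.setdefault(word[0], []).append(word)

# Keys come back in the order they were first inserted.
letters = list(groups)
```

collections.defaultdict(list) expresses the same grouping without the explicit setdefault() call.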

3.4 Sets
A set is a mutable, unordered collection of unique, hashable elements. Implemented as a hash table. Provides
O(1) average membership testing, O(min(a,b)) intersection, and O(a+b) union.
# Creation
empty = set() # NOT {} — that's an empty dict!
fruits = {"apple", "banana", "cherry"}
from_list = set([1, 2, 2, 3, 3, 3]) # {1, 2, 3}

# Membership (O(1)) — much faster than list


"apple" in fruits # True

# Add and remove
fruits.add("date")
fruits.discard("banana") # No error if not present
fruits.remove("cherry") # KeyError if not present

# Set operations (mathematical set theory)


a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

a | b # Union: {1,2,3,4,5,6}
a & b # Intersection: {3,4}
a - b # Difference: {1,2} (in a but not b)
a ^ b # Symmetric: {1,2,5,6} (in one but not both)
a.union(b) # Same as a | b
a.intersection(b) # Same as a & b
a.difference(b) # Same as a - b
a.symmetric_difference(b) # Same as a ^ b

# Subset / superset
{1,2}.issubset({1,2,3}) # True
{1,2,3}.issuperset({1,2}) # True

# frozenset — immutable set, can be dict key


fs = frozenset([1, 2, 3])
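
Two common uses of these properties, sketched with made-up data: removing duplicates while keeping first-seen order, and using a frozenset as an order-insensitive dictionary key.

```python
# Deduplicate while preserving order: the set gives O(1) average "seen?" checks.
seen = set()
unique = []
for item in [3, 1, 3, 2, 1]:
    if item not in seen:
        seen.add(item)
        unique.append(item)

# frozenset is hashable, so it can key a dict; element order does not matter.
distance = {frozenset({"Boston", "NYC"}): 306}
looked_up = distance[frozenset({"NYC", "Boston"})]
```

The value 306 is only a placeholder for illustration.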

3.5 Collections Module — Advanced Data Structures


from collections import (
    Counter, defaultdict, OrderedDict,
    deque, ChainMap, UserList
)

# Counter — count occurrences


from collections import Counter
words = "the cat sat on the mat the cat".split()
c = Counter(words)
# Counter({'the': 3, 'cat': 2, 'sat': 1, 'on': 1, 'mat': 1})
c.most_common(2) # [('the', 3), ('cat', 2)]
c["the"] # 3
c["elephant"] # 0 (no KeyError!)
c + Counter(["the", "dog"]) # Combine counters

# defaultdict — auto-create missing keys


from collections import defaultdict
dd = defaultdict(list)
dd["CS"].append("Alice") # No KeyError for new key
dd["Math"].append("Bob")

word_groups = defaultdict(lambda: "unknown")


word_groups["Python"] # "unknown"

# deque — double-ended queue, O(1) append/pop from both ends


from collections import deque
dq = deque([1, 2, 3], maxlen=5)
dq.appendleft(0) # O(1): [0, 1, 2, 3]
dq.append(4) # O(1): [0, 1, 2, 3, 4]
dq.popleft() # O(1): returns 0
dq.rotate(1) # Rotate right by 1

# OrderedDict: maintains insertion order (historical; plain dicts do too since Python 3.7)
# OrderedDict still offers move_to_end() and order-sensitive equality
od = OrderedDict()
od["first"] = 1
od["second"] = 2
od.move_to_end("first") # Move to end
od.move_to_end("first", last=False) # Move to front
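
The maxlen argument of deque is useful for sliding windows: once the deque is full, each append silently drops the oldest item from the other end. A minimal moving-average sketch over made-up readings:

```python
from collections import deque

window = deque(maxlen=3)   # keeps only the 3 most recent values
averages = []
for value in [10, 20, 30, 40, 50]:
    window.append(value)   # when full, the oldest value is discarded
    averages.append(sum(window) / len(window))

final_window = list(window)
```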

MODULE 4

Control Flow

4. Control Flow

4.1 Conditional Statements


# if / elif / else
age = 20
if age < 13:
    print("Child")
elif age < 18:
    print("Teenager")
elif age < 65:
    print("Adult")
else:
    print("Senior")

# Ternary (conditional expression), single line
label = "Adult" if age >= 18 else "Minor"

# Nested ternary (use sparingly!)
score = 85 # Sample value so the line below runs
grade = "A" if score >= 90 else ("B" if score >= 80 else "C")

# Match statement (Python 3.10+): structural pattern matching
command = "quit"
match command:
    case "quit" | "exit":
        print("Goodbye!")
    case "hello" | "hi":
        print("Hello!")
    case str(msg) if len(msg) > 50: # Guard
        print(f"Long message: {msg}")
    case _: # Wildcard (default)
        print("Unknown command")

# Match with data classes
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

def describe(shape):
    match shape:
        case Point(x=0, y=0):
            return "Origin"
        case Point(x=0, y=y):
            return f"Y-axis at {y}"
        case Point(x=x, y=0):
            return f"X-axis at {x}"
        case Point(x=x, y=y):
            return f"Point at ({x}, {y})"
        case _:
            return "Not a point"

4.2 Loops
for Loops
Python's for loop iterates over any iterable — not just ranges. It is implemented by calling iter() on the iterable,
then repeatedly calling next().
# Iterate over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

# range(): generates integers
for i in range(5): # 0, 1, 2, 3, 4
    print(i)
for i in range(2, 10, 2): # 2, 4, 6, 8 (start, stop, step)
    print(i)
for i in range(10, 0, -1): # 10, 9, ..., 1 (countdown)
    print(i)

# enumerate(): index + value (THE PYTHONIC WAY)
for i, fruit in enumerate(fruits):
    print(f"{i}: {fruit}")
for i, fruit in enumerate(fruits, start=1): # Start counting from 1
    print(f"{i}. {fruit}")

# zip(): iterate multiple iterables simultaneously
names = ["Alice", "Bob", "Carol"]
scores = [95, 87, 92]
for name, score in zip(names, scores):
    print(f"{name}: {score}")
# zip stops at the shortest input; use zip_longest to continue to the longest

from itertools import zip_longest
for name, score in zip_longest(names, scores, fillvalue=0):
    pass

# Iterate dict
d = {"a": 1, "b": 2, "c": 3}
for key in d: # or d.keys()
    print(key)
for val in d.values():
    print(val)
for k, v in d.items():
    print(k, v)
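
The iter()/next() mechanism described at the start of this subsection can be written out by hand. This sketch is roughly what a for loop does behind the scenes; the list is sample data.

```python
fruits = ["apple", "banana", "cherry"]

iterator = iter(fruits)      # ask the iterable for an iterator
collected = []
while True:
    try:
        item = next(iterator)    # fetch the next value
    except StopIteration:        # raised when the iterator is exhausted
        break
    collected.append(item)

# An exhausted iterator stays exhausted; next() accepts a default to avoid the exception.
after_end = next(iterator, "no more items")
```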

while Loops
# while: runs as long as condition is True
n = 1
while n < 100:
    n *= 2
print(n) # 128

# while with else (runs when condition becomes False, NOT when break)
n = 10
while n > 0:
    n -= 3
else:
    print(f"Loop ended normally, n = {n}")

# Infinite loop with break
import random
while True:
    num = random.randint(1, 10)
    if num == 7:
        print("Got 7!")
        break

Loop Control: break, continue, else


# break: exit loop immediately
for i in range(10):
    if i == 5:
        break
    print(i) # 0, 1, 2, 3, 4

# continue: skip current iteration
for i in range(10):
    if i % 2 == 0:
        continue
    print(i) # 1, 3, 5, 7, 9

# for-else: else runs if loop completed without break
# Used to search for an item
def find_prime_factor(n):
    for i in range(2, n):
        if n % i == 0:
            print(f"{n} is divisible by {i}")
            break
    else:
        print(f"{n} is prime!")

find_prime_factor(17) # "17 is prime!"
find_prime_factor(18) # "18 is divisible by 2"

💡 TIP: The for-else and while-else constructs are unusual among mainstream languages. The else block runs only
when the loop exhausted the iterable (for) or the condition became False (while), NOT when a break occurred.

MODULE 5

Functions — Deep Dive

5. Functions — Complete Reference

5.1 Defining and Calling Functions


# Basic function
def greet(name):
    """Return a greeting string. (This is a docstring)"""
    return f"Hello, {name}!"

result = greet("Alice") # "Hello, Alice!"

# Multiple return values (actually returns a tuple)
def min_max(numbers):
    return min(numbers), max(numbers)

lo, hi = min_max([3, 1, 4, 1, 5, 9])
# lo = 1, hi = 9

# Function with no return statement returns None
def say_hi():
    print("Hi!")

result = say_hi() # prints "Hi!", result is None

5.2 Parameters and Arguments


# Default arguments
def power(base, exponent=2):
    return base ** exponent

power(3) # 9 (exponent defaults to 2)
power(3, 3) # 27

# Keyword arguments: can pass in any order
def register(name, age, course):
    print(f"{name}, {age}, {course}")

register(age=20, course="CS", name="Alice") # Order doesn't matter

# *args: variable positional arguments (tuple)
def total(*args):
    return sum(args)

total(1, 2, 3, 4, 5) # 15

# **kwargs: variable keyword arguments (dict)
def profile(**kwargs):
    for key, val in kwargs.items():
        print(f" {key}: {val}")

profile(name="Alice", gpa=3.9, year=2)

# Combining all types. ORDER MATTERS:
# positional, *args, keyword-only, **kwargs
def complex_func(a, b, *args, option=False, **kwargs):
    print(f"a={a}, b={b}")
    print(f"args={args}")
    print(f"option={option}")
    print(f"kwargs={kwargs}")

complex_func(1, 2, 3, 4, 5, option=True, x=10, y=20)

# Positional-only parameters (Python 3.8+, using /)
def strictly_positional(a, b, /, c, d):
    pass # a, b must be positional; c, d can be either

# Keyword-only parameters (after *)
def keyword_only(a, b, *, force=False):
    pass # force must always be keyword arg

⚠️WARNING: Never use mutable default arguments! In def func(data=[]): data.append(1), the list is created
ONCE and shared across all calls. Use None as the default and create the list inside the function.
# WRONG!
def append_item(item, lst=[]):
    lst.append(item)
    return lst

append_item(1) # [1]
append_item(2) # [1, 2] (BUG: same list!)

# CORRECT
def append_item(item, lst=None):
    if lst is None:
        lst = []
    lst.append(item)
    return lst
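
The shared default can be seen directly on the function object: default values are evaluated once, at definition time, and stored in __defaults__. A minimal sketch with hypothetical function names:

```python
def append_shared(item, lst=[]):      # the anti-pattern, kept here for inspection
    lst.append(item)
    return lst

append_shared(1)
append_shared(2)
stored_default = append_shared.__defaults__   # the one shared list, now mutated

def append_fresh(item, lst=None):     # the None-sentinel fix
    if lst is None:
        lst = []
    lst.append(item)
    return lst

first = append_fresh(1)
second = append_fresh(2)
```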

5.3 Scope and LEGB Rule


Python resolves variable names using the LEGB rule: Local → Enclosing → Global → Built-in. Understanding
scope is critical for writing correct code.
x = "global"

def outer():
    x = "enclosing"

    def inner():
        x = "local"
        print(x) # "local" (L)

    inner()
    print(x) # "enclosing" (E)

outer()
print(x) # "global" (G)

# global keyword: modify global from inside function
count = 0
def increment():
    global count # Declare intent to modify global
    count += 1

# nonlocal keyword: modify enclosing scope
def make_counter():
    count = 0
    def counter():
        nonlocal count # Modify enclosing count
        count += 1
        return count
    return counter

c = make_counter()
c() # 1
c() # 2
c() # 3

5.4 Lambda Functions


Lambda creates an anonymous single-expression function. Syntactically limited — no statements, no
assignments. Best used as short callbacks passed to sorted(), map(), filter().
# Lambda syntax: lambda parameters: expression
square = lambda x: x ** 2
add = lambda x, y: x + y
noop = lambda: None

# Primary use: as key functions


students = [("Alice", 3.9), ("Bob", 3.7), ("Carol", 4.0)]
sorted_by_gpa = sorted(students, key=lambda s: s[1])
# Sort descending
sorted_desc = sorted(students, key=lambda s: s[1], reverse=True)

# With map() and filter()


nums = [1, 2, 3, 4, 5, 6]
squares = list(map(lambda x: x**2, nums)) # [1,4,9,16,25,36]
evens = list(filter(lambda x: x%2==0, nums)) # [2,4,6]

# Tip: comprehensions are usually cleaner
squares = [x**2 for x in nums]
evens = [x for x in nums if x % 2 == 0]

5.5 Closures
A closure is a function that captures variables from its enclosing scope. The captured variables are stored in the
function's __closure__ attribute. Used extensively for decorators, callbacks, and factory functions.
def multiplier(factor):
    """Factory function returning a multiplier closure."""
    def multiply(n):
        return n * factor # 'factor' is captured from enclosing scope
    return multiply

double = multiplier(2)
triple = multiplier(3)
double(5) # 10
triple(5) # 15

# Inspecting closure
print(double.__closure__) # (<cell at 0x...>,)
print(double.__closure__[0].cell_contents) # 2

# Practical: memoization via closure
def make_memoized(func):
    cache = {}
    def memoized(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return memoized

@make_memoized
def fib(n):
    return n if n < 2 else fib(n-1) + fib(n-2)

print(fib(100)) # Fast!
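
One consequence of closures capturing variables rather than values is the late-binding pitfall: functions created in a loop all see the loop variable's final value. The sketch below shows the problem and one common workaround (binding the current value as a default argument).

```python
# Each lambda closes over the variable i itself, which is 2 once the loop ends.
late_bound = [lambda: i for i in range(3)]
late_results = [f() for f in late_bound]

# A default argument is evaluated when the lambda is created, freezing the value.
early_bound = [lambda i=i: i for i in range(3)]
early_results = [f() for f in early_bound]
```

A factory function such as multiplier(factor) above avoids the issue in the same way, since each call creates its own enclosing scope.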

5.6 Decorators
A decorator is a higher-order function that wraps another function to extend its behavior without modifying
the original. Uses @syntax which is syntactic sugar.
import functools
import random
import time

# Basic decorator
def timer(func):
    @functools.wraps(func) # Preserves function metadata
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} took {end-start:.4f}s")
        return result
    return wrapper

@timer
def slow_add(a, b):
    time.sleep(0.1)
    return a + b

slow_add(3, 4)
# slow_add took 0.1001s (approximately)
# equivalent to: slow_add = timer(slow_add)

# Decorator with arguments
def repeat(times):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def say(msg):
    print(msg)

say("Hello!") # prints "Hello!" 3 times

# Stacking decorators (applied bottom-up)
@timer
@repeat(2)
def greet(name):
    print(f"Hi {name}")
# Applied as: greet = timer(repeat(2)(greet))

# Class-based decorator
class Retry:
    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(self.max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == self.max_attempts - 1:
                        raise
                    print(f"Attempt {attempt+1} failed: {e}")
        return wrapper

@Retry(max_attempts=3)
def unreliable_function():
    if random.random() < 0.7:
        raise ValueError("Random failure!")
    return "success"
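
The effect of functools.wraps is easiest to see by comparing a decorator that uses it with one that does not. The decorator and function names below are hypothetical.

```python
import functools

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @functools.wraps(func)   # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def area(r):
    """Area of a circle."""
    return 3.14159 * r ** 2

@with_wraps
def volume(r):
    """Volume of a sphere."""
    return 4 / 3 * 3.14159 * r ** 3

lost_name = area.__name__        # metadata of the inner wrapper leaks out
kept_name = volume.__name__
kept_doc = volume.__doc__
original = volume.__wrapped__    # wraps also keeps a reference to the original
```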

5.7 Recursion
# Factorial: classic recursion
def factorial(n):
    """n! = n * (n-1) * ... * 1"""
    if n <= 1: # Base case
        return 1
    return n * factorial(n - 1) # Recursive case

factorial(5) # 120

# Fibonacci: naive recursion (O(2^n), avoid!)
def fib_naive(n):
    if n < 2: return n
    return fib_naive(n-1) + fib_naive(n-2)

# Fibonacci: memoization with lru_cache
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2: return n
    return fib(n-1) + fib(n-2)

fib(100) # Instant! (with a cold cache, n near 1000 can exceed the recursion limit)

# Fibonacci: dynamic programming (iterative, O(n) time, O(1) space)
def fib_dp(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Tree traversal: natural recursive structure
def sum_nested(lst):
    total = 0
    for item in lst:
        if isinstance(item, list):
            total += sum_nested(item) # Recurse
        else:
            total += item
    return total

sum_nested([1, [2, [3, 4]], 5]) # 15

# Python's default recursion limit
import sys
sys.getrecursionlimit() # 1000 by default
sys.setrecursionlimit(5000) # Increase if needed

⚠️WARNING: Python does NOT optimize tail recursion. Deep recursion will hit RecursionError. For deep
recursion, convert to iteration or use sys.setrecursionlimit() cautiously.
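
One way to convert recursion to iteration is to manage an explicit stack. The sketch below rewrites the nested-list sum this way; the function name and the depth of 5000 are illustrative choices.

```python
def sum_nested_iterative(lst):
    """Sum all numbers in an arbitrarily nested list without recursion."""
    total = 0
    stack = [lst]                 # lists still waiting to be scanned
    while stack:
        current = stack.pop()
        for item in current:
            if isinstance(item, list):
                stack.append(item)    # defer the sublist instead of recursing
            else:
                total += item
    return total

small = sum_nested_iterative([1, [2, [3, 4]], 5])

# Build a list nested 5000 levels deep, far past the default recursion limit.
deep = [1]
for _ in range(5000):
    deep = [deep, 1]
deep_total = sum_nested_iterative(deep)
```

The stack lives on the heap as an ordinary list, so the depth is limited by memory rather than by the interpreter's recursion limit.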

MODULE 6

Object-Oriented Programming

6. Object-Oriented Programming

6.1 Classes and Objects


Python is a fully object-oriented language — everything is an object, including functions, classes, and modules.
A class is a blueprint; an object (instance) is a concrete realization of that blueprint.
class Student:
    """Represents a university student."""

    # Class variable: shared by ALL instances
    university = "Harvard"
    _student_count = 0 # Leading _ = private by convention

    def __init__(self, name: str, gpa: float, year: int = 1):
        """Constructor: called when creating an instance."""
        self.name = name # Instance variable
        self.gpa = gpa
        self.year = year
        Student._student_count += 1

    def __repr__(self) -> str:
        """Unambiguous representation for developers."""
        return f"Student(name={self.name!r}, gpa={self.gpa}, year={self.year})"

    def __str__(self) -> str:
        """Readable representation for users."""
        return f"{self.name} (Year {self.year}, GPA {self.gpa:.2f})"

    def __eq__(self, other) -> bool:
        if not isinstance(other, Student):
            return NotImplemented
        return self.name == other.name and self.gpa == other.gpa

    def __lt__(self, other) -> bool:
        return self.gpa < other.gpa

    def __hash__(self):
        return hash((self.name, self.gpa))

    # Instance method
    def honor_roll(self) -> bool:
        return self.gpa >= 3.7

    # Class method: receives class, not instance
    @classmethod
    def from_dict(cls, data: dict) -> "Student":
        return cls(data["name"], data["gpa"], data.get("year", 1))

    @classmethod
    def get_count(cls) -> int:
        return cls._student_count

    # Static method: no access to class or instance
    @staticmethod
    def is_valid_gpa(gpa: float) -> bool:
        return 0.0 <= gpa <= 4.0

# Usage
alice = Student("Alice", 3.9, 2)
bob = Student.from_dict({"name": "Bob", "gpa": 3.7})

print(alice) # Alice (Year 2, GPA 3.90)
print(repr(alice)) # Student(name='Alice', gpa=3.9, year=2)
alice.honor_roll() # True
Student.get_count() # 2
Student.is_valid_gpa(4.5) # False

# Accessing class vs instance variables
Student.university # "Harvard" (from class)
alice.university # "Harvard"

# Sort list of students (uses __lt__)
students = [Student("C", 3.5), Student("A", 3.9), Student("B", 3.7)]
sorted(students) # Sorted by GPA ascending

6.2 Inheritance
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    def greet(self) -> str:
        return f"Hi, I'm {self.name}"

    def __repr__(self):
        return f"{type(self).__name__}(name={self.name!r})"

class Student(Person):
    def __init__(self, name: str, age: int, gpa: float):
        super().__init__(name, age) # MUST call super().__init__()
        self.gpa = gpa

    def greet(self) -> str: # Override
        base = super().greet() # Call parent's greet
        return f"{base}, student with GPA {self.gpa}"

class GradStudent(Student):
    def __init__(self, name, age, gpa, thesis):
        super().__init__(name, age, gpa)
        self.thesis = thesis

# isinstance checks inheritance chain
alice = GradStudent("Alice", 28, 3.95, "ML in Healthcare")
isinstance(alice, GradStudent) # True
isinstance(alice, Student) # True
isinstance(alice, Person) # True

# issubclass
issubclass(GradStudent, Student) # True
issubclass(GradStudent, Person) # True

# Method Resolution Order (MRO): C3 linearization
print(GradStudent.__mro__)
# (<class GradStudent>, <class Student>, <class Person>, <class object>)

6.3 Multiple Inheritance and Mixins


import json

class Flyable:
    def fly(self):
        return f"{self.__class__.__name__} is flying"

class Swimmable:
    def swim(self):
        return f"{self.__class__.__name__} is swimming"

class Duck(Flyable, Swimmable):
    pass

donald = Duck()
donald.fly()   # "Duck is flying"
donald.swim()  # "Duck is swimming"

# Mixin pattern: reusable behavior modules
class JSONMixin:
    """Adds JSON serialization to any class."""
    def to_json(self):
        return json.dumps(self.__dict__, default=str)

    @classmethod
    def from_json(cls, json_str):
        return cls(**json.loads(json_str))

class LogMixin:
    """Adds logging to any class."""
    def log(self, msg):
        print(f"[{type(self).__name__}] {msg}")

class SmartStudent(JSONMixin, LogMixin, Student):
    pass
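The mixin classes above are defined but never exercised. The sketch below is a minimal, self-contained illustration of how a mixin stack resolves methods; the simplified `Student` base and the string-returning `log` are assumptions made for the example, not the classes defined earlier in these notes.

```python
import json


class JSONMixin:
    """Adds JSON serialization to any class that keeps its state in __dict__."""

    def to_json(self) -> str:
        return json.dumps(self.__dict__, default=str)


class LogMixin:
    """Adds a log helper; it returns the line instead of printing (assumption)."""

    def log(self, msg: str) -> str:
        return f"[{type(self).__name__}] {msg}"


class Student:
    # Simplified stand-in for the Student class used in this module
    def __init__(self, name: str, gpa: float):
        self.name = name
        self.gpa = gpa


class SmartStudent(JSONMixin, LogMixin, Student):
    pass


s = SmartStudent("Alice", 3.9)
payload = s.to_json()   # provided by JSONMixin
line = s.log("enrolled")  # provided by LogMixin

# Mixins are listed first, so they come before Student in the MRO
mro_names = [cls.__name__ for cls in SmartStudent.__mro__]
```

Listing mixins before the concrete base class is a common convention: it lets a mixin override or wrap a base-class method while still reaching it through `super()`.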

6.4 Abstract Classes and Interfaces


import math
from abc import ABC, abstractmethod

class Shape(ABC):
    """Abstract base class: cannot be instantiated directly."""

    @abstractmethod
    def area(self) -> float:
        """Subclasses MUST implement this."""
        pass

    @abstractmethod
    def perimeter(self) -> float:
        pass

    def describe(self) -> str:  # Concrete method
        return (f"{type(self).__name__}: "
                f"area={self.area():.2f}, "
                f"perimeter={self.perimeter():.2f}")

class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

    def perimeter(self) -> float:
        return 2 * math.pi * self.radius

class Rectangle(Shape):
    def __init__(self, w: float, h: float):
        self.w = w
        self.h = h

    def area(self):
        return self.w * self.h

    def perimeter(self):
        return 2 * (self.w + self.h)

# Shape()  # TypeError: Can't instantiate abstract class
c = Circle(5)
r = Rectangle(4, 6)
print(c.describe())  # Circle: area=78.54, perimeter=31.42

6.5 Properties and Descriptors


class Temperature:
    def __init__(self, celsius: float = 0):
        self._celsius = celsius  # Leading _ = internal storage

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float):
        if value < -273.15:
            raise ValueError(f"Temperature {value}°C below absolute zero!")
        self._celsius = value

    @celsius.deleter
    def celsius(self):
        del self._celsius

    @property
    def fahrenheit(self) -> float:
        return self._celsius * 9/5 + 32

    @fahrenheit.setter
    def fahrenheit(self, f: float):
        self.celsius = (f - 32) * 5/9  # Validates via celsius setter

t = Temperature(100)
t.celsius         # 100
t.fahrenheit      # 212.0
t.fahrenheit = 32
t.celsius         # 0.0
t.celsius = -300  # ValueError!

6.6 Dataclasses (Python 3.7+)


from dataclasses import dataclass, field, asdict, astuple
from typing import ClassVar

@dataclass(order=True, frozen=False)
class Student:
    # Fields with default values MUST come after fields without
    name: str
    gpa: float
    year: int = 1
    courses: list = field(default_factory=list)  # Mutable default

    # Class variable (not a dataclass field)
    university: ClassVar[str] = "Harvard"

    # Post-init processing
    def __post_init__(self):
        if not (0.0 <= self.gpa <= 4.0):
            raise ValueError(f"Invalid GPA: {self.gpa}")
        self.name = self.name.strip().title()

alice = Student("alice", 3.9, 2, ["CS50", "Math55"])
print(alice)
# Student(name='Alice', gpa=3.9, year=2, courses=['CS50', 'Math55'])

# Auto-generated __repr__, __eq__ (and __lt__, __le__, ... with order=True)
bob = Student("Bob", 3.7)
alice > bob      # False: order=True compares fields as a tuple in definition
                 # order, so name is compared first and "Alice" < "Bob"
asdict(alice)    # {'name': 'Alice', 'gpa': 3.9, ...}
astuple(alice)   # ('Alice', 3.9, 2, ['CS50', 'Math55'])

# Frozen dataclass (immutable, like a namedtuple with type hints)
@dataclass(frozen=True)
class Point:
    x: float
    y: float
    # frozen=True (with the default eq=True) makes it hashable, so it can be a dict key
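With `order=True`, a dataclass compares instances as tuples of their fields in definition order. The sketch below shows one way to make GPA drive the ordering: put it first and exclude the name with `field(compare=False)`. The `RankedStudent` class is a hypothetical example, not part of the notes above; it also confirms that a frozen dataclass works as a dictionary key.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class RankedStudent:
    gpa: float                          # first field, so it decides comparisons
    name: str = field(compare=False)    # ignored by ==, <, >, ...


alice = RankedStudent(3.9, "Alice")
bob = RankedStudent(3.7, "Bob")
ranked = sorted([bob, alice], reverse=True)  # highest GPA first


@dataclass(frozen=True)
class Point:
    x: float
    y: float


# Frozen dataclasses are hashable, so they can be used as dict keys
labels = {Point(0.0, 0.0): "origin"}
```

Note that `compare=False` also removes the field from `__eq__`, so two `RankedStudent` objects with equal GPAs compare equal regardless of name.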

MODULE 7

Comprehensions & Functional Programming

7. Comprehensions & Functional Programming

7.1 List Comprehensions


List comprehensions provide a concise, readable way to create lists. They are idiomatic Python and are often faster than the equivalent for loop, mainly because they avoid looking up and calling list.append on every iteration.
# Syntax: [expression for variable in iterable if condition]

# Basic
squares = [x**2 for x in range(10)]
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# With filter
evens = [x for x in range(20) if x % 2 == 0]

# Transformation + filter
result = [x.upper() for x in ["hello", "world"] if len(x) > 4]  # ['HELLO', 'WORLD']

# Nested (matrix flattening)


matrix = [[1,2,3],[4,5,6],[7,8,9]]
flat = [num for row in matrix for num in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Conditional expression (ternary in comprehension)


labels = ["even" if x%2==0 else "odd" for x in range(6)]

# Nested comprehension (matrix transpose)


T = [[row[i] for row in matrix] for i in range(3)]

# Comprehension vs map/filter
# These are equivalent:
result1 = list(map(lambda x: x**2, filter(lambda x: x%2==0, range(10))))
result2 = [x**2 for x in range(10) if x % 2 == 0]
# result2 is more readable

7.2 Dict & Set Comprehensions


# Dict comprehension
word_lengths = {word: len(word) for word in ["hello", "python", "world"]}
# {'hello': 5, 'python': 6, 'world': 5}

# Invert a dictionary
d = {"a": 1, "b": 2, "c": 3}
inverted = {v: k for k, v in d.items()}
# {1: 'a', 2: 'b', 3: 'c'}

# Filter dict entries (gpas is an assumed sample dict of name -> GPA)
gpas = {"Alice": 3.9, "Bob": 3.5, "Carol": 3.7}
high_gpa = {name: gpa for name, gpa in gpas.items() if gpa >= 3.7}
# {'Alice': 3.9, 'Carol': 3.7}

# Set comprehension
unique_squares = {x**2 for x in range(-5, 6)}
# {0, 1, 4, 9, 16, 25}: no duplicates

7.3 Generator Expressions


Generator expressions are like list comprehensions but produce values lazily, one at a time. They use () instead
of [] and save memory for large datasets.
# Generator expression — lazy evaluation
gen = (x**2 for x in range(1_000_000)) # No memory allocated for 1M numbers
next(gen) # 0 — compute only what's needed
next(gen) # 1
sum(x**2 for x in range(100)) # Sum without building a list

# Compare memory usage


import sys
list_comp = [x**2 for x in range(1000)]
gen_expr = (x**2 for x in range(1000))
sys.getsizeof(list_comp)  # Several kilobytes (exact size varies by Python version)
sys.getsizeof(gen_expr)   # Roughly 100-200 bytes (just the generator object!)

# Generators as arguments (extra parens not needed)


total = sum(x**2 for x in range(100))
maxi = max(len(word) for word in ["hello", "python", "world"])
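One consequence of lazy evaluation is worth seeing directly: a generator expression can be consumed only once. The short sketch below (variable names are illustrative) shows a second pass coming back empty, which is a common source of silent bugs.

```python
squares = (x * x for x in range(5))

first_pass = list(squares)   # consumes every value
second_pass = list(squares)  # the generator is already exhausted

# A list comprehension, by contrast, can be traversed as often as needed
squares_list = [x * x for x in range(5)]
total_twice = sum(squares_list) + sum(squares_list)
```

If the values are needed more than once, either build a list or recreate the generator expression for each pass.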

7.4 Generator Functions


A generator function uses yield to produce a sequence of values lazily. When called, it returns a generator
object. Execution suspends at each yield and resumes when next() is called.
def count_up(start, stop, step=1):
    """Lazy range-like generator."""
    current = start
    while current < stop:
        yield current
        current += step

for n in count_up(0, 10, 2):
    print(n)  # 0, 2, 4, 6, 8

# Infinite generator
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
[next(fib) for _ in range(10)]  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Generator pipeline (memory-efficient data processing)
def read_large_file(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

def filter_lines(lines, keyword):
    for line in lines:
        if keyword in line:
            yield line

def count_words(lines):
    for line in lines:
        yield len(line.split())

# Pipeline (each step is lazy, so no large intermediate lists are built)
lines = read_large_file("[Link]")
errors = filter_lines(lines, "ERROR")
word_cnt = count_words(errors)
total = sum(word_cnt)

# yield from: delegate to a sub-generator
def chain(*iterables):
    for it in iterables:
        yield from it  # equivalent to: for item in it: yield item

list(chain([1,2], [3,4], [5,6]))  # [1, 2, 3, 4, 5, 6]

7.5 Functional Tools


from functools import reduce, partial, lru_cache, cache
from itertools import (chain, islice, takewhile, dropwhile,
groupby, combinations, permutations,
product, starmap, accumulate)

# map — apply function to each element


list(map(str.upper, ["hello", "world"]))  # ['HELLO', 'WORLD']
list(map(pow, [2,3,4], [10,2,3]))         # [1024, 9, 64]

# filter: keep elements where the function returns True
list(filter(str.isalpha, ["abc", "1", "xyz", "2"]))  # ['abc', 'xyz']
list(filter(None, [0, 1, "", "hi", None, True])) # [1, 'hi', True]

# reduce — fold/accumulate
from functools import reduce
reduce(lambda a,b: a*b, [1,2,3,4,5]) # 120 = 5!
reduce(max, [3,1,4,1,5,9]) # 9

# partial — fix some arguments


def power(base, exp): return base ** exp
square = partial(power, exp=2)
cube = partial(power, exp=3)
square(5) # 25
cube(3) # 27

# itertools — production-grade iterator tools


list(chain([1,2], [3,4])) # [1, 2, 3, 4]
list(islice(range(100), 5, 10)) # [5,6,7,8,9]
list(takewhile(lambda x: x<5, [1,2,3,5,1])) # [1,2,3]
list(accumulate([1,2,3,4,5])) # [1,3,6,10,15] (running sum)
list(accumulate([1,2,3,4], lambda a,b: a*b)) # [1,2,6,24]

# combinations and permutations


list(combinations("ABC", 2)) # [('A','B'),('A','C'),('B','C')]
list(permutations("AB", 2)) # [('A','B'),('B','A')]
list(product([0,1], repeat=3)) # All 3-bit binary: 8 tuples

# groupby — group consecutive elements


data = [("Alice","CS"),("Bob","CS"),("Carol","Math"),("Dan","Math")]
data.sort(key=lambda x: x[1])  # Must sort first!
for dept, members in groupby(data, key=lambda x: x[1]):
    print(dept, list(members))

MODULE 8

Exceptions & Error Handling

8. Exceptions & Error Handling

8.1 Exception Hierarchy


Python's exception hierarchy is a tree rooted at BaseException. All user-facing exceptions inherit from
Exception. The most important exception classes:
# BaseException
# ├── SystemExit              (raised by sys.exit())
# ├── KeyboardInterrupt       (Ctrl+C)
# ├── GeneratorExit           (raised by generator.close())
# └── Exception               (base for all normal exceptions)
#     ├── StopIteration       (end of iteration)
#     ├── ArithmeticError
#     │   ├── ZeroDivisionError
#     │   ├── OverflowError
#     │   └── FloatingPointError
#     ├── LookupError
#     │   ├── IndexError      (list[99] on a list of 3)
#     │   └── KeyError        (dict["missing"])
#     ├── ValueError          (right type, wrong value)
#     ├── TypeError           (wrong type entirely)
#     ├── AttributeError      (obj.no_such_attr)
#     ├── NameError           (undefined variable)
#     ├── OSError             (IOError is an alias; FileNotFoundError etc. are subclasses)
#     └── RuntimeError
#         └── NotImplementedError
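Because the hierarchy is a tree, an except clause naming a parent class also catches every subclass. The sketch below is a small illustration with a hypothetical helper `describe_lookup`; it also checks a few parent/child relationships with `issubclass`.

```python
def describe_lookup(container, key):
    """Return container[key], or a short note naming the lookup error caught."""
    try:
        return container[key]
    except LookupError as exc:  # catches both IndexError and KeyError
        return f"caught {type(exc).__name__}"


list_result = describe_lookup([1, 2, 3], 99)
dict_result = describe_lookup({"a": 1}, "missing")

relations = {
    "KeyError is a LookupError": issubclass(KeyError, LookupError),
    "NotImplementedError is a RuntimeError": issubclass(NotImplementedError, RuntimeError),
    # KeyboardInterrupt derives from BaseException, so `except Exception` misses it
    "KeyboardInterrupt is an Exception": issubclass(KeyboardInterrupt, Exception),
}
```

This is why a bare `except Exception` does not swallow Ctrl+C or `sys.exit()`: both live outside the `Exception` branch.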

8.2 try/except/else/finally
# Full exception handling syntax
try:
    result = 10 / int(input("Enter a number: "))
except ZeroDivisionError:
    print("Cannot divide by zero!")
except ValueError as e:
    print(f"Invalid input: {e}")
except (TypeError, AttributeError) as e:  # Catch multiple types
    print(f"Type issue: {e}")
except Exception as e:                    # Catch any remaining Exception
    print(f"Unexpected error: {type(e).__name__}: {e}")
    raise                                 # Re-raise the exception
else:
    # Runs ONLY if no exception was raised in the try block
    print(f"Result: {result}")
finally:
    # ALWAYS runs: use for cleanup (close files, connections, etc.)
    print("Done.")

# Re-raising with context
try:
    dangerous_operation()
except IOError as e:
    raise RuntimeError("Failed to complete operation") from e
    # Sets __cause__ and shows chained traceback
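To make the chaining concrete, here is a minimal runnable sketch. The function `get_port` and the settings dictionary are hypothetical; the point is only that `raise ... from e` stores the original exception on `__cause__`, where calling code can inspect it.

```python
def get_port(settings: dict) -> int:
    """Read a required setting, translating the low-level error."""
    try:
        return int(settings["port"])
    except KeyError as e:
        # Chain the original KeyError so the traceback shows both errors
        raise RuntimeError("Missing required setting 'port'") from e


try:
    get_port({"host": "localhost"})
except RuntimeError as err:
    message = str(err)
    cause = err.__cause__  # the original KeyError

port = get_port({"port": "8080"})  # the happy path is unaffected
```

Translating exceptions this way lets callers handle one application-level error type without losing the low-level detail needed for debugging.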

8.3 Custom Exceptions


class AppError(Exception):
    """Base exception for our application."""
    pass

class ValidationError(AppError):
    def __init__(self, field: str, message: str):
        self.field = field
        self.message = message
        super().__init__(f"Validation error on '{field}': {message}")

class DatabaseError(AppError):
    def __init__(self, query: str, cause: Exception):
        self.query = query
        super().__init__(f"Database error in query: {query}")
        self.__cause__ = cause

# Raise custom exceptions
def validate_age(age: int):
    if not isinstance(age, int):
        raise TypeError(f"Age must be int, got {type(age).__name__}")
    if age < 0 or age > 150:
        raise ValidationError("age", f"{age} is out of valid range [0, 150]")
    return age

try:
    validate_age(200)
except ValidationError as e:
    print(f"Field: {e.field}, Message: {e.message}")

8.4 Context Managers (with statement)


Context managers automate resource management — they guarantee cleanup (like closing files) even if
exceptions occur. Implement __enter__ and __exit__ dunder methods.
# Built-in: file handling
with open("[Link]", "r") as f:
    content = f.read()
# f is automatically closed here, even if an exception occurred

# Multiple context managers
with open("[Link]") as src, open("[Link]", "w") as dst:
    dst.write(src.read())

# Class-based context manager
class DatabaseConnection:
    def __init__(self, host: str):
        self.host = host
        self.conn = None

    def __enter__(self):
        print(f"Connecting to {self.host}...")
        self.conn = simulate_connect(self.host)  # simulate_connect is a placeholder, not defined here
        return self.conn  # This is bound to the 'as' target

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Closing connection")
        if self.conn:
            self.conn.close()
        # Return True to suppress the exception; False/None to propagate
        return False

with DatabaseConnection("localhost") as conn:
    conn.execute("SELECT * FROM students")

# contextlib: easier context manager creation
import os
import time
from contextlib import contextmanager, suppress

@contextmanager
def timer(label: str):
    start = time.perf_counter()
    try:
        yield  # Code inside 'with' runs here
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.4f}s")

A = [[1, 2], [3, 4]]  # Sample matrices so the example runs
B = [[5, 6], [7, 8]]
with timer("matrix multiply"):
    result = [[sum(a*b for a, b in zip(row, col))
               for col in zip(*B)] for row in A]

# suppress: silently ignore specific exceptions
with suppress(FileNotFoundError):
    os.remove("temp_file.txt")  # No error if file doesn't exist
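The return value of `__exit__` decides whether an exception raised inside the with block propagates. The following sketch uses a hypothetical `Tracked` class that records events in a list, so the order of calls and the effect of returning True or False can be checked directly.

```python
events = []


class Tracked:
    """Context manager that records enter/exit and optionally swallows errors."""

    def __init__(self, swallow: bool):
        self.swallow = swallow

    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        events.append(f"exit:{exc_type.__name__ if exc_type else None}")
        return self.swallow  # True suppresses the exception, False propagates it


# Exception is suppressed: execution continues after the with block
with Tracked(swallow=True):
    raise ValueError("ignored")
survived = True

# Exception propagates: __exit__ still runs first, then the error reaches us
propagated = False
try:
    with Tracked(swallow=False):
        raise KeyError("propagates")
except KeyError:
    propagated = True
```

In both cases `__exit__` runs before anything else happens, which is what makes context managers reliable for cleanup.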

MODULE 9

Iterators, Generators & the Iterator Protocol

9. Iterators, Generators & the Iterator Protocol

9.1 The Iterator Protocol


Python's for loop works with any object that implements the iterator protocol: __iter__() returning an iterator
object, and __next__() returning the next value or raising StopIteration.
# How for loops work under the hood
nums = [1, 2, 3]
it = iter(nums) # Calls nums.__iter__()
print(next(it)) # 1 — calls it.__next__()
print(next(it)) # 2
print(next(it)) # 3
next(it) # Raises StopIteration

# A for loop is equivalent to:


it = iter(nums)
while True:
    try:
        item = next(it)
    except StopIteration:
        break
    print(item)  # Loop body

# Custom iterator
class Countdown:
    def __init__(self, start: int):
        self.current = start

    def __iter__(self):
        return self  # Iterator returns self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        val = self.current
        self.current -= 1
        return val

for n in Countdown(5):
    print(n)  # 5, 4, 3, 2, 1

# Making a class iterable (iterable != iterator)
class NumberRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        current = self.start
        while current < self.stop:
            yield current  # __iter__ is a generator function!
            current += 1

r = NumberRange(1, 5)
list(r)  # [1, 2, 3, 4]
list(r)  # [1, 2, 3, 4]: can iterate multiple times!

9.2 Advanced Generator Techniques


# Generator send() — two-way communication
def running_average():
    total = 0
    count = 0
    avg = 0
    while True:
        value = yield avg  # yield sends avg out, receives value in
        if value is None:
            break
        total += value
        count += 1
        avg = total / count

gen = running_average()
next(gen)     # Prime the generator (advance to first yield)
gen.send(10)  # avg = 10.0
gen.send(20)  # avg = 15.0
gen.send(30)  # avg = 20.0

# Generator throw(): inject an exception
def resilient_gen():
    while True:
        try:
            value = yield
            print(f"Processing: {value}")
        except ValueError as e:
            print(f"Handling error: {e}")

# Coroutine-like generator pipeline
def producer(items):
    for item in items:
        yield item

def transformer(source, func):
    for item in source:
        yield func(item)

def consumer(source):
    results = []
    for item in source:
        results.append(item)
    return results

# Compose pipeline
data = producer([1, 2, 3, 4, 5])
doubled = transformer(data, lambda x: x * 2)
result = consumer(doubled)  # [2, 4, 6, 8, 10]
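The `throw()` method is easier to follow with a driver. The sketch below is a variant of `resilient_gen` that appends to a list instead of printing (an assumption made so the behavior can be checked); it primes the generator, sends values, injects a `ValueError`, and finally closes it.

```python
def resilient_gen(log):
    """Generator that survives ValueError injected through throw()."""
    while True:
        try:
            value = yield
            log.append(f"processing {value}")
        except ValueError as e:
            log.append(f"handled {e}")


log = []
gen = resilient_gen(log)
next(gen)                           # prime: run to the first yield
gen.send("a")                       # resumes at yield with value "a"
gen.throw(ValueError("bad input"))  # raised at the paused yield, caught inside
gen.send("b")                       # generator is still alive
gen.close()                         # raises GeneratorExit inside; generator ends
```

Since `GeneratorExit` is not a `ValueError`, the `except` clause does not intercept it and `close()` terminates the generator cleanly.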

MODULE 10

File I/O & the OS Module

10. File I/O, Serialization & the OS Module

10.1 File Operations


# Opening files — always use 'with' statement
# Modes: 'r' read, 'w' write, 'a' append, 'x' create-exclusive
# Add 'b' for binary: 'rb', 'wb'

with open("[Link]", "r", encoding="utf-8") as f:
    content = f.read()       # Read entire file as string
    f.seek(0)                # Go back to start
    lines = f.readlines()    # List of all lines (with \n)
    f.seek(0)
    for line in f:           # Efficient line-by-line (lazy)
        print(line.strip())

# Writing
with open("[Link]", "w", encoding="utf-8") as f:
    f.write("Hello, World!\n")
    f.writelines(["line1\n", "line2\n"])

# Reading/writing CSV
import csv
with open("[Link]", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name","gpa","year"])
    writer.writeheader()
    writer.writerow({"name":"Alice","gpa":3.9,"year":2})

with open("[Link]") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], row["gpa"])

10.2 JSON Serialization


import json
from datetime import datetime

# Serialize (Python → JSON string)


data = {"name": "Alice", "scores": [95, 87, 92], "active": True}
json_str = json.dumps(data, indent=2, ensure_ascii=False)

# Deserialize (JSON string → Python)
parsed = json.loads(json_str)

# File I/O
with open("[Link]", "w") as f:
    json.dump(data, f, indent=2)

with open("[Link]") as f:
    loaded = json.load(f)

# Custom serializer for non-serializable types
class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)

json.dumps({"ts": datetime.now()}, cls=DateTimeEncoder)
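A custom encoder only covers half of a round trip: once a datetime is written as a plain string, `json.loads` has no way to know it should become a datetime again. One possible approach, sketched below, is to tag the value on the way out and rebuild it with `object_hook` on the way in. The `"__datetime__"` tag and the `decode_datetime` helper are conventions invented for this example, not part of the json module.

```python
import json
from datetime import datetime


class TaggedDateTimeEncoder(json.JSONEncoder):
    """Encode datetimes as a tagged dict so they can be recognized later."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return {"__datetime__": obj.isoformat()}
        return super().default(obj)


def decode_datetime(d: dict):
    """object_hook: turn tagged dicts back into datetime objects."""
    if "__datetime__" in d:
        return datetime.fromisoformat(d["__datetime__"])
    return d


original = {"ts": datetime(2024, 9, 1, 9, 0, 0), "name": "Alice"}
text = json.dumps(original, cls=TaggedDateTimeEncoder)
restored = json.loads(text, object_hook=decode_datetime)
```

`object_hook` is called for every decoded JSON object, innermost first, so the tagged dict is replaced before the outer dictionary is assembled.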

10.3 pathlib — Modern Path Handling


from pathlib import Path

# Create path objects


p = Path("/home/alice/documents")
p = Path.home() / "documents" / "data.txt"  # / operator joins paths

# Path operations (outputs assume the home directory is /home/alice)
p.name    # "data.txt"
p.stem    # "data"
p.suffix  # ".txt"
p.parent  # Path("/home/alice/documents")
p.parts   # ('/', 'home', 'alice', 'documents', 'data.txt')

# Filesystem queries
p.exists()        # True/False
p.is_file()       # True/False
p.is_dir()        # True/False
p.stat().st_size  # File size in bytes

# Reading/writing (no open() needed)


text = p.read_text(encoding="utf-8")
p.write_text("Hello!", encoding="utf-8")
data = p.read_bytes()
p.write_bytes(b"\x00\x01")

# Directory operations
Path("new_dir").mkdir(parents=True, exist_ok=True)
Path("[Link]").unlink(missing_ok=True)

# Glob patterns
for py_file in Path(".").rglob("*.py"):
print(py_file)
list(Path(".").glob("**/*.txt")) # All .txt recursively

10.4 os and shutil Modules


import os, shutil

os.getcwd()                    # Current working directory
os.chdir("/tmp")               # Change directory
os.environ["HOME"]             # Environment variable
os.environ.get("API_KEY", "")  # Safe get with default

os.path.join("dir", "file")    # Platform-safe path join


[Link]("[Link]")
os.path.basename("/a/b/[Link]")  # "[Link]"
os.path.dirname("/a/b/[Link]")   # "/a/b"

# Walk directory tree
for root, dirs, files in os.walk("."):
    for f in files:
        print(os.path.join(root, f))

# shutil: high-level file operations
shutil.copy("[Link]", "[Link]")            # Copy file
shutil.copy2("[Link]", "dst/")             # Copy with metadata
shutil.move("[Link]", "[Link]")            # Move/rename
shutil.copytree("src_dir", "dst_dir")      # Copy directory tree
shutil.rmtree("temp_dir")                  # Remove directory tree
shutil.make_archive("backup", "zip", ".")  # Create ZIP archive

MODULE 11

Concurrency, Parallelism & Async

11. Concurrency, Parallelism & Async Programming

11.1 The GIL — Global Interpreter Lock


CPython uses a GIL — a mutex that allows only one thread to execute Python bytecode at a time. This means
threading does NOT achieve true parallelism for CPU-bound work. However, the GIL is released during I/O
operations, making threads useful for I/O-bound work. For CPU-bound parallelism, use multiprocessing.
📘 NOTE: Python 3.13+ introduces experimental no-GIL builds. In the future, the GIL may be optional or
removed entirely. For now: threads=I/O-bound, processes=CPU-bound, asyncio=many concurrent I/O tasks.
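The claim that threads help with I/O-bound work can be checked with a small timing experiment. In the sketch below, `time.sleep` stands in for a blocking network or disk call (an assumption: like real blocking I/O, sleeping releases the GIL), and `fake_io` is a hypothetical helper.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay: float) -> float:
    """Pretend to wait on a socket or disk; sleeping releases the GIL."""
    time.sleep(delay)
    return delay


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, [0.2] * 4))
elapsed = time.perf_counter() - start

# Run one after another, the four calls would need about 0.8 s.
# With four threads the waits overlap, so elapsed is close to 0.2 s.
```

Replacing the sleep with a pure-Python computation would remove most of the benefit, because only one thread can execute bytecode at a time under the GIL.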

11.2 threading Module


import threading
import time

# Basic thread
def worker(name, delay):
    print(f"Thread {name} starting")
    time.sleep(delay)
    print(f"Thread {name} done")

threads = [threading.Thread(target=worker, args=(i, 0.5)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # Wait for all to complete

# Thread-safe shared state with Lock
counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    with lock:  # Acquire and release automatically
        counter += 1

# Condition variable: coordinate threads
condition = threading.Condition()
buffer = []

def producer():
    for i in range(5):
        with condition:
            buffer.append(i)
            condition.notify()  # Signal consumer
        time.sleep(0.1)

def consumer():
    while True:
        with condition:
            while not buffer:
                condition.wait()  # Wait for signal
            item = buffer.pop(0)
        if item == 4:
            break

# ThreadPoolExecutor: high-level thread pool
from concurrent.futures import ThreadPoolExecutor, as_completed
import urllib.request

def fetch_url(url):
    with urllib.request.urlopen(url, timeout=5) as r:
        return len(r.read())

urls = ["[Link] "[Link]


with ThreadPoolExecutor(max_workers=4) as executor:
    futures = {executor.submit(fetch_url, url): url for url in urls}
    for future in as_completed(futures):
        url = futures[future]
        size = future.result()
        print(f"{url}: {size} bytes")

11.3 multiprocessing Module


from multiprocessing import Pool, Process, Queue, Manager
import os

# CPU-bound task — perfect for multiprocessing


def compute_prime_count(n):
    """Count primes up to n."""
    primes = sum(1 for x in range(2, n+1)
                 if all(x % i != 0 for i in range(2, int(x**0.5)+1)))
    return primes

if __name__ == "__main__":  # Required guard for Windows!
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(compute_prime_count, [10000]*8)
        print(sum(results))

# ProcessPoolExecutor: simpler API
from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(square, range(100)))

11.4 asyncio — Asynchronous I/O


asyncio enables concurrent execution of many I/O tasks in a single thread using cooperative multitasking.
Perfect for thousands of simultaneous network connections. Uses async/await syntax introduced in Python 3.5.
import asyncio
import aiohttp # pip install aiohttp

# Basic coroutine
async def greet(name: str, delay: float):
    await asyncio.sleep(delay)  # Non-blocking sleep
    print(f"Hello, {name}!")

# Run a single coroutine
asyncio.run(greet("Alice", 1.0))

# Run multiple concurrently
async def main():
    # gather: run all concurrently, return all results
    results = await asyncio.gather(
        greet("Alice", 1.0),
        greet("Bob", 0.5),
        greet("Carol", 1.5),
    )
    # All 3 run concurrently, so total time is ~1.5s, not 3s

asyncio.run(main())

# HTTP requests with aiohttp
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.create_task(fetch(session, url)) for url in urls]
        return await asyncio.gather(*tasks)

# asyncio.Queue: producer/consumer pattern
async def producer(queue: asyncio.Queue):
    for i in range(10):
        await queue.put(i)
        await asyncio.sleep(0.05)
    await queue.put(None)  # Sentinel

async def consumer(queue: asyncio.Queue):
    while True:
        item = await queue.get()
        if item is None:
            break
        print(f"Processing {item}")
        queue.task_done()

async def main():
    q = asyncio.Queue(maxsize=3)  # Bounded queue
    await asyncio.gather(producer(q), consumer(q))
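Two properties of `asyncio.gather` are easy to confuse: tasks finish in whatever order their awaits complete, but the results list always follows the order of the arguments. The sketch below uses only the standard library; `fake_request` is a hypothetical stand-in for a network call.

```python
import asyncio

finished = []  # records the order in which coroutines actually complete


async def fake_request(name: str, delay: float) -> str:
    await asyncio.sleep(delay)   # yields control to the event loop
    finished.append(name)
    return f"{name} done"


async def main():
    # All three coroutines are in flight at the same time
    return await asyncio.gather(
        fake_request("a", 0.06),
        fake_request("b", 0.01),
        fake_request("c", 0.03),
    )


results = asyncio.run(main())
```

The shortest delay completes first, yet `results` still lines up with the call order, which makes it safe to zip results back to their inputs.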

MODULE 12

Type Hints & Static Analysis

12. Type Hints, Annotations & Static Analysis

12.1 Basic Type Annotations


Python 3.5+ supports type hints through the typing module (PEP 484), building on the function annotation syntax of PEP 3107; variable annotations (PEP 526) arrived in Python 3.6. Annotations are NOT enforced at runtime: they are hints for static analysis tools like mypy and pyright, and for IDEs.
# Variable annotations
name: str = "Alice"
age: int = 20
gpa: float = 3.9
flag: bool = True

# Function annotations
def greet(name: str, times: int = 1) -> str:
    return (f"Hello, {name}! " * times).strip()

def no_return() -> None:
    print("no return value")
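The statement that annotations are not enforced at runtime can be verified directly. In the sketch below, `greet` is called with an int where a str is annotated; Python runs it without complaint, and the hints are simply stored on the function for tools to read.

```python
def greet(name: str, times: int = 1) -> str:
    return (f"Hello, {name}! " * times).strip()


# A static checker such as mypy would flag this call; the interpreter does not
unchecked = greet(42)

# Annotations are plain metadata attached to the function object
hints = greet.__annotations__
```

Runtime enforcement, where it is wanted, has to come from explicit checks or from a validation library.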

12.2 The typing Module


from typing import (
    Optional, Union, List, Dict, Tuple, Set,
    Any, Callable, Iterator, Generator,
    TypeVar, Generic, Protocol, Final,
    ClassVar, Literal, TypedDict, overload
)

# Optional: value OR None (same as Union[X, None])
def find_user(id: int) -> Optional[str]:
    ...  # returns name or None

# Union: one of several types
def process(data: Union[str, bytes, list]) -> None: ...

# Python 3.10+: use | instead of Union
def process(data: str | bytes | list) -> None: ...

# From Python 3.9+, use built-in types directly (no List, Dict etc.)
def get_scores(names: list[str]) -> dict[str, float]: ...

# Tuple: exact structure
def get_coords() -> tuple[float, float]: ...
def variadic() -> tuple[int, ...]: ...  # Variable length

# Callable
def apply(func: Callable[[int, int], int], a: int, b: int) -> int:
    return func(a, b)

# TypeVar: generic type variable
T = TypeVar('T')
def first(items: list[T]) -> T:
    return items[0]

# TypeVar with constraints (must be exactly one of the listed types;
# an upper bound would be written TypeVar('N', bound=SomeType))
Numeric = TypeVar('Numeric', int, float, complex)
def double(x: Numeric) -> Numeric:
    return x * 2

# Generic classes
class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

s: Stack[int] = Stack()
s.push(42)

# TypedDict: typed dictionary
class StudentRecord(TypedDict):
    name: str
    gpa: float
    year: int

# Protocol: structural subtyping (duck typing with type safety)
class Drawable(Protocol):
    def draw(self) -> None: ...

def render(shape: Drawable) -> None:
    shape.draw()

class Circle:
    def draw(self) -> None:
        print("Drawing circle")

render(Circle())  # Valid: Circle has draw()

# Literal: restrict to specific values
def set_mode(mode: Literal["read", "write", "append"]) -> None: ...

# Final: cannot be reassigned (enforced by type checkers, not at runtime)
MAX_SIZE: Final[int] = 100

12.3 Runtime Type Checking with Pydantic


# pip install pydantic  (this example uses the Pydantic v2 API)
from pydantic import BaseModel, Field, field_validator
from datetime import datetime

class Student(BaseModel):
    name: str
    email: str
    gpa: float = Field(ge=0.0, le=4.0, description="GPA between 0 and 4")
    year: int = Field(default=1, ge=1, le=8)
    courses: list[str] = []
    enrolled_at: datetime = Field(default_factory=datetime.now)

    @field_validator('name')
    @classmethod
    def name_must_not_be_empty(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("Name cannot be empty")
        return v.strip().title()

    @field_validator('email')
    @classmethod
    def email_must_be_valid(cls, v: str) -> str:
        if '@' not in v:
            raise ValueError("Invalid email")
        return v.lower()

# Auto-validates and coerces types
alice = Student(name="alice smith", email="ALICE@[Link]", gpa=3.9)
print(alice.name)        # "Alice Smith" (title-cased)
print(alice.email)       # "alice@[Link]" (lowercased)
alice.model_dump()       # Convert to dict (.dict() in Pydantic v1)
alice.model_dump_json()  # Convert to JSON string (.json() in Pydantic v1)

MODULE 13

The Python Standard Library

13. The Python Standard Library — Essential Modules

13.1 datetime — Date and Time


from datetime import datetime, date, time, timedelta, timezone
import zoneinfo # Python 3.9+

# Current date/time
now = datetime.now()              # Local time (naive)
utc = datetime.now(timezone.utc)  # UTC (aware)
today = date.today()

# Creating datetime objects
dt = datetime(2024, 9, 1, 9, 0, 0)
d = date(2024, 9, 1)
t = time(14, 30, 0)

# Formatting and parsing
formatted = dt.strftime("%Y-%m-%d %H:%M:%S")  # "2024-09-01 09:00:00"
parsed = datetime.strptime("2024-09-01", "%Y-%m-%d")
iso = dt.isoformat()                          # "2024-09-01T09:00:00"
from_iso = datetime.fromisoformat(iso)

# Arithmetic
delta = timedelta(days=30, hours=2, minutes=30)
future = dt + delta
diff = datetime(2025, 1, 1) - datetime.now()
print(f"{diff.days} days until 2025")

# Timezone-aware (Python 3.9+)
eastern = zoneinfo.ZoneInfo("America/New_York")
dt_eastern = datetime.now(eastern)

13.2 re — Regular Expressions


import re

text = "Contact alice@[Link] or bob@[Link] for info"

# Search — find first match


m = re.search(r'\b[\w.+-]+@[\w-]+\.[a-zA-Z]{2,}\b', text)
if m:
    print(m.group())  # "alice@[Link]"
    print(m.start())  # Start index
    print(m.span())   # (start, end) tuple

# findall: return all matches
emails = re.findall(r'[\w.+-]+@[\w-]+\.[a-zA-Z]{2,}', text)
# ['alice@[Link]', 'bob@[Link]']

# sub: replace
clean = re.sub(r'\s+', ' ', "hello    world")  # Collapse runs of whitespace into one space

# Compile for reuse (skips the pattern-cache lookup on each call; useful in hot loops)
email_re = re.compile(r'[\w.+-]+@[\w-]+\.[a-zA-Z]{2,}')
emails = email_re.findall(text)
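A compiled pattern exposes the same operations as the module-level functions. The sketch below writes the email pattern in verbose mode and applies it to a sample sentence; the example.com and example.org addresses are placeholders chosen for illustration, and the pattern is a simplified matcher rather than a full validator.

```python
import re

email_pattern = re.compile(r"""
    \b              # word boundary
    [\w.+-]+        # local part
    @               # at sign
    [\w-]+          # domain name (a single label only)
    \.              # dot
    [a-zA-Z]{2,}    # top-level domain
    \b              # word boundary
""", re.VERBOSE)

text = "Contact alice@example.com or bob@example.org for info"

found = email_pattern.findall(text)          # every match as a list of strings
first = email_pattern.search(text)           # first match object, or None
masked = email_pattern.sub("<email>", text)  # replace every match
```

Because the domain part accepts only one label, an address on a subdomain such as mail.example.com would be matched only partially; real validation needs a stricter pattern or a dedicated library.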

# Groups
date_str = "Today is 2024-09-15"
m = re.search(r'(\d{4})-(\d{2})-(\d{2})', date_str)
year, month, day = m.group(1), m.group(2), m.group(3)

# Named groups
m = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', date_str)
print(m.group('year'))  # "2024"
print(m.groupdict())    # {'year':'2024','month':'09','day':'15'}

# Verbose mode: readable regex
email_pattern = re.compile(r"""
    \b              # Word boundary
    [\w.+-]+        # Local part
    @               # At sign
    [\w-]+          # Domain name
    \.              # Dot
    [a-zA-Z]{2,}    # TLD
    \b              # Word boundary
""", re.VERBOSE)

13.3 logging — Production Logging


import logging

# Basic config
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('[Link]'),
        logging.StreamHandler()  # Also print to console
    ]
)

# Named logger (best practice)
logger = logging.getLogger(__name__)

logger.debug("Debug information")
logger.info("Application started")
logger.warning("Low disk space: %d%%", 10)
logger.error("Database connection failed")
logger.critical("System shutting down")

# Structured logging with extra
logger.info("User login", extra={"user_id": 42, "ip": "[Link]"})

# Exception logging
try:
    1/0
except ZeroDivisionError:
    logger.exception("Division error")  # Includes full traceback

13.4 argparse — CLI Applications


import argparse

parser = argparse.ArgumentParser(
    description="Student GPA calculator",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument("name", type=str, help="Student name")
parser.add_argument("scores", type=float, nargs="+", help="List of scores")
parser.add_argument("-v", "--verbose", action="store_true")
parser.add_argument("-o", "--output", type=str, default="stdout")
parser.add_argument("--min-passing", type=float, default=60.0)

args = parser.parse_args()
gpa = sum(args.scores) / len(args.scores)
if args.verbose:
    print(f"Processing {len(args.scores)} scores for {args.name}")
print(f"{args.name}: {gpa:.2f}")
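`parse_args()` reads `sys.argv` by default, but it also accepts an explicit list of strings, which makes a parser easy to try out or unit-test without a shell. The sketch below wraps a trimmed version of the parser in a hypothetical `build_parser` function and feeds it a sample command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Average score calculator")
    parser.add_argument("name", help="Student name")
    parser.add_argument("scores", type=float, nargs="+", help="List of scores")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("--min-passing", type=float, default=60.0)
    return parser


# Equivalent to running: python script.py Alice 90 80 -v
args = build_parser().parse_args(["Alice", "90", "80", "-v"])

average = sum(args.scores) / len(args.scores)
passed = average >= args.min_passing  # "--min-passing" becomes args.min_passing
```

Dashes in option names are converted to underscores on the resulting namespace, and `type=float` converts each score before it reaches the program.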

13.5 unittest — Testing Framework


import unittest
from unittest.mock import Mock, patch, MagicMock

def add(a, b):
    return a + b

def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return a / b

class TestMath(unittest.TestCase):

    def test_add_positive(self):
        self.assertEqual(add(2, 3), 5)

    def test_add_negative(self):
        self.assertEqual(add(-1, -2), -3)

    def test_divide_normal(self):
        self.assertAlmostEqual(divide(1, 3), 0.333, places=3)

    def test_divide_by_zero(self):
        with self.assertRaises(ZeroDivisionError) as ctx:
            divide(10, 0)
        self.assertIn("Cannot divide by zero", str(ctx.exception))

    def setUp(self):  # Runs before each test
        self.data = [1, 2, 3]

    def tearDown(self):  # Runs after each test
        pass

    @unittest.skip("Not implemented yet")
    def test_future_feature(self): ...

    # Mock external dependencies (requires the requests package to be installed)
    @patch('requests.get')
    def test_api_call(self, mock_get):
        mock_get.return_value.json.return_value = {"status": "ok"}
        # ... test code that calls requests.get()
        mock_get.assert_called_once()

# pytest (recommended over unittest)
# pip install pytest
# def test_add(): assert add(2, 3) == 5
# pytest -v tests/

MODULE 14

Advanced Python Techniques

14. Advanced Python Techniques

14.1 Metaclasses
A metaclass is the class of a class. In Python, classes themselves are objects, created by metaclasses. The
default metaclass is type. Metaclasses enable powerful patterns like ORMs, auto-registration, and DSLs.
# type is the metaclass of all classes
type(int) # <class 'type'>
type(str) # <class 'type'>
type(list) # <class 'type'>

# Dynamically create a class with type()
MyClass = type('MyClass', (object,), {
    'x': 42,
    'hello': lambda self: f"Hello from {type(self).__name__}"
})

# Custom metaclass
class SingletonMeta(type):
    """Ensure only one instance of a class can exist."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Database(metaclass=SingletonMeta):
    def __init__(self):
        print("Creating database connection")

db1 = Database()
db2 = Database()
print(db1 is db2) # True: same object!

# Auto-registry metaclass
class PluginMeta(type):
    registry = {}
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        if bases: # Skip base class itself
            mcs.registry[name] = cls
        return cls

class Plugin(metaclass=PluginMeta):
    pass

class CSVPlugin(Plugin):
    def process(self): ...

class JSONPlugin(Plugin):
    def process(self): ...

print(PluginMeta.registry) # {'CSVPlugin': ..., 'JSONPlugin': ...}
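
For simple auto-registration, a metaclass is often more machinery than needed. A lighter sketch of the same registry idea uses the __init_subclass__ hook (available since Python 3.6). The class names mirror the example above; keeping the registry on the base class is an illustrative choice:

```python
class Plugin:
    registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs):
        # Called once for every subclass, at class-creation time.
        super().__init_subclass__(**kwargs)
        Plugin.registry[cls.__name__] = cls

class CSVPlugin(Plugin):
    def process(self): ...

class JSONPlugin(Plugin):
    def process(self): ...

print(sorted(Plugin.registry))
```

The base class is never registered here because __init_subclass__ only fires for subclasses, so no "skip the base class" check is required.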

14.2 Descriptors
Descriptors are objects that implement __get__, __set__, and/or __delete__. They power Python's property system,
classmethod, staticmethod, and many ORMs, and are among the most powerful and underused features of the language.
class TypedAttribute:
    """Descriptor that enforces type checking on assignment."""

    def __init__(self, name: str, expected_type: type):
        self.name = name
        self.expected_type = expected_type
        self.attr_name = f"_{name}" # Private storage attribute

    def __set_name__(self, owner, name):
        # Called automatically when the descriptor is assigned in a class body
        self.attr_name = f"_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None: # Accessed on class, not instance
            return self
        return getattr(obj, self.attr_name, None)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(f"{self.name} must be {self.expected_type.__name__}, "
                            f"got {type(value).__name__}")
        setattr(obj, self.attr_name, value)

    def __delete__(self, obj):
        delattr(obj, self.attr_name)

class Student:
    name = TypedAttribute("name", str)
    gpa = TypedAttribute("gpa", float)
    year = TypedAttribute("year", int)

    def __init__(self, name, gpa, year):
        self.name = name # Calls TypedAttribute.__set__
        self.gpa = gpa
        self.year = year

alice = Student("Alice", 3.9, 2)
alice.name = 42 # TypeError: name must be str, got int
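
TypedAttribute defines __set__, which makes it a data descriptor: it takes precedence over the instance's own __dict__. A descriptor that defines only __get__ (a non-data descriptor) is shadowed by instance attributes, and that can be used for one-time lazy computation. A minimal sketch, with the hypothetical names LazyAttribute and Report:

```python
class LazyAttribute:
    """Non-data descriptor: computes a value once, then caches it on the instance."""

    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Store under the same name; the instance __dict__ now shadows the descriptor.
        obj.__dict__[self.name] = value
        return value

class Report:
    calls = 0  # counts how often the expensive computation really runs

    def __init__(self, scores):
        self.scores = scores

    @LazyAttribute
    def average(self):
        Report.calls += 1
        return sum(self.scores) / len(self.scores)

r = Report([90, 80, 70])
first = r.average   # computed through __get__
second = r.average  # served from r.__dict__; __get__ is not called again
```

The standard library offers functools.cached_property for this purpose; the sketch only illustrates the lookup rule behind it.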

14.3 __slots__
# By default, instances use __dict__ (a hash table) for attributes
# __slots__ replaces __dict__ with a fixed set of attribute slots, which saves memory

class Point:
    __slots__ = ('x', 'y', 'z') # Only these attributes allowed

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

p = Point(1, 2, 3)
p.x # 1
p.w = 4 # AttributeError: not in __slots__

import sys
class RegPoint:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

# Memory comparison (with 1M objects; approximate, varies by Python version)
# Regular: ~200MB (each has a __dict__)
# Slots: ~75MB (no __dict__ overhead)
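
The figures above depend on the Python version and platform. A small sketch for checking the behaviour yourself: it confirms that a slotted instance has no __dict__ and uses tracemalloc to compare allocations. The helper measure() and the object count are illustrative choices, not part of the original notes.

```python
import tracemalloc

class SlotPoint:
    __slots__ = ("x", "y", "z")
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

class DictPoint:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

def measure(cls, n=50_000):
    """Return the bytes traced while n instances of cls are alive."""
    tracemalloc.start()
    points = [cls(i, i, i) for i in range(n)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current

has_dict = hasattr(SlotPoint(1, 2, 3), "__dict__")  # slotted instances carry no __dict__
try:
    SlotPoint(1, 2, 3).w = 4
    rejected = False
except AttributeError:
    rejected = True

print(f"slots: {measure(SlotPoint):,} bytes, regular: {measure(DictPoint):,} bytes")
```

Run it on your own interpreter before relying on any particular savings figure.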

14.4 Memory Management & the gc Module


import gc, sys

# Reference counting
x = [1, 2, 3]
sys.getrefcount(x) # 2 (x + getrefcount's own reference)
y = x
sys.getrefcount(x) # 3

# Garbage collector (handles circular references)
gc.disable() # Pause automatic collection
gc.enable() # Resume automatic collection
gc.collect() # Force collection, returns number of unreachable objects found
gc.get_stats() # Collection statistics

# Memory profiling
# tracemalloc is part of the standard library (since Python 3.4); no install needed
import tracemalloc
tracemalloc.start()

# ... code to profile ...

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:5]:
    print(stat)
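
Reference counting alone cannot free objects that refer to each other, which is why the cycle collector exists. A minimal sketch of that case on CPython, using a hypothetical Node class and a weak reference as a probe:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.partner = None

gc.disable()                 # keep the demo deterministic
a, b = Node(), Node()
a.partner, b.partner = b, a  # reference cycle: a -> b -> a
probe = weakref.ref(a)       # observes the object without keeping it alive

del a, b                     # refcounts never reach zero because of the cycle
alive_before = probe() is not None

gc.collect()                 # the cycle detector finds and frees both nodes
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)  # True False on CPython
```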

14.5 Design Patterns in Python


Observer Pattern
from typing import Callable

class EventEmitter:
    def __init__(self):
        self._handlers: dict[str, list[Callable]] = {}

    def on(self, event: str, handler: Callable):
        self._handlers.setdefault(event, []).append(handler)
        return self # Allow chaining

    def emit(self, event: str, *args, **kwargs):
        for handler in self._handlers.get(event, []):
            handler(*args, **kwargs)

emitter = EventEmitter()
emitter.on("data", lambda d: print(f"Got: {d}"))
emitter.on("data", lambda d: print(f"Logging: {d}"))
emitter.emit("data", {"user": "Alice"})
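
Handlers usually need to be removable as well. A sketch extending the same idea with a hypothetical off() method, which is an addition for illustration and not part of the class above:

```python
from typing import Callable

class EventEmitter:
    def __init__(self):
        self._handlers: dict[str, list[Callable]] = {}

    def on(self, event: str, handler: Callable):
        self._handlers.setdefault(event, []).append(handler)
        return self

    def off(self, event: str, handler: Callable):
        # Hypothetical addition: remove a previously registered handler, if present.
        handlers = self._handlers.get(event, [])
        if handler in handlers:
            handlers.remove(handler)
        return self

    def emit(self, event: str, *args, **kwargs):
        # Iterate over a copy so handlers may unsubscribe while being called.
        for handler in list(self._handlers.get(event, [])):
            handler(*args, **kwargs)

received = []
record = received.append      # keep one reference so on() and off() see the same handler
emitter = EventEmitter()
emitter.on("data", record)
emitter.emit("data", 1)
emitter.off("data", record)
emitter.emit("data", 2)       # no handler left, so nothing is recorded
```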

Strategy Pattern
from typing import Protocol

class SortStrategy(Protocol):
    def sort(self, data: list) -> list: ...

class BubbleSort:
    def sort(self, data): ... # Implementation

class QuickSort:
    def sort(self, data): return sorted(data) # Delegates to built-in sorted() for brevity

class DataProcessor:
    def __init__(self, strategy: SortStrategy):
        self.strategy = strategy

    def process(self, data):
        return self.strategy.sort(data)

dp = DataProcessor(QuickSort())
dp.process([3,1,4,1,5]) # [1,1,3,4,5]
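
Because functions are first-class objects, a strategy in Python is often just a callable rather than a class. A sketch of the same idea, assuming a hypothetical process(data, strategy) function and a made-up descending() strategy:

```python
from typing import Callable

SortFn = Callable[[list], list]

def descending(data: list) -> list:
    return sorted(data, reverse=True)

def process(data: list, strategy: SortFn = sorted) -> list:
    # The strategy is any callable that takes a list and returns a sorted list.
    return strategy(data)

ascending_result = process([3, 1, 4, 1, 5])
descending_result = process([3, 1, 4, 1, 5], strategy=descending)
```

The class-based form remains useful when a strategy needs configuration or state.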

MODULE 15

Best Practices & Pythonic Code

15. Best Practices, Idioms & Pythonic Code

15.1 PEP 8 — Style Guide


PEP 8 is the de facto style guide for Python. Tools such as black (auto-formatter), flake8 (linter), and isort (import
sorter) check or apply most of its conventions automatically.

Convention          Used For

snake_case          Variables, functions, methods, modules
PascalCase          Classes
UPPER_SNAKE_CASE    Constants
_single_leading     Private by convention
__double_leading    Name mangling (avoids clashes in subclasses; not truly private)
__dunder__          Special/magic methods

# Good PEP 8 style
import os
import sys
from pathlib import Path
from typing import Optional

MAX_RETRIES: int = 3
DEFAULT_TIMEOUT: float = 30.0

class StudentDatabase:
    """A database of students. (Class docstring)"""

    def __init__(self, host: str, port: int = 5432) -> None:
        self.host = host
        self.port = port
        self._connection = None

    def connect(self, timeout: Optional[float] = None) -> bool:
        """Connect to the database.

        Args:
            timeout: Connection timeout in seconds.

        Returns:
            True if connection successful.

        Raises:
            ConnectionError: If connection fails.
        """
        ...

    @property
    def is_connected(self) -> bool:
        return self._connection is not None

15.2 Pythonic Idioms


# Use enumerate instead of range(len(...))
# BAD
for i in range(len(items)):
    print(i, items[i])
# GOOD
for i, item in enumerate(items):
    print(i, item)

# Use zip for parallel iteration
# BAD
for i in range(len(names)):
    print(names[i], scores[i])
# GOOD
for name, score in zip(names, scores):
    print(name, score)

# Use dict.get() with default
# BAD
if key in d:
    value = d[key]
else:
    value = default
# GOOD
value = d.get(key, default)

# Use 'in' for membership testing (not index)
# BAD (index() raises ValueError when the item is missing)
if items.index("x") >= 0: ...
# GOOD
if "x" in items: ...

# Return early to reduce nesting
# BAD (arrow-shaped code)
def process(data):
    if data:
        if isinstance(data, list):
            if len(data) > 0:
                return data[0]
# GOOD (guard clauses)
def process(data):
    if not data: return None # Also covers the empty-list case
    if not isinstance(data, list): return None
    return data[0]

# Use context managers
# BAD
f = open("[Link]")
content = f.read()
f.close()
# GOOD
with open("[Link]") as f:
    content = f.read()

# Swap variables
# BAD (C-style)
temp = a; a = b; b = temp
# GOOD
a, b = b, a

# Build strings with join, not concatenation
# BAD (O(n²): creates new string each iteration)
result = ""
for word in words:
    result += word + " "
# GOOD
result = " ".join(words)

# Use any() and all()
# BAD
found = False
for item in items:
    if condition(item):
        found = True
        break
# GOOD
found = any(condition(item) for item in items)
all_valid = all(validate(item) for item in items)

15.3 Performance Tips


# 1. Use local variables in tight loops (local lookup is faster)
import math
sqrt = math.sqrt # Bind to local for speed
result = [sqrt(x) for x in range(10000)]

# 2. Use sets for O(1) membership testing
# BAD: O(n)
if item in large_list: ...
# GOOD: O(1)
large_set = set(large_list)
if item in large_set: ...

# 3. Use slots for memory-intensive classes (see 14.3)

# 4. Use generators for large data
# BAD: builds entire list in memory
total = sum([x**2 for x in range(1_000_000)])
# GOOD: lazy evaluation
total = sum(x**2 for x in range(1_000_000))

# 5. Profile before optimizing
import cProfile
cProfile.run('my_function()')

# 6. Use bisect for sorted list operations
import bisect
sorted_list = [1, 3, 5, 7, 9]
bisect.insort(sorted_list, 4) # O(log n) search; the list insertion itself is O(n)

# 7. lru_cache for expensive pure functions
import time
from functools import lru_cache
@lru_cache(maxsize=128)
def expensive_computation(n: int) -> int:
    time.sleep(0.1) # Simulate expensive work
    return n * n

# 8. Use array module for typed numeric arrays
import array
a = array.array('d', [1.0, 2.0, 3.0]) # More memory-efficient than list
# Or use numpy for numerical computing
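
Whether a cache actually helps can be checked with the cache_info() method that lru_cache adds to the decorated function. A short sketch, using a hypothetical square() function and a call counter added only for the demonstration:

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=128)
def square(n: int) -> int:
    global call_count
    call_count += 1       # counts real executions, not cache hits
    return n * n

results = [square(n) for n in (3, 4, 3, 3, 4)]
info = square.cache_info()
print(info)  # hits=3, misses=2 for this call sequence
```

A low hit count relative to misses suggests the cache is adding overhead without saving work.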

15.4 Security Best Practices


import secrets, hashlib, hmac

# 1. Use secrets for cryptographically secure random
token = secrets.token_hex(32) # 64-char hex string
password = secrets.token_urlsafe(16) # URL-safe base64

# 2. Hash passwords with bcrypt (not a plain hashlib digest such as SHA-256)
# pip install bcrypt
import bcrypt
password = b"super_secret_password"
hashed = bcrypt.hashpw(password, bcrypt.gensalt(rounds=12))
bcrypt.checkpw(password, hashed) # True

# 3. Constant-time comparison to prevent timing attacks
hmac.compare_digest(a, b) # NOT ==

# 4. Never use eval() or exec() with user input
# BAD:
user_input = "os.system('rm -rf /')"
# eval(user_input) # CATASTROPHIC: would execute attacker-controlled code

# 5. Use parameterized queries, never f-strings for SQL
# BAD (SQL injection!):
query = f"SELECT * FROM users WHERE name = '{user_input}'"
# GOOD:
cursor.execute("SELECT * FROM users WHERE name = ?", (user_input,))

# 6. Validate and sanitize all external inputs
import re
def safe_username(name: str) -> str:
    # fullmatch avoids the trailing-newline loophole of '$' with re.match
    if not re.fullmatch(r'[a-zA-Z0-9_]{3,32}', name):
        raise ValueError("Invalid username")
    return name
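
The difference between string-built and parameterized SQL can be demonstrated end to end with the standard-library sqlite3 module. A minimal sketch, assuming an in-memory database with a made-up users table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "student")])

user_input = "nobody' OR '1'='1"   # classic injection payload

# BAD: the payload becomes part of the SQL text and matches every row.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# GOOD: the value is passed separately and compared as a literal string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
conn.close()
```

The placeholder syntax varies by driver (sqlite3 uses ?, some others use %s), but the principle of passing values separately is the same.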

15.5 Project Structure


# Recommended project layout (src layout)
my_project/
├── src/
│   └── my_package/
│       ├── __init__.py
│       ├── [Link]
│       ├── [Link]
│       ├── [Link]
│       └── [Link]
├── tests/
│   ├── __init__.py
│   ├── test_core.py
│   └── test_models.py
├── docs/
├── .github/
│   └── workflows/
│       └── [Link]
├── pyproject.toml # PEP 517/518 build config
├── [Link] # or [Link]
├── [Link]
├── [Link]
├── .[Link]
├── .flake8
└── [Link]

# pyproject.toml (modern Python packaging)


# [build-system]
# requires = ["setuptools>=61", "wheel"]
# build-backend = "setuptools.build_meta"
#
# [project]
# name = "my-package"
# version = "1.0.0"
# requires-python = ">=3.10"
# dependencies = ["requests>=2.28", "pydantic>=2.0"]

End of Python Comprehensive Notes


Harvard University • Department of Computer Science • CS 50P
"In Python, readability counts — and so does deep understanding."
