GUJARAT TECHNOLOGICAL UNIVERSITY
Master of Engineering | Semester 2
ME02000301 – Advance Algorithms
Complete Study Notes: Units 1 – 4
UNIT 1: Sorting & Graph Algorithms
📌 What is this unit about?
Unit 1 covers: (1) Review of sorting algorithms, (2) Topological Sorting, (3) Graph basics, (4)
Shortest paths using BFS & Dijkstra, (5) DFS & Strongly Connected Components, (6)
Amortized Analysis.
1.1 Sorting Algorithms – Review
Sorting means arranging elements in a specific order (e.g., ascending). Understanding sorting is
fundamental because many complex algorithms depend on sorted data.
Why do we study sorting?
• Faster searching in sorted arrays (Binary Search)
• Many graph algorithms require sorting of edges by weight
• Helps understand algorithm design paradigms: Divide & Conquer, Comparison-based sorting
1.1.1 Bubble Sort
Idea: Compare adjacent elements and swap if out of order. Repeat until no swaps are needed. It
'bubbles' the largest element to the end in each pass.
🔢 Step-by-step Example
Array: [5, 3, 8, 1] Pass 1: Compare (5,3)→swap→[3,5,8,1], (5,8)→ok,
(8,1)→swap→[3,5,1,8] Pass 2: (3,5)→ok, (5,1)→swap→[3,1,5,8], (5,8)→ok Pass 3:
(3,1)→swap→[1,3,5,8] ✓ Done!
Bubble Sort (Array A, size n):
for i = 0 to n-1:
for j = 0 to n-i-2:
if A[j] > A[j+1]:
swap(A[j], A[j+1])
Property Value
Time Complexity (Best) O(n) — already sorted
Time Complexity (Worst) O(n²)
Space Complexity O(1) — in-place
Stable? Yes
1.1.2 Selection Sort
Idea: Find the minimum element from the unsorted part and place it at the beginning. Repeat for
remaining elements.
🔢 Step-by-step Example
Array: [64, 25, 12, 22] Step 1: Min=12, swap with 64 → [12, 25, 64, 22] Step 2: Min=22,
swap with 25 → [12, 22, 64, 25] Step 3: Min=25, swap with 64 → [12, 22, 25, 64] ✓
Property Value
Time Complexity O(n²) always
Space Complexity O(1)
Stable? No
Use when Memory writes are expensive
1.1.3 Insertion Sort
Idea: Build the sorted array one item at a time. Pick the next element and insert it into its correct
position in the already-sorted part.
🔢 Step-by-step Example
Array: [4, 2, 5, 1] Step 1: [4] sorted. Take 2: 2<4 so insert before → [2, 4] Step 2: Take 5:
5>4, no move → [2, 4, 5] Step 3: Take 1: 1<5, 1<4, 1<2, insert at start → [1, 2, 4, 5] ✓
Property Value
Time Complexity (Best) O(n) — nearly sorted
Time Complexity (Worst) O(n²)
Stable? Yes
Great for Small arrays or nearly sorted data
1.1.4 Merge Sort
Idea: Divide the array into two halves, recursively sort each half, then merge the sorted halves. This is a
classic 'Divide and Conquer' algorithm.
🔢 Step-by-step Example
Array: [38, 27, 43, 3] Divide: [38, 27] | [43, 3] Divide again: [38] [27] [43] [3] Merge: [27, 38]
[3, 43] Merge again: [3, 27, 38, 43] ✓
MergeSort(A, left, right):
if left < right:
mid = (left + right) / 2
MergeSort(A, left, mid)
MergeSort(A, mid+1, right)
Merge(A, left, mid, right)
Property Value
Time Complexity O(n log n) always
Space Complexity O(n) — extra array
Stable? Yes
Preferred for Linked lists, large datasets
1.1.5 Quick Sort
Idea: Pick a 'pivot' element. Partition the array so all elements smaller than pivot go left, and larger go
right. Recursively sort both sides.
🔢 Step-by-step Example
Array: [3, 6, 8, 10, 1, 2, 1], pivot=3 Left of 3: [1, 2, 1] → sorted → [1, 1, 2] Right of 3: [6, 8,
10] → sorted → [6, 8, 10] Result: [1, 1, 2, 3, 6, 8, 10] ✓
Property Value
Time Complexity (Average) O(n log n)
Time Complexity (Worst) O(n²) — bad pivot choice
Space Complexity O(log n)
In practice Fastest general-purpose sort
1.1.6 Heap Sort
Idea: Build a max-heap (parent always > children). Repeatedly extract the maximum from the heap and
place it at the end of the array.
Key Concept: Max-Heap
In a max-heap, the largest element is always at the root (index 0). When we remove the
root, we fix the heap property again. This process of fixing is called 'heapify'.
Property Value
Time Complexity O(n log n) always
Space Complexity O(1)
Stable? No
Use when You need guaranteed O(n log n) with no extra space
📊 Sorting Algorithms — Big Comparison Table
1.2 Topological Sorting
Topological sorting is a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every
directed edge u → v, vertex u comes before v.
🧠 Real-life Analogy
Think of university courses: You must take Math 101 BEFORE Advanced Calculus.
Topological sort gives you a valid order to take courses without violating prerequisites.
When can we use Topological Sort?
• Only works on DAGs — Directed Acyclic Graphs (no cycles!)
• If there's a cycle (e.g., A→B→A), topological sort is impossible
Method 1: Kahn's Algorithm (BFS-based)
1. Calculate in-degree (number of incoming edges) for all vertices
2. Add all vertices with in-degree = 0 to a queue
3. While queue is not empty: • Remove a vertex from queue, add it to result • For each
neighbor, reduce its in-degree by 1 • If neighbor's in-degree becomes 0, add it to queue
4. If result contains all vertices → valid topological order Otherwise → cycle exists!
🔢 Example
Graph: A→C, B→C, B→D, C→E, D→F, E→F In-degrees: A=0, B=0, C=2, D=1, E=1, F=2
Queue starts: [A, B] Process A → C's in-degree=1, Process B → C=0, D=0 Queue: [C, D] →
Process C → E's in-degree=0 Result: A, B, C, D, E, F ✓
Method 2: DFS-based
5. Do a DFS on the graph
6. When a vertex's DFS finishes (all its neighbors explored), PUSH it to a stack
7. Pop the stack to get topological order
⏱ Time Complexity
O(V + E) where V = number of vertices, E = number of edges. Both methods are equally
efficient.
1.3 Graph Basics & Elementary Algorithms
What is a Graph?
A graph G = (V, E) consists of:
• V = Set of Vertices (nodes) — e.g., cities
• E = Set of Edges (connections) — e.g., roads between cities
Graph Type Description
Undirected Edges have no direction. A-B means A connects to B and B connects
to A.
Directed (Digraph) Edges have direction. A→B means only A to B.
Weighted Edges have a cost/weight (e.g., distance between cities).
Unweighted All edges treated equally.
Cyclic Contains at least one cycle (loop).
Acyclic (DAG) No cycles. Used in scheduling, dependency graphs.
Graph Representations
Adjacency Matrix
A 2D array where matrix[i][j] = 1 (or weight) if there's an edge from i to j, else 0.
✅ Good for
Dense graphs (many edges). O(1) edge lookup.
❌ Bad for
Sparse graphs — wastes O(V²) space.
Adjacency List
An array of lists. Each vertex has a list of its neighbors.
✅ Good for
Sparse graphs. Space = O(V + E).
❌ Bad for
Checking if a specific edge exists — takes O(degree) time.
1.4 BFS — Breadth First Search & Shortest Path
BFS explores a graph level by level, starting from a source vertex. It visits all neighbors at distance 1,
then distance 2, etc.
🧠 Real-life Analogy
Imagine ripples in water. You drop a stone (source), and waves spread outward equally in
all directions. BFS is exactly that — it explores all nodes at distance d before going to
distance d+1.
BFS Algorithm — Step by Step
8. Create a queue. Mark source as visited. Enqueue source.
9. While queue is not empty:
• Dequeue a vertex u
• For each unvisited neighbor v of u:
– Mark v as visited
– Record distance[v] = distance[u] + 1
– Record parent[v] = u
– Enqueue v
BFS(Graph G, source s):
for each vertex u: visited[u] = false, dist[u] = ∞
visited[s] = true, dist[s] = 0
Queue Q = {s}
while Q not empty:
u = Dequeue(Q)
for each neighbor v of u:
if not visited[v]:
visited[v] = true
dist[v] = dist[u] + 1
parent[v] = u
Enqueue(Q, v)
🔢 BFS Example
Graph: 1-2, 1-3, 2-4, 3-4, 4-5. Source = 1 Queue: [1] → dequeue 1, visit 2,3 → Queue:[2,3]
Dequeue 2, visit 4 → Queue:[3,4] Dequeue 3, (4 already visited) → Queue:[4] Dequeue 4,
visit 5 → Queue:[5] Done! Distances: d(2)=1, d(3)=1, d(4)=2, d(5)=3
Property Value
Time Complexity O(V + E)
Space Complexity O(V)
Finds shortest path? Yes — in UNWEIGHTED graphs only
Data Structure used Queue (FIFO)
1.5 Dijkstra's Algorithm — Shortest Path in Weighted
Graphs
Dijkstra's algorithm finds the shortest path from a single source to all other vertices in a graph with non-
negative weights.
🧠 Why BFS doesn't work for weighted graphs
BFS counts the number of edges (hops). But if edges have different weights, a path with
fewer hops may cost MORE. Example: Path A→B = 1 hop but weight 100. Path A→C→B =
2 hops but weight 3. BFS would choose A→B wrongly!
Key Idea
Maintain a set of visited vertices. At each step, pick the unvisited vertex with the smallest known
distance, update its neighbors' distances. Repeat.
Dijkstra — Step by Step
10. Set dist[source] = 0, dist[all others] = ∞
11. Add all vertices to a priority queue (min-heap) based on distance
12. While priority queue is not empty:
• Extract vertex u with minimum distance
• For each neighbor v of u with edge weight w:
– newDist = dist[u] + w
– If newDist < dist[v]:
• dist[v] = newDist ← this is called 'RELAXATION'
• parent[v] = u
• Update v in priority queue
Dijkstra(Graph G, source s):
for each vertex v: dist[v] = ∞, visited[v] = false
dist[s] = 0
MinHeap H = all vertices with dist values
while H not empty:
u = ExtractMin(H) // get vertex with smallest dist
visited[u] = true
for each neighbor v of u (with edge weight w):
if not visited[v] and dist[u] + w < dist[v]:
dist[v] = dist[u] + w
parent[v] = u
DecreaseKey(H, v, dist[v])
🔢 Dijkstra Example
Graph: A→B(4), A→C(2), C→B(1), B→D(5), C→D(8) Start: dist = {A:0, B:∞, C:∞, D:∞} Pick
A: update B=4, C=2 Pick C (min=2): B = min(4, 2+1)=3 ✓, D = 2+8=10 Pick B (min=3): D =
min(10, 3+5)=8 ✓ Pick D: no updates Final: dist={A:0, B:3, C:2, D:8}
Property Value
Time Complexity (Simple) O(V²)
Time Complexity (Min-Heap) O((V + E) log V)
Works with negative weights? NO! Use Bellman-Ford instead
Algorithm type Greedy
⚠️Important Limitation
Dijkstra fails with negative edge weights! If any edge has negative weight, use Bellman-Ford
algorithm instead.
1.6 DFS — Depth First Search
DFS explores as far as possible along a branch before backtracking. It goes deep first, unlike BFS
which goes wide.
🧠 Real-life Analogy
Exploring a maze: you always keep going forward until you hit a dead-end, then you
backtrack to the last junction and try a different path.
DFS(Graph G, vertex u):
mark u as visited
time = time + 1
discovery[u] = time // when we FIRST visit u
for each neighbor v of u:
if not visited[v]:
parent[v] = u
DFS(G, v)
time = time + 1
finish[u] = time // when we FINISH u (all descendants explored)
DFS Edge Classification
DFS classifies edges into 4 types. This is important for many algorithms:
Edge Type Meaning
Tree Edge Edge in the DFS tree (u → unvisited v)
Back Edge u → ancestor of u (indicates a CYCLE in directed graph)
Forward Edge u → descendant (non-tree edge going forward)
Cross Edge u → vertex in different DFS subtree
Property Value
Time Complexity O(V + E)
Space Complexity O(V)
Data Structure Stack (or recursive call stack)
Detects cycles? Yes — via back edges
1.7 Strongly Connected Components (SCCs)
A Strongly Connected Component is a maximal set of vertices where every vertex is reachable from
every other vertex.
🧠 Real-life Analogy
In a city road map: An SCC is a group of places where you can reach ANY place from ANY
other place. If you can go from A to B but not B to A, they're not in the same SCC.
Kosaraju's Algorithm (2-Pass DFS)
13. Run DFS on the original graph. Record finish times. Push to stack.
14. Create the TRANSPOSE graph (reverse all edges)
15. Run DFS on transpose graph in ORDER of decreasing finish time (pop from stack)
16. Each DFS tree in step 3 is ONE Strongly Connected Component
🔢 Why does this work?
If you can reach B from A in original, AND can reach A from B in the reversed graph (i.e., A
from B in original), then A and B are in the same SCC!
Property Value
Time Complexity O(V + E) — two DFS passes
Space Complexity O(V + E) for transpose graph
Application Finding clusters in social networks, compilers, web crawling
1.8 Amortized Analysis
Amortized analysis finds the average cost of operations over a sequence, even if individual operations
vary in cost. It's NOT the same as average-case analysis.
🧠 Simple Analogy
You go to the gym 20 days a month. One day you pay ₹2000 for a yearly membership. The
amortized cost per visit = ₹2000 / 240 visits per year ≈ ₹8.3. Not every visit costs ₹2000, but
spread out it's cheap.
Three Methods of Amortized Analysis
1. Aggregate Method
Total cost of n operations = T(n). Amortized cost = T(n)/n.
Example: Stack with MultiPop
Push: O(1) each. Pop: O(1) each. MultiPop(k): O(k) but can only pop what was pushed!
Total pushes over n ops ≤ n. So total MultiPops total ≤ n. Amortized cost per operation =
O(n)/n = O(1).
2. Accounting (Banker's) Method
Assign an 'amortized cost' to each operation. Some operations are 'overcharged' to save credit for
expensive future operations.
Example
Push: charge ₹2 (₹1 for actual work, ₹1 saved as credit) Pop: charge ₹0 (use saved credit)
Credit is always ≥ 0, so total amortized ≥ total actual.
3. Potential Method
Define a potential function Φ. Amortized cost = actual cost + ΔΦ (change in potential). If Φ never goes
below initial value, total amortized ≥ total actual.
Method Key Idea Typical Use
Aggregate Divide total cost by n Simple sequences
Accounting Pre-charge operations, save credit Variable-cost operations
Potential Abstract potential function Sophisticated data structures
UNIT 2: Matroids & Graph Matching
📌 What is this unit about?
Unit 2 covers: (1) Matroids and the Greedy Paradigm, (2) Maximum Weight Independent
Set, (3) Application to Minimum Spanning Trees (MST), (4) Graph Matching, (5) Augmenting
Paths, (6) Edmond's Blossom Algorithm.
2.1 Matroids — Introduction to Greedy Paradigm
A matroid is a mathematical structure that captures the essence of 'independence' — it tells us when
greedy algorithms work optimally.
🧠 Big Picture
Not all problems can be solved greedily. Matroids identify the special structure where greedy
works PERFECTLY. If a problem can be modeled as a matroid, the greedy algorithm gives
the optimal solution!
Formal Definition of a Matroid
A matroid M = (S, I) consists of:
• S = a finite ground set of elements
• I = a family of subsets called 'independent sets'
It must satisfy three axioms:
Axiom Condition Meaning
1. Empty Set ∅∈I Empty set is always independent
2. Hereditary If B ∈ I and A ⊆ B, then A ∈ I Subsets of independent sets are independent
3. If A,B ∈ I and |A| < |B|, then ∃ Smaller independent set can be extended
Augmentatio x ∈ B\A such that A ∪ {x} ∈ I using element from larger one
n
Example: Graphic Matroid
S = edges of a graph, I = all acyclic subsets of edges (forests). This is a matroid! The augmentation
axiom holds because you can always add an edge to a smaller spanning forest from a larger one
without creating a cycle.
2.2 Maximum Weight Independent Set (Greedy on Matroid)
Problem
Given a matroid M = (S, I) where each element in S has a weight, find the independent set with
MAXIMUM total weight.
Greedy Algorithm
17. Sort all elements in S by weight in DECREASING order
18. Initialize A = ∅ (empty independent set)
19. For each element x (in sorted order):
• If A ∪ {x} ∈ I (i.e., still independent), then A = A ∪ {x}
20. Return A
✅ Correctness
This greedy algorithm is OPTIMAL for any matroid! It gives the maximum weight
independent set. This is a key theorem in algorithm design.
2.3 Application: Minimum Spanning Tree (MST)
What is MST?
Given a connected, weighted, undirected graph, a Minimum Spanning Tree is a tree that connects all
vertices with minimum total edge weight.
🧠 Real-life Analogy
You want to connect N cities with roads. Each possible road has a construction cost. MST
gives you the cheapest way to connect all cities.
Kruskal's Algorithm (Greedy — uses Matroid property)
21. Sort all edges by weight in INCREASING order
22. Initialize MST = empty set
23. For each edge (u, v) in sorted order:
• If adding (u,v) does NOT create a cycle:
– Add (u,v) to MST
24. Stop when MST has V-1 edges
Kruskal(Graph G = (V, E)):
Sort edges E by weight (ascending)
MST = ∅
Initialize Union-Find on V
for each edge (u, v, w) in sorted order:
if Find(u) ≠ Find(v): // no cycle!
MST = MST ∪ {(u,v)}
Union(u, v)
return MST
🔢 Example
Edges (sorted): (A,B,1), (C,D,2), (B,C,3), (A,D,4), (B,D,5) Add (A,B,1): MST={A-B} Add
(C,D,2): MST={A-B, C-D} Add (B,C,3): connects {A,B} with {C,D}, no cycle → MST={A-B,C-
D,B-C} Add (A,D,4): would create cycle A-B-C-D-A, SKIP Done! Total weight = 1+2+3 = 6
Prim's Algorithm
Another MST algorithm. Start from any vertex. Greedily add the minimum weight edge that connects
the current tree to a new vertex.
25. Start with any vertex s. Mark it visited.
26. While not all vertices are in tree:
• Find the minimum weight edge (u,v) where u is in tree and v is not
• Add v to tree, add edge (u,v) to MST
Algorithm Time Complexity Best For
Kruskal's O(E log E) Sparse graphs
Prim's (simple) O(V²) Dense graphs
Prim's (Fibonacci O(E + V log V) Very dense graphs
heap)
2.4 Graph Matching
What is a Matching?
A matching in a graph is a set of edges where NO two edges share a common vertex. Each vertex is
matched to at most one other vertex.
🧠 Real-life Analogy
Matching medical students to hospitals, matching job applicants to positions. Each person
gets at most one match.
Type of Matching Meaning
Matching Set of vertex-disjoint edges
Maximal Matching Can't add more edges without conflict
Maximum Matching Largest possible matching (most edges)
Perfect Matching Every vertex is matched (only possible when |V| is even)
2.5 Augmenting Paths
Key Theorem (Berge's Theorem)
A matching M is maximum if and only if there is NO augmenting path with respect to M.
What is an Augmenting Path?
An augmenting path with respect to matching M is a path that:
• Starts and ends at UNMATCHED vertices
• Alternates between NON-matching edges and MATCHING edges
• Has one more non-matching edge than matching edges
Why is it called 'augmenting'?
If you find such a path, you can INCREASE the matching by 1: flip all edges on the path
(matched ↔ unmatched). This increases |M| by 1!
🔢 Example
Matching M = {(A,B), (C,D)} Path: X → A → B → C → D → Y (X,Y unmatched) Flipping: M
becomes {(X,A), (B,C), (D,Y)} — 3 edges instead of 2! ✓
2.6 Edmond's Blossom Algorithm
The Problem with Simple Augmenting Path Search
In bipartite graphs (two sides, edges only between sides), finding augmenting paths is easy with BFS.
But in GENERAL graphs (any vertex to any vertex), there can be ODD CYCLES called 'blossoms' that
confuse simple BFS.
What is a Blossom?
A blossom is an odd-length cycle (3, 5, 7... vertices) where all but one vertex is matched within the
cycle. They 'trap' augmenting paths and make them hard to find.
Edmond's Key Idea: Contraction
27. Run BFS to find augmenting paths from unmatched vertices
28. When you encounter a blossom (odd cycle): CONTRACT it into a single 'super-vertex'
29. Continue searching for augmenting paths in the contracted graph
30. If found, EXPAND the blossom back and reconstruct the path
Property Value
Time Complexity O(V³) original; O(V · E) improved versions
Works on General graphs (not just bipartite)
Key Operation Blossom contraction and expansion
Result Guarantees finding maximum matching
UNIT 3: Flow Networks & Matrix Computations
📌 What is this unit about?
Unit 3 covers: (1) Flow Networks, (2) Max-Flow Min-Cut Theorem, (3) Ford-Fulkerson
Method, (4) Edmonds-Karp Algorithm, (5) Matrix Computations — Strassen's Algorithm, (6)
Matrix Operations & LUP Decomposition.
3.1 Flow Networks
What is a Flow Network?
A flow network is a directed graph G = (V, E) where:
• Each edge (u,v) has a capacity c(u,v) ≥ 0 — maximum flow it can carry
• There is a SOURCE vertex s (where flow originates)
• There is a SINK vertex t (where flow ends)
🧠 Real-life Analogy
Water pipes: source is a reservoir, sink is a city, edges are pipes with capacity (max flow),
and we want to maximize water delivery to the city.
Flow Properties (must be satisfied)
Property Condition Meaning
Capacity 0 ≤ f(u,v) ≤ c(u,v) Flow on edge can't exceed capacity
Constraint
Flow Conservation For all v except s,t: flow in = flow What comes in must go out
out
Skew Symmetry f(u,v) = -f(v,u) Flow from u→v is negative of v→u
flow
The VALUE of a flow = total flow out of source = total flow into sink.
3.2 Max-Flow Min-Cut Theorem
What is a Cut?
A cut (S, T) partitions vertices into two sets where s ∈ S and t ∈ T. The capacity of the cut = sum of
capacities of edges going from S to T.
The Theorem
⭐ Max-Flow Min-Cut Theorem
MAXIMUM flow value = MINIMUM cut capacity This is one of the most elegant results in
algorithm theory!
This means: the bottleneck in any network is determined by the minimum capacity cut. If you remove
the min-cut edges, no flow can reach the sink at all.
3.3 Ford-Fulkerson Method
Ford-Fulkerson is a method (framework) for computing maximum flow using augmenting paths.
Residual Graph
The residual graph G_r shows how much MORE flow we can push. For each edge (u,v) with flow f and
capacity c:
• Forward edge: residual capacity = c - f (can push MORE flow)
• Backward edge: residual capacity = f (can CANCEL existing flow)
Ford-Fulkerson — Step by Step
31. Start with zero flow on all edges
32. Build the residual graph
33. Find any path from s to t in the residual graph (augmenting path)
34. Find minimum capacity along this path = bottleneck
35. Update flow along this path by the bottleneck value
36. Update residual graph
37. Repeat from step 3 until no augmenting path exists
FordFulkerson(Graph G, source s, sink t):
Initialize flow f = 0 on all edges
while exists augmenting path p in residual graph:
bottleneck = min capacity along path p
for each edge (u,v) in p:
f(u,v) = f(u,v) + bottleneck // forward
f(v,u) = f(v,u) - bottleneck // backward
return total flow
🔢 Example
Graph: s→A(3), s→B(2), A→t(2), B→t(3), A→B(2) Path 1: s→A→t, bottleneck=min(3,2)=2.
Flow=2 Residual: s→A=1, A→s=2, A→t=0, t→A=2 Path 2: s→A→B→t,
bottleneck=min(1,2,3)=1. Flow=3 Path 3: s→B→t, bottleneck=min(2-0,3-1)=min(2,2)=2
using s→B(2)→t: Flow=5 Max Flow = 5
Property Value
Time Complexity O(E · |f*|) where f* = max flow value
Problem Can be slow if max flow is large (infinite loop possible with
irrational weights)
Type Method/Framework (BFS or DFS to find augmenting path)
3.4 Edmonds-Karp Algorithm
Edmonds-Karp is Ford-Fulkerson with a specific rule: always use BFS to find the augmenting path
(shortest path in terms of number of edges).
Why BFS?
Using BFS guarantees the path has the FEWEST edges. This prevents the worst-case
behavior of Ford-Fulkerson and gives a polynomial time bound!
Property Value
Time Complexity O(V · E²)
Key Difference from Ford- Always uses BFS (shortest augmenting path)
Fulkerson
Number of augmenting paths At most O(V · E) — polynomial!
Advantage Guaranteed polynomial time, unlike basic Ford-Fulkerson
💡 Why O(V·E²)?
Each BFS takes O(E). Between two augmentations involving the same edge, that edge's
level increases. Each edge can be critical at most O(V) times. Total = O(V·E) augmentations
× O(E) per BFS = O(V·E²).
3.5 Matrix Computations — Strassen's Algorithm
Standard Matrix Multiplication
For two n×n matrices A and B, the standard algorithm computes C = A×B:
Standard MatMul(A, B, n):
for i = 1 to n:
for j = 1 to n:
C[i][j] = 0
for k = 1 to n:
C[i][j] += A[i][k] * B[k][j]
// Time: O(n³)
Strassen's Algorithm — Divide and Conquer
Strassen (1969) showed that 2×2 matrix multiplication needs only 7 multiplications instead of 8,
reducing the overall complexity!
The Key Idea
Divide each n×n matrix into four (n/2)×(n/2) sub-matrices. Instead of 8 recursive multiplications
(standard), use only 7 clever combinations.
Strassen's 7 Magic Products
Given A = [[A11, A12], [A21, A22]] and B = [[B11, B12], [B21, B22]]:
Product Formula
M1 (A11 + A22) × (B11 + B22)
M2 (A21 + A22) × B11
M3 A11 × (B12 − B22)
M4 A22 × (B21 − B11)
M5 (A11 + A12) × B22
M6 (A21 − A11) × (B11 + B12)
M7 (A12 − A22) × (B21 + B22)
Resulting Sub-matrices
Quadrant Formula
C11 M1 + M4 − M5 + M7
C12 M3 + M5
C21 M2 + M4
C22 M1 − M2 + M3 + M6
Property Value
Recurrence T(n) = 7T(n/2) + O(n²)
Time Complexity O(n^2.807) by Master Theorem
Standard Algorithm O(n³)
Improvement Significant for large matrices (n ≥ 100+)
🧠 Master Theorem Recap
T(n) = aT(n/b) + f(n) For Strassen: a=7, b=2, f(n)=O(n²) log_b(a) = log₂(7) ≈ 2.807 > 2 So
T(n) = O(n^2.807) [Case 1 of Master Theorem]
3.6 Relations Between Matrix Operations
Key insight: Many matrix operations have related time complexities. If one is solvable in T(n), others
often are too.
Operation Complexity Relation
Matrix Multiply (MM) O(n^ω), ω≈2.373 Base operation
Matrix Inversion O(n^ω) Reducible to MM
LU Decomposition O(n^ω) Reducible to MM
Determinant O(n^ω) Reducible to MM
Triangular Inverse O(n^ω) Using divide & conquer
3.7 Inverse of a Triangular Matrix
A triangular matrix is one where all elements either above or below the diagonal are zero. Inverting a
triangular matrix is easier than general matrix inversion.
Algorithm (Divide & Conquer)
For an upper triangular matrix T split into blocks:
T = [[A, B], [0, D]] where A and D are triangular
• Compute A⁻¹ recursively
• Compute D⁻¹ recursively
• Compute -A⁻¹ × B × D⁻¹
• T⁻¹ = [[A⁻¹, -A⁻¹BD⁻¹], [0, D⁻¹]]
3.8 LUP Decomposition
What is LUP?
LUP decomposition factors a matrix A = L × U × P⁻¹ where:
Matrix Meaning
L Lower triangular (all zeros above diagonal, ones on diagonal)
U Upper triangular (all zeros below diagonal)
P Permutation matrix (reorders rows for numerical stability)
🧠 Why do we need this?
Solving Ax = b directly is hard. But with LUP decomposition: 1. Rewrite as L(Ux) = P·b 2.
Solve Ly = Pb by forward substitution (easy! L is lower triangular) 3. Solve Ux = y by
backward substitution (easy! U is upper triangular)
LUP Algorithm Steps
38. Find the pivot: choose largest |A[i][k]| in column k (for numerical stability)
39. Swap rows to bring pivot to top → update permutation matrix P
40. Perform elimination: for rows below pivot, eliminate column k
41. Record multipliers in L, result in U
42. Recurse on (n-1)×(n-1) submatrix
Property Value
Time Complexity O(n³) — same as Gaussian elimination
Space O(n²) — stores L, U, P in place
Used for Solving linear systems, computing determinant, matrix
inverse
Advantage over Gaussian Numerically stable with partial pivoting
Elimination
🔢 Example
A = [[2, 3], [4, 7]] P = I (no swap needed) L = [[1,0],[2,1]], U = [[2,3],[0,1]] Verify: L×U = [[2,3],
[4,6+1]] = [[2,3],[4,7]] = A ✓
UNIT 4: Shortest Paths & Modulo Arithmetic
📌 What is this unit about?
Unit 4 covers: (1) Floyd-Warshall Algorithm (All-Pairs Shortest Path), (2) Dynamic
Programming paradigm with more examples, (3) Modulo Representation of integers and
polynomials, (4) Chinese Remainder Theorem (CRT), (5) Interpolation Problem.
4.1 Floyd-Warshall Algorithm
The Problem: All-Pairs Shortest Path
Dijkstra finds shortest path FROM ONE source. But what if we need shortest paths BETWEEN ALL
pairs of vertices? That's the All-Pairs Shortest Path (APSP) problem.
🧠 Real-life Analogy
A Google Maps for a city: you want to know the shortest travel time between every possible
pair of locations, not just from one starting point.
Brute Force: Run Dijkstra from every vertex → O(V · (V+E) log V). Floyd-Warshall is often simpler to
implement and works even with negative edge weights (as long as no negative cycles).
Key Idea — Dynamic Programming
Floyd-Warshall uses DP. Define:
d[i][j][k] = shortest path from i to j using ONLY vertices {1, 2, ..., k} as intermediate vertices
Recurrence
📐 The Formula
d[i][j][k] = min( d[i][j][k-1], d[i][k][k-1] + d[k][j][k-1] ) Meaning: Either the path from i to j
doesn't use vertex k at all, OR it goes through k (i→k then k→j).
Base Case
d[i][j][0] = weight of direct edge (i,j) if it exists, else ∞. d[i][i][0] = 0 (distance to itself).
Floyd-Warshall Algorithm
FloydWarshall(Graph G with n vertices):
Initialize D[n][n] with edge weights (∞ if no edge, 0 on diagonal)
Initialize Next[i][j] = j (for path reconstruction)
for k = 1 to n: // consider each vertex as intermediate
for i = 1 to n: // for each source
for j = 1 to n: // for each destination
if D[i][k] + D[k][j] < D[i][j]:
D[i][j] = D[i][k] + D[k][j]
Next[i][j] = Next[i][k] // update path
// Check for negative cycles: if D[i][i] < 0 for any i → negative cycle!
return D, Next
🔢 Step-by-step Example
Graph: A→B(3), A→C(8), A→D(-4), B→D(1), B→E(7), C→B(4), D→C(2), E→A(2), E→D(6)
Initial D matrix (A=1,B=2,C=3,D=4,E=5): D[1][2]=3, D[1][3]=8, D[1][4]=-4, D[2][4]=1, D[2]
[5]=7, D[3][2]=4, D[4][3]=2, D[5][1]=2, D[5][4]=6 After k=1 (through A): D[5][2]=2+3=5
(5→A→B), D[5][3]=2+8=10, D[5][4]=min(6,2-4)=min(6,-2)=-2! Final shortest paths after all
k: D[1][3]=-4+2=-2, etc.
Property Value
Time Complexity O(V³) — three nested loops
Space Complexity O(V²)
Negative weights? YES (allowed, but no negative cycles)
Negative cycle detection If D[i][i] < 0 after algorithm
Paradigm Dynamic Programming
4.2 Dynamic Programming — The Paradigm
What is Dynamic Programming (DP)?
DP is an algorithm design technique for solving problems with:
• Optimal Substructure: The optimal solution contains optimal solutions to subproblems
• Overlapping Subproblems: The same subproblems are solved multiple times (unlike Divide &
Conquer)
🧠 Simple Analogy
DP is like remembering answers to math problems you've solved before. Instead of
recalculating, you look up the stored answer. This is called MEMOIZATION.
DP Approaches
Approach Description When to use
Top-Down Recursion + cache results Natural recursive structure, don't need
(Memoization) all subproblems
Bottom-Up Fill table iteratively from small to Need all subproblems, better space
(Tabulation) large
Classic DP Examples
Example 1: Fibonacci Numbers
Without DP: Exponential O(2ⁿ) time (recomputes same values). With DP: O(n) time.
fib(n): fib_dp(n):
if n ≤ 1: return n dp[0]=0, dp[1]=1
return fib(n-1)+fib(n-2) for i=2..n: dp[i]=dp[i-1]+dp[i-2]
// O(2ⁿ) ← SLOW // O(n) ← FAST
Example 2: Longest Common Subsequence (LCS)
Given two strings X and Y, find the longest common subsequence.
Recurrence
LCS(i,j) = LCS(i-1,j-1)+1 if X[i]=Y[j] = max(LCS(i-1,j), LCS(i,j-1)) otherwise Time:
O(mn), Space: O(mn)
Example 3: 0/1 Knapsack
N items, each with weight w[i] and value v[i]. Knapsack capacity W. Maximize total value without
exceeding weight.
Recurrence
DP[i][w] = max value using first i items with capacity w = max(DP[i-1][w], v[i] + DP[i-1][w-
w[i]]) if w[i] ≤ w = DP[i-1][w] otherwise Time: O(nW)
Example 4: Matrix Chain Multiplication
Given n matrices, find the optimal parenthesization to minimize scalar multiplications.
Recurrence
m[i][j] = min cost to multiply matrices i through j m[i][j] = min over k from i to j-1: m[i][k] +
m[k+1][j] + p[i-1]·p[k]·p[j] Time: O(n³)
4.3 Modulo Representation of Integers
Why Modular Arithmetic?
Working with very large numbers is slow. Modular arithmetic lets us work with smaller numbers while
preserving important properties. It's the foundation of cryptography (RSA), hashing, and number theory
algorithms.
Basic Concepts
Term Meaning Example
a mod m Remainder when a is divided by m 17 mod 5 = 2
a ≡ b (mod m) a and b have the same remainder 17 ≡ 2 (mod 5)
mod m
Modular Addition (a + b) mod m = ((a mod m) + (b (13+7) mod 5 = (3+2) mod 5 = 0
mod m)) mod m
Modular (a × b) mod m = ((a mod m) × (b (13×7) mod 5 = (3×2) mod 5 = 1
Multiplication mod m)) mod m
4.4 Chinese Remainder Theorem (CRT)
The Theorem
If m₁, m₂, ..., mₖ are pairwise coprime integers (gcd(mᵢ, mⱼ) = 1 for i≠j), then for any integers r₁, r₂, ...,
rₖ, there exists a UNIQUE integer x such that:
📐 CRT
x ≡ r₁ (mod m₁) x ≡ r₂ (mod m₂) ... x ≡ rₖ (mod mₖ) Unique solution exists modulo M = m₁
× m₂ × ... × mₖ
🧠 Real-life Analogy
Imagine you arrange items in rows of 3 and have 2 left over, rows of 5 and have 3 left, rows
of 7 and have 2 left. CRT tells you there's a unique number of items (mod 105) satisfying all
conditions!
CRT — Step by Step
43. Calculate M = m₁ × m₂ × ... × mₖ
44. For each i, calculate Mᵢ = M / mᵢ
45. Find yᵢ = Mᵢ⁻¹ mod mᵢ (modular inverse of Mᵢ mod mᵢ)
46. Solution: x = (Σ rᵢ × Mᵢ × yᵢ) mod M
🔢 CRT Example
Solve: x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7) M = 3×5×7 = 105 M₁=35, M₂=21, M₃=15
Find inverses: y₁: 35y₁≡1(mod 3) → 2y₁≡1(mod 3) → y₁=2 y₂: 21y₂≡1(mod 5) →
y₂=1 y₃: 15y₃≡1(mod 7) → y₃=1 x = (2×35×2 + 3×21×1 + 2×15×1) mod 105 =
(140 + 63 + 30) mod 105 = 233 mod 105 = 23 ✓ Check: 23=7×3+2 ✓, 23=4×5+3 ✓,
23=3×7+2 ✓
Conversion Between Representations
Conversion Method
Standard → CRT x = (x mod m₁, x mod m₂, ..., x mod mₖ)
CRT → Standard Use CRT reconstruction formula above
Addition in CRT (a₁+b₁ mod m₁, a₂+b₂ mod m₂, ...)
Multiplication in CRT (a₁×b₁ mod m₁, a₂×b₂ mod m₂, ...)
4.5 Extension to Polynomials
Polynomial Representation
Just as integers can be represented in base-b form, polynomials can be represented in two ways:
• Coefficient representation: p(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ... + a₀
• Point-value representation: p(x) specified by n+1 points (x₀,y₀), ..., (xₙ,yₙ)
Representation Evaluation Addition Multiplication
Coefficient O(n) Horner's O(n) O(n²)
Point-value O(1) direct O(n) O(n)
💡 Key Insight
Multiplication in point-value form is O(n), much faster! But converting between forms takes
O(n log n) using FFT (Fast Fourier Transform).
Polynomial Modular Arithmetic
Similar to integer mod: p(x) mod q(x) = remainder when p is divided by q. CRT extends to polynomials:
If m₁(x), m₂(x), ..., mₖ(x) are pairwise coprime polynomials, then a polynomial p(x) can be uniquely
recovered from (p mod m₁, p mod m₂, ..., p mod mₖ).
4.6 Interpolation Problem
What is Interpolation?
Given n+1 points (x₀,y₀), (x₁,y₁), ..., (xₙ,yₙ) with distinct xᵢ, find the unique polynomial of degree ≤ n
that passes through ALL these points.
🧠 Why unique?
A degree-n polynomial has n+1 coefficients. n+1 points give n+1 equations. This system has
a unique solution (when xᵢ are distinct). The solution matrix (Vandermonde) is invertible!
Lagrange Interpolation Formula
📐 Formula
p(x) = Σᵢ yᵢ × Lᵢ(x) where Lᵢ(x) = Πⱼ≠ᵢ (x - xⱼ)/(xᵢ - xⱼ) Lᵢ(x) is the 'basis polynomial' that equals 1
at xᵢ and 0 at all other xⱼ.
🔢 Example
Points: (0,1), (1,3), (2,7). Find polynomial. L₀(x) = (x-1)(x-2)/((0-1)(0-2)) = (x-1)(x-2)/2 L₁(x)
= (x-0)(x-2)/((1-0)(1-2)) = x(x-2)/(-1) = -x(x-2) L₂(x) = (x-0)(x-1)/((2-0)(2-1)) = x(x-1)/2 p(x) =
1·L₀ + 3·L₁ + 7·L₂ = (x-1)(x-2)/2 - 3x(x-2) + 7x(x-1)/2 = (x²-3x+2)/2 + (-3x²+6x) + (7x²-7x)/2
= x² + x + 1 Check: p(0)=1 ✓, p(1)=3 ✓, p(2)=7 ✓
Property Value
Time Complexity O(n²) for Lagrange interpolation
Better algorithm Newton's divided differences: O(n²) but incremental
Application Error correction codes, secret sharing, numeric analysis
Unique? Yes — unique polynomial of degree ≤ n through n+1 points
QUICK REFERENCE — All Units 1-4
⏱ Time Complexities Summary
Algorithm Time Space Key Feature
Bubble/Selection/Insertion O(n²) O(1) Simple, small data
Sort
Merge Sort O(n log n) O(n) Stable, guaranteed
Quick Sort O(n log n) avg O(log n) Fastest in practice
Heap Sort O(n log n) O(1) No extra space
BFS O(V+E) O(V) Shortest path (unweighted)
DFS O(V+E) O(V) Cycle detection, SCC
Dijkstra O((V+E) log V) O(V) Shortest path (weighted, non-neg)
Topological Sort O(V+E) O(V) Only on DAGs
Kosaraju's SCC O(V+E) O(V+E) Strongly Connected Components
Kruskal's MST O(E log E) O(V) Sparse graphs
Prim's MST O(V²) O(V) Dense graphs
Ford-Fulkerson O(E·|f*|) O(V+E) Max flow (can be slow)
Edmonds-Karp O(V·E²) O(V+E) Max flow (polynomial)
Strassen's MatMul O(n^2.807) O(n²) Divide & Conquer
LUP Decomposition O(n³) O(n²) Solving linear systems
Floyd-Warshall O(V³) O(V²) All-pairs shortest path
Blossom Algorithm O(V·E) O(V+E) Max matching (general graph)
🧠 Algorithm Paradigms
Paradigm Key Idea Examples in Units 1-4
Divide & Conquer Split, solve recursively, Merge Sort, Quick Sort, Strassen's
combine
Greedy Always pick locally optimal Kruskal's MST, Dijkstra, Greedy on Matroid
choice
Dynamic Cache subproblem solutions Floyd-Warshall, LCS, Knapsack
Programming
Augmenting Path Iteratively improve Ford-Fulkerson, Matching algorithms
matching/flow
Amortized Average cost over sequence Stack operations, Union-Find
Analysis
📝 Exam Tips
1. For any shortest path question: Ask 'unweighted?' → BFS, 'single-source weighted?' →
Dijkstra, 'all-pairs?' → Floyd-Warshall 2. For flow problems: Use Ford-Fulkerson framework,
Edmonds-Karp for polynomial guarantee 3. For MST: Remember both Kruskal's (sort edges)
and Prim's (grow tree vertex by vertex) 4. Greedy works optimally for matroid problems —
this is the key theoretical insight 5. DP problems: identify optimal substructure first, write
recurrence, then code
BONUS SECTION: When to Use Which Algorithm?
🎯 Purpose of This Section
This section answers the most practical exam & real-life question: given a situation, WHICH
algorithm should you pick and WHY? Includes best case, worst case, and real-world
examples for every algorithm.
SORTING: Best Case, Worst Case & Real-Life Use
S1. Bubble Sort — When does it shine? When does it fail?
Scenario Performance Why?
Array is ALREADY BEST — O(n) In one pass, no swaps occur. With early-exit flag,
sorted it stops immediately.
Array is NEARLY sorted GREAT — O(n) Very few swaps needed. Almost like scanning
(1-2 elements out of once.
place)
Array is REVERSE WORST — O(n²) Every pair needs swapping every pass.
sorted (worst possible) Maximum comparisons and swaps.
Large random array TERRIBLE — Extremely slow. Avoid completely for large data.
(10,000+ elements) O(n²)
Tiny array (< 10 FINE Overhead is negligible. Simplicity wins.
elements)
🏭 Real-Life Scenario: Bubble Sort
Stock Ticker Display: A live stock screen shows 10 stocks. Every second, one stock's price
changes slightly, making the list nearly sorted. Bubble sort re-sorts it with very few swaps —
perfect! AVOID for: Sorting 1 million customer records — would take billions of operations.
Use Merge Sort instead.
S2. Selection Sort — When does it shine? When does it
fail?
Scenario Performance Why?
Any input (sorted, random, SAME — O(n²) Always scans the full unsorted portion to find
reverse) always minimum. No shortcuts.
Memory writes are VERY BEST CHOICE Makes at most O(n) swaps — fewer writes
expensive (e.g., flash than any other O(n²) sort.
storage)
Checking if already sorted BAD Still does O(n²) comparisons even on sorted
input. No early exit.
Large datasets POOR O(n²) in all cases. Not scalable.
🏭 Real-Life Scenario: Selection Sort
Embedded Systems / Flash Memory: Writing to flash memory is expensive (wears it out,
consumes power). Selection Sort minimizes writes — it finds the minimum first, then does
ONE swap. Used in IoT sensors and embedded microcontrollers where minimizing write
operations extends device life. AVOID for: Any general-purpose sorting where n > 1000.
S3. Insertion Sort — When does it shine? When does it
fail?
Scenario Performance Why?
Array is ALREADY BEST — O(n) Each element just compares with its left
SORTED neighbor and stays. Only n-1 comparisons.
Array is NEARLY sorted EXCELLENT — Each element travels only a short distance to
(few inversions) close to O(n) find its spot.
Small array (< 20 BEST CHOICE Low constant factors, simple code, cache-
elements) friendly. Beats quicksort here.
Online sorting (elements PERFECT Can insert each new element into already-
arrive one by one) sorted portion immediately.
REVERSE sorted array WORST — O(n²) Each new element must travel all the way to
the front. Maximum shifts.
Large random array POOR — O(n²) On average, each element shifts halfway
back. Not scalable.
🏭 Real-Life Scenario: Insertion Sort
Card Game Hand Sorting: When you pick up playing cards one by one, you naturally insert
each new card into the correct position in your hand. That's insertion sort! Online
Leaderboard (small): A gaming app maintains a top-10 leaderboard. When a new score
arrives, insertion sort inserts it into the correct rank — very fast for small lists. Used inside
TimSort (Python's built-in sort): Python uses Insertion Sort for small sub-arrays (size < 64)
within its hybrid TimSort algorithm. So it IS used in real production code!
S4. Merge Sort — When does it shine? When does it fail?
Scenario Performance Why?
Any input (sorted, random, EXCELLENT — O(n Guaranteed performance. Never degrades
reverse) log n) always like Quick Sort.
Sorting LINKED LISTS BEST CHOICE No random access needed. Merging linked
lists is O(1) extra space.
Need STABLE sort BEST CHOICE Merge sort is naturally stable. Quicksort and
(preserve equal elements' Heapsort are not.
order)
External sorting (data on BEST CHOICE Data read/written in sequential chunks —
disk/files) perfect for disk I/O.
Limited memory / in-place BAD Requires O(n) extra space for the auxiliary
needed array. Costly on tight memory.
Small arrays OVERKILL Overhead of recursion and auxiliary array
outweighs benefits for n < 20.
🏭 Real-Life Scenario: Merge Sort
Database Systems (External Sort): When sorting a 1TB database table that doesn't fit in
RAM, database engines (PostgreSQL, MySQL) use external merge sort. They sort chunks
that fit in RAM, write them to disk, then merge the sorted files. Git / Version Control: Merge
sort inspired the merge step in Git's 3-way merge of code changes. E-commerce Order
History: Amazon sorts millions of order records stably (preserving original order of same-
date orders). Merge sort is used for this stability guarantee.
S5. Quick Sort — When does it shine? When does it fail?
Scenario Performance Why?
Large random array BEST PRACTICAL Best cache performance, in-place, low
— O(n log n) avg constant factor. Fastest in practice.
Pivot = median (random BEST — O(n log n) Balanced partitions. Each level of recursion
pivot or median-of-3) has ~n/2 elements.
Already sorted array + bad WORST — O(n²) Pivot is always the min/max. Partitions are 0
pivot (first element) and n-1. Completely unbalanced.
All elements EQUAL WORST — O(n²) All elements go to same side. Fixed by 3-
without fix way partition (Dutch National Flag).
Need stable sort BAD Quicksort is NOT stable. Equal elements
can swap.
Small array (< 10) NOT WORTH IT Recursion overhead. Use Insertion Sort
instead (hybrid approach).
🏭 Real-Life Scenario: Quick Sort
C Standard Library (qsort): The qsort() function in C uses a variant of Quicksort. Used in
millions of programs daily. Java [Link]() for primitives: Java uses Dual-Pivot Quicksort
for sorting int[], double[], etc. because it's 10-20% faster than classic quicksort in practice.
Search Engines: When ranking millions of web pages by relevance score (a float), search
engines use quicksort variants because cache efficiency matters enormously at scale.
AVOID when: You need guaranteed O(n log n) (use Merge Sort) or stable sort (use Merge
Sort).
S6. Heap Sort — When does it shine? When does it fail?
Scenario Performance Why?
Need guaranteed O(n log BEST CHOICE Only algorithm with both guarantees
n) with O(1) space simultaneously. No worst case like Quicksort.
Finding Top-K elements EXCELLENT Build a min-heap of K elements. Process
from large dataset remaining n-K elements. Only O(K) space!
Any input (sorted, random, O(n log n) always Heapify then extract. Performance doesn't
reverse) depend on input order.
Cache performance NOT IDEAL Heap accesses memory in non-sequential
matters pattern (jumps around). Poor cache locality.
Need stable sort BAD Heap sort is NOT stable.
Nearly sorted data NOT OPTIMAL Can't take advantage of existing order like
Insertion Sort.
🏭 Real-Life Scenario: Heap Sort
Operating System Process Scheduling: Linux kernel's process scheduler uses a heap
(priority queue) to always pick the highest-priority runnable process in O(log n). Top-K
Results: 'Show the 10 most recent tweets from 50 million posts' — use a min-heap of size
10. Much faster than full sort! Network Packet Routing: Routers use priority queues (heap-
based) to process high-priority packets (e.g., VoIP) before lower-priority ones (e.g., file
downloads).
THE ULTIMATE SORTING DECISION GUIDE
Step-by-Step: How to choose the right sorting algorithm
Question to Ask If YES → Use If NO → Continue
Is n very small (< 20)? Insertion Sort — fastest for Next question
tiny arrays
Is data already / nearly sorted? Insertion Sort or Bubble Next question
Sort (with flag)
Is data on disk / external Merge Sort — sequential Next question
memory? disk I/O
Is it a Linked List? Merge Sort — no random Next question
access needed
Must the sort be STABLE? Merge Sort — only stable Next question
O(n log n) sort
Is memory very limited (O(1) Heap Sort — O(1) space, Next question
space)? O(n log n) guaranteed
Minimize memory WRITES Selection Sort — only O(n) Next question
(flash/embedded)? swaps
General purpose, need fastest Quick Sort (random pivot) Merge Sort (safe default)
in practice? — best average case
🏆 The Golden Rule
In 90% of real-world cases: Use Quick Sort (for primitives) or Merge Sort (for
objects/stability). Python uses TimSort (Merge + Insertion hybrid). Java uses Dual-Pivot
Quicksort for primitives and TimSort for objects. For interviews: Merge Sort is the safest
answer because it's O(n log n) guaranteed, stable, and works on linked lists.
Sorting Best Case vs Worst Case — At a Glance
BEST CASE WORST CASE
Algorithm Avoid when...
(when?) (when?)
Bubble Sort O(n) — already sorted O(n²) — reverse sorted n > 1000 or random data
Selection Sort O(n²) — always same O(n²) — always same General sorting, large n
Insertion Sort O(n) — already/nearly O(n²) — reverse sorted Large random arrays
sorted
Merge Sort O(n log n) — always O(n log n) — always Very memory-constrained
systems
Quick Sort O(n log n) — O(n²) — sorted + bad Need guaranteed time or
random/median pivot pivot stable sort
Heap Sort O(n log n) — always O(n log n) — always Cache-sensitive or stable-
sort needed
GRAPH ALGORITHMS: Which to Use When?
G1. BFS vs DFS — When to use which?
Use BFS when... Use DFS when...
Finding SHORTEST PATH in unweighted Detecting CYCLES in a graph
graph
Finding all nodes at distance k from source Topological sorting
Testing if graph is BIPARTITE Finding Strongly Connected Components
Web crawling level by level (pages at 1 hop, 2 Solving mazes (backtracking needed)
hops...)
Social network: find closest friends first Generating all paths between two nodes
Peer-to-peer networks: finding nearby peers Scheduling tasks with dependencies
🏭 Real-Life: BFS
WhatsApp 'People You May Know': Finds all your friends-of-friends (2 hops away) using
BFS. Starts at YOU, explores friends (level 1), then their friends (level 2). Google Maps —
Fewest Transfers: When you ask for directions with fewest bus/metro changes, Google uses
BFS on the transit graph. Network Broadcasting: A router broadcasts a message to all
devices on a LAN using BFS-like flooding.
🏭 Real-Life: DFS
File System Traversal: When you 'Find in Files' recursively in a folder, your OS uses DFS —
it goes deep into subfolders before coming back. Compiler: Parsing nested code blocks (if
inside for inside while) uses DFS on the syntax tree. Solving Sudoku / Chess AI:
Backtracking algorithms (try a move, go deep, backtrack if wrong) are DFS at heart.
G2. Dijkstra vs BFS vs Floyd-Warshall — Which to Use?
Situation Algorithm to Use Why
Unweighted graph, BFS O(V+E), simpler than Dijkstra
single source
Weighted graph (non- Dijkstra O((V+E) log V), handles weights
negative), single source
Weighted graph WITH Bellman-Ford (not in Dijkstra FAILS with negative edges
negative edges, single syllabus but good to
source know)
Need shortest path Floyd-Warshall O(V³) but finds everything at once
between ALL pairs of
vertices
Sparse graph, multiple Run Dijkstra from O(V(V+E) log V) — efficient for sparse graphs
sources each source
Dense graph, all pairs Floyd-Warshall O(V³) — simple implementation, better
constant
🏭 Real-Life: Dijkstra
Google Maps / GPS Navigation: Finding the fastest route from your location to a destination.
Edge weights = travel time. Dijkstra runs in real-time on road networks. Internet Routing —
OSPF Protocol: Routers use Dijkstra's algorithm (OSPF = Open Shortest Path First) to find
the fastest route for your data packets across the internet. Airlines: Finding cheapest flight
route from city A to city B where edge weights = ticket prices.
🏭 Real-Life: Floyd-Warshall
Navigation Apps — Precomputed Distance Matrix: Uber/Ola precomputes a distance table
between all major intersections in a city. Floyd-Warshall runs offline to fill this O(V²) table,
then trip ETA is a simple lookup. Network Analysis: Finding the 'diameter' of a social
network (longest shortest path between any two users). Traffic Control Systems: City traffic
management systems precompute all inter-junction distances for signal optimization.
G3. Kruskal's vs Prim's MST — When to use which?
Situation Kruskal's Prim's
Graph type Better for SPARSE graphs Better for DENSE graphs (many
(few edges) edges)
Edge list available? YES — sorts edge list directly Less natural — needs adjacency
matrix/list
Time complexity O(E log E) — dominated by O(V²) simple, O(E + V log V) with
sorting heap
Parallel / distributed? Easier to parallelize edge Sequential by nature
sorting
Adding edges Easy — just re-sort and re-run Must restart tree growth
incrementally?
🏭 Real-Life: MST Algorithms
Telecommunications Network Design: A telecom company wants to lay fiber optic cables to
connect 50 cities with minimum total cable length. This is exactly MST! Kruskal's is used
here. Water Pipeline Network: Connecting villages with water pipes at minimum
construction cost. Circuit Board Design: Minimizing total wire length on a PCB to connect
components — MST gives the minimum wiring. Clustering Algorithm: MST is used in data
clustering — remove the K-1 most expensive MST edges to get K clusters.
G4. Ford-Fulkerson vs Edmonds-Karp — When to use
which?
Situation Ford-Fulkerson Edmonds-Karp
Edge capacities are OK — terminates in |f*| steps OK — always terminates
integers (known upper
bound)
Edge capacities are DANGEROUS — may not SAFE — always O(VE²)
irrational numbers terminate!
Need polynomial time NO — depends on max flow YES — always O(VE²)
guarantee value
Simple implementation Simpler (just DFS for path) Slightly more complex (BFS)
needed
Large capacity values SLOW — O(E × BETTER — independent of
max_capacity) capacity
🏭 Real-Life: Flow Networks
Internet Traffic Routing: Data centers route network packets between servers. Max flow =
maximum data throughput. Telecom companies use flow algorithms to maximize bandwidth.
Supply Chain / Logistics: A company ships goods from warehouses (sources) to stores
(sinks) through distribution centers. Max flow = maximum units shippable per day. Bipartite
Matching via Flow: Matching job applicants to job openings: source → each applicant →
each matching job → sink. Max flow = maximum number of hires. Hospital Bed Allocation:
Matching patients to available hospital beds in different wards.
G5. Topological Sort — When to use?
Use
Situation Topological Real Example
Sort?
Task scheduling with YES Build systems: compile file A before file B
prerequisites
Course prerequisite YES University curriculum planning
ordering
Graph HAS a cycle NO — If A depends on B and B depends on A → no valid
impossible! order
Undirected graph NO — only for Road networks (bidirectional) don't have
DAGs topological order
Detecting circular YES — cycle Python import system detecting circular imports
dependencies detection
🏭 Real-Life: Topological Sort
Build Systems (Make, Gradle, Maven): When you compile a large software project, files
must be compiled in order (header files before source files, base classes before derived
classes). Topological sort determines the build order. Spreadsheet Calculation: In Excel, if
cell C3 depends on B2 which depends on A1, the cells must be recalculated in topological
order. Package Managers (npm, pip, apt): Before installing a package, all its dependencies
must be installed first. Topological sort determines the installation order.
MATRIX ALGORITHMS: When to Use Which?
M1. Standard Matrix Multiply vs Strassen's
Situation Standard O(n³) Strassen's O(n^2.807)
Matrix size n < 64 BETTER — lower NOT WORTH IT — setup cost too high
overhead
Matrix size n > 512 SLOWER BETTER — speedup becomes
significant
Numerical stability needed BETTER — SLIGHTLY WORSE — more additions,
straightforward arithmetic potential rounding errors
Simple implementation YES — 3 nested loops NO — 7 magic products, complex
needed
Teaching / understanding Always use standard first Use Strassen to show D&C beats brute
force
🏭 Real-Life: Matrix Multiplication
Machine Learning / Deep Learning: Training neural networks is essentially massive matrix
multiplication (weight matrices × input). Libraries like NumPy, TensorFlow, PyTorch use
highly optimized BLAS routines that incorporate Strassen-like tricks for large matrices.
Computer Graphics: 3D transformations (rotation, scaling, projection) are matrix
multiplications. Game engines (Unity, Unreal) do billions of 4×4 matrix multiplies per second
— standard multiply, highly optimized with SIMD instructions. Scientific Computing: Physics
simulations, weather forecasting, and structural engineering use large matrix operations
where Strassen variants save significant time.
M2. LUP Decomposition — When and Why?
Situation Use LUP? Alternative
Solve Ax = b for MULTIPLE BEST CHOICE — Gaussian elimination — must
different b values decompose once, solve redo O(n³) each time
many times O(n²) each
Compute matrix INVERSE YES — A⁻¹ via LUP Gauss-Jordan — same
complexity but LUP is
numerically better
Compute DETERMINANT YES — det(A) = product of Cofactor expansion — O(n!) —
U's diagonal never use for large n
Singular matrix (det = 0) DETECTS IT — pivot = 0 Gaussian elimination may fail
silently
Floating point precision YES — partial pivoting Without pivoting, rounding errors
needed prevents blow-up accumulate
🏭 Real-Life: LUP Decomposition
Engineering Simulations (FEM): Finite Element Analysis for structural engineering (bridges,
buildings) solves systems of thousands of equations (Ax = b). LUP decomposes A once,
then solves for different load conditions (different b vectors) cheaply. Economics / Linear
Programming: Solving economic models, input-output tables. The same coefficient matrix A
is reused with different right-hand sides. Computer Graphics — Transformation Matrices:
Inverting transformation matrices for ray casting in 3D rendering.
MATCHING & DP: Which Algorithm for Which Problem?
MA1. Bipartite Matching vs General Matching (Blossom)
Graph Type Algorithm Why
Bipartite graph (2 sides, Augmenting path via No odd cycles possible. Simple and
edges only between sides) BFS/DFS — O(VE) fast.
General graph (any structure, Edmond's Blossom — Odd cycles (blossoms) exist. Need
may have odd cycles) O(V³) contraction trick.
Bipartite + weights (maximize Hungarian Algorithm — Weighted bipartite matching
total weight) O(V³)
🏭 Real-Life: Graph Matching
Tinder / Dating Apps: Matching users based on mutual likes is a bipartite matching problem
(men ↔ women, or any two groups). Maximum matching = maximum possible mutual
matches. Hospital Residency Matching: In India/US, medical graduates are matched to
hospital residency programs. This is a classic stable matching problem solved by matching
algorithms. Ride Sharing (Uber/Ola): Matching available drivers (set A) to ride requests (set
B) to maximize number of matches — bipartite matching. Chess Pairing: In a chess
tournament, pairing players of similar ratings without rematches uses general graph
matching (Blossom algorithm).
DP1. When to use Dynamic Programming?
Problem Type DP Applicable? Classic Example
Optimization: find YES if optimal Shortest path, Knapsack, Matrix Chain
min/max value substructure exists
Counting: how many YES if overlapping Coin change (number of ways), staircase
ways? subproblems problem
Decision: is it possible? YES Can we achieve sum S? Is string X a
subsequence?
Greedy works (matroid NO — use greedy MST, Activity selection
structure) instead
Independent NO — use Divide & Binary search, Merge Sort
subproblems (no Conquer
overlap)
Exponential state space MAYBE — check if Traveling Salesman (DP with bitmask)
states can be
memoized
🏭 Real-Life: Dynamic Programming
Autocorrect / Spell Check: Finding the closest word to a misspelled word uses Edit Distance
(Levenshtein distance) — a DP problem. Your phone's autocorrect solves this hundreds of
times per keystroke. DNA Sequence Alignment (Bioinformatics): Comparing two DNA
strands to find similarities uses LCS / sequence alignment — DP. Used in cancer research,
evolutionary biology. Google's PageRank: The iterative computation of page importance
can be viewed as a DP on the web graph. Financial Options Pricing (Black-Scholes): Option
pricing models use DP (binomial trees) to compute fair prices of financial derivatives. Video
Compression (H.264): Deciding which frames to store fully vs. as differences from previous
frames is a DP optimization problem.
MASTER DECISION TABLE — One-Stop Reference
Given a problem, use this table to immediately identify the right algorithm:
Problem Description Best Algorithm Key Reason
Sort small array (n < 20) Insertion Sort Low overhead, cache-friendly
Sort large array, fastest in Quick Sort (random Best cache behavior, O(n log n) avg
practice pivot)
Sort and need stable + Merge Sort Always O(n log n), stable
guaranteed O(n log n)
Sort with minimal memory Selection Sort Only O(n) swaps
writes
Sort with O(1) space + O(n Heap Sort Both constraints met
log n) guarantee
Shortest path in unweighted BFS Finds fewest hops, O(V+E)
graph
Shortest path, weighted, non- Dijkstra O((V+E) log V), correct for non-neg
negative, single source weights
Shortest path, ALL pairs of Floyd-Warshall O(V³), simple DP on all pairs
vertices
Minimum Spanning Tree, Kruskal's O(E log E), sort + Union-Find
sparse graph
Minimum Spanning Tree, Prim's O(V²) or O(E + V log V) with heap
dense graph
Maximum flow in a network Edmonds-Karp (BFS- Polynomial time O(VE²)
based Ford-Fulkerson)
Maximum matching in Augmenting path / Simpler than Blossom, bipartite has
bipartite graph Hopcroft-Karp no odd cycles
Maximum matching in Edmond's Blossom Handles odd cycles via contraction
general graph Algorithm
Task scheduling with Topological Sort (Kahn's Detects cycles too
dependencies BFS)
Find strongly connected Kosaraju's 2-pass DFS O(V+E), elegant two-pass approach
components
Optimal substructure + Dynamic Programming Cache subproblem results
overlapping subproblems
Locally optimal = globally Greedy Algorithm Matroid structure guarantees
optimal (matroid) optimality
Multiply large matrices Strassen's Algorithm (n > O(n^2.807) beats O(n³)
efficiently 512)
Solve system of linear LUP Decomposition Numerically stable, reusable
equations decomposition
Find unique x satisfying Chinese Remainder Unique solution mod M
modular equations Theorem
Find polynomial through Lagrange / Newton Unique degree-n polynomial
given points Interpolation
🎓 Final Exam Strategy
When you see a problem in the exam, ask these questions in order: 1. Is it a GRAPH
problem? → What type: unweighted/weighted/directed/undirected? 2. Is it SORTING? →
What constraints: stability, memory, data size, already sorted? 3. Is it OPTIMIZATION? →
Can I use greedy (check matroid) or do I need DP? 4. Is it FLOW? → Single source/sink?
Need max flow or max matching? 5. Is it LINEAR ALGEBRA? → Need to solve equations or
multiply matrices? Always write: Algorithm name → Time complexity → Key condition why it
works here.