Algorithm Analysis and Complexity Basics
Algorithm Analysis and Complexity Basics
ALGORITHM
Advantages of Algorithms:
• It is easy to understand.
• Algorithm is a step-wise representation of a solution to a given problem.
• In Algorithm the problem is broken down into smaller pieces or steps hence, it
is easier for the programmer to convert it into an actual program.
Disadvantages of Algorithms:
We write algorithms in a step-by-step manner, but it is not always the case. Algorithm
writing is a process and is executed after the problem domain is well-defined. That is, we
should know the problem domain, for which we are designing a solution.
Example
Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Hence, many solution algorithms can be derived for a given problem. The next step is to
analyze those proposed solution algorithms and implement the best suitable solution
ALGORITHM COMPLEXITY
Suppose X is an algorithm and n is the size of input data, the time and space used by the
algorithm X are the two main factors, which decide the efficiency of X.
• Time Factor − Time is measured by counting the number of key operations such
as comparisons in the sorting algorithm.
• Space Factor − Space is measured by counting the maximum memory space
required by the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space
required by the algorithm in terms of n as the size of input data
ALGORITHM ANALYSIS
Efficiency of an algorithm can be analyzed at two different stages, before implementation
and after implementation. They are the following –
• A Priori Analysis or Performance or Asymptotic Analysis − This is a theoretical
analysis of an algorithm. Efficiency of an algorithm is measured by assuming that
all other factors, for example, processor speed, are constant and have no effect on
the implementation.
• A Posterior Analysis or Performance Measurement − This is an empirical analysis
of an algorithm. The selected algorithm is implemented using programming language.
This is then executed on target computer machine. In this analysis, actual statistics
like running time and space required, are collected.
Algorithm analysis framework involves finding out the time taken and the memory space
required by a program to execute the program. It also determines how the input size of a
program influences the running time of the program.
Algorithms are widely used in various areas of study. We can solve different problems
using the same algorithm. Therefore, all algorithms must follow a standard. The
mathematical notations use symbols or symbolic expressions, which have a precise
semantic meaning.
Asymptotic Notations
A problem may have various algorithmic solutions. In order to choose the best algorithm
for a particular process, you must be able to judge the time taken to run a particular solution.
More accurately, you must be able to judge the time taken to run two solutions, and choose
the better among the two.
To select the best algorithm, it is necessary to check the efficiency of each algorithm. The
efficiency of each algorithm can be checked by computing its time complexity. The
asymptotic notations help to represent the time complexity in a shorthand way. It can
generally be represented as the fastest possible, slowest possible or average possible.
The notations such as O (Big-O), Ώ (Omega), and θ (Theta) are called as asymptotic
notations. These are the mathematical notations that are used in three different cases of
time complexity.
Big-O Notation
‘O’ is the representation for Big-O notation. Big -O is the method used to express the upper
bound of the running time of an algorithm. It is used to describe the performance or time
complexity of the algorithm. Big-O specifically describes the worst-case scenario and can
be used to describe the execution time required or the space used by the algorithm.
Table 2.1 gives some names and examples of the common orders used to
describe functions. These orders are ranked from top to bottom.
f(n) ≤ c ∗ g(n)
where, n can be any number of inputs or outputs and f(n) as well as g(n) are two non-negative
functions. These functions are true only if there is a constant c and a non-negative integer n0
such that,
n ≥ n0.
The Big-O can also be denoted as f(n) = O(g(n)), where f(n) and g(n) are two non -negative
functions and f(n) < g(n) if g(n) is multiple of some constant c. The graphical representation
of f(n) = O(g(n)) is shown in figure 2.1, where the running time increases considerably when
n increases.
The values of n for f(n) and C* g(n) will not be less than n0. Therefore, the
values less than n0 are not considered relevant.
Thus, when n is greater than 2, we get f(n)<g(n). In other words, as n becomes larger,
the running time increases considerably. This concludes that the Big-O helps to
determine the ‘upper bound’ of the algorithm’s run-time.
Limitations of Big O Notation
There are certain limitations with the Big O notation of expressing the complexity of
algorithms. These limitations are as follows:
• Many algorithms are simply too hard to analyse mathematically.
• There may not be sufficient information to calculate the behaviour of the
algorithm in the average case.
• Big O analysis only tells us how the algorithm grows with the size of the problem,
not how efficient it is, as it does not consider the programming effort.
It ignores important constants. For example, if one algorithm takes O(n2 ) time to execute and the other
takes O(100000n2 ) time to execute, then as per Big O, both algorithm have equal time complexity. In
real-time systems, this may be a serious consideration
Omega Notation
‘Ω’ is the representation for Omega notation. Omega describes the manner in which an
algorithm performs in the best case time complexity. This notation provides the minimum
amount of time taken by an algorithm to compute a problem. Thus, it is considered that
omega gives the "lower bound" of the algorithm's run-time. Omega is defined as:
f(n) ≥ c ∗ g(n)
Where, n is any number of inputs or outputs and f(n) and g(n) are two non-negative
functions. These functions are true only if there is a constant c and a non-negative integer
n0 such that n>n0.
Example:
Consider function f(n) = 2n2+5 and g(n) = 7n.
We need to find the constant c such that f(n) ≥ c ∗ g(n).
Let n = 0, then
f(n) = 2n2+5 = 2(0)2+5 = 5
g(n) = 7(n) = 7(0) = 0
Here, f(n)>g(n)
Let n = 1, then
Thus, for n=1, we get f(n) ≥ c ∗ g(n). This concludes that Omega helps to determine the "lower bound"
of the algorithm's run-time.
Theta Notation
'θ' is the representation for Theta notation. Theta notation is used when the upper
bound and lower bound of an algorithm are in the same order of magnitude.
Where, n is any number of inputs or outputs and f(n) and g(n) are two
non- negative functions. These functions are true only if there are two
constants namely, c1, c2, and a non-negative integer n0.
Example: Consider function f(n) = 4n + 3 and g(n) = 4n for all n ≥ 3; and f(n) = 4n
+ 3 and g(n) = 5n for all n ≥ 3.
f(n) = 4n + 3 = 4(3)+3 = 15
Binary Search Tree is a data structure used in computer science for organizing and storing data in
a sorted manner. Binary search tree follows all properties of binary tree and for every nodes,
its left subtree contains values less than the node and the right subtree contains values greater than
the node. This hierarchical structure allows for efficient Searching, Insertion,
and Deletion operations on the data stored in the tree.
• A Binary Search Tree is useful for maintaining sorted stream of data. It allows search, insert,
Applications of BST
Binary Search Tree (BST) is a data structure that is commonly used to implement efficient
searching, insertion, and deletion operations along with maintaining sorted sequence of data. The
following properties are:
• The left subtree of a node contains only nodes with keys lesser than the node's key.
• The right subtree of a node contains only nodes with keys greater than the node's key.
• The left and right subtree each must also be a binary search tree. There must be no duplicate
nodes.
• Efficient searching: O(log n) time complexity for searching with a self balancing BST
• Ordered structure: Elements are stored in sorted order, making it easy to find the next or
previous element
• Dynamic insertion and deletion: Elements can be added or removed efficiently
• Balanced structure: Balanced BSTs maintain a logarithmic height, ensuring efficient
operations
• Doubly Ended Priority Queue: In BSTs, we can maintain both maximum and minimum
efficiently
Definition:
Insertion in a Binary Search Tree is the process of adding a new node at the correct position so that
the BST properties are maintained.
BST Property:
• Left subtree contains values less than the node.
• Right subtree contains values greater than the node.
Syntax:
//Insert a node
struct Node* insert(struct Node* root, int key)
{
if (root == NULL)
return createNode(key);
if (key < root->key)
root->left = insert(root->left, key);
else if (key > root->key)
root->right = insert(root->right, key);
return root;
}
Definition:
Deletion in a BST means removing a node from the tree while maintaining the BST properties.
Three cases when deleting a node:
1. Leaf Node (no children): Just remove the node.
2. One Child: Replace the node with its child.
3. Two Children:
o Find the in-order successor (smallest in right subtree) or in-order predecessor
(largest in left subtree).
o Replace the node’s value with it.
o Delete the successor/predecessor node.
Syntax:
// Delete a node
struct Node* deleteNode(struct Node* root, int key) {
if (root == NULL)
return root;
A BST, The task is to delete a node in this BST, which can be broken down into 3 scenarios:
Case 1. Delete a Leaf Node in BST
Deleting a single child node is also simple in BST. Copy the child to the node and delete the
node.
Deleting a node with both children is not so simple. Here we have to delete the node is such a way,
that the resulting tree follows the properties of a BST.
The trick is to find the inorder successor of the node. Copy contents of the inorder successor to the
node, and delete the inorder successor.
Given a BST, the task is to search a node in this BST. For searching a value in BST, consider it as
a sorted array. Now we can easily perform search operation in BST using Binary Search Algorithm.
Definition:
Search in a BST is used to find whether a given key exists in the tree.
How it works:
• Start at the root.
• Compare the key with the current node:
o If equal → found.
o If smaller → search left subtree.
o If larger → search right subtree.
• If NULL is reached, the key does not exist.
Syntax:
// Search a node
struct Node* search(struct Node* root, int key) {
if (root == NULL || root->key == key)
return root;
if (key < root->key)
return search(root->left, key);
else
return search(root->right, key);
}
Output: True
Explanation: 8 is present in the BST as right child of root
The importance of AVL trees in data structures is that they ensure operations like search, insert, and
delete are always fast, even if we do them many times. This makes AVL trees very useful in
applications like databases and memory management, where we need to quickly find or update
information.
Example:
Consider the following AVL tree:
Balancing Factors:
• For node 10: Left height = 0, Right height = 0, BF = 0 - 0 = 0
• For node 20: Left height = 1, Right height = 0, BF = 1 - 0 = 1
• For node 50: Left height = 0, Right height = 0, BF = 0 - 0 = 0
• For node 40: Left height = 0, Right height = 1, BF = 0 - 1 = -1
Example:
Insert keys 10, 20, 30 into an AVL tree.
• Insert 10:
• Insert 20:
After rotation:
2. Deletion
Deletion in an AVL tree involves removing a node and then ensuring the tree remains balanced.
After deleting a node, the balance factor of each node is checked, and rotations are performed if
necessary to maintain the AVL property.
Algorithm:
• Perform a standard BST deletion.
• Update the height of the current node.
• Calculate the balance factor of the current node.
• Perform rotations if the node becomes unbalanced.
Syntax: // Deletion
free(temp);
} else {
struct Node* temp = minValueNode(root->right);
root->key = temp->key;
root->right = deleteNode(root->right, temp->key);
}
}
if (root == NULL)
return root;
root->height = 1 + max(height(root->left), height(root->right));
int balance = getBalance(root);
Example:
Delete key 20 from the AVL tree:
Before rotation:
Before rotation:
Before rotation:
Before rotation:
B Trees: B trees are extended binary search trees that are specialized in m-way searching, since the order of B trees
is 'm'. Order of a tree is defined as the maximum number of children a node can accommodate. Therefore, the height
of a b tree is relatively smaller than the height of AVL tree and RB tree.
They are general form of a Binary Search Tree as it holds more than one key and two children.
The various properties of B trees include –
• Every node in a B Tree will hold a maximum of m children and (m-1) keys, since the order of the
tree is m.
Properties of a B-Tree
A B Tree of order m can be defined as an m-way search tree which satisfies the following properties:
1. All leaf nodes of a B tree are at the same level, i.e. they have the same depth (height of the
tree).
2. The keys of each node of a B tree (in case of multiple keys), should be stored in the
ascending order.
3. In a B tree, all non-leaf nodes (except root node) should have at least m/2 children.
4. All nodes (except root node) should have at least m/2 - 1 keys.
5. If the root node is a leaf node (only node in the tree), then it will have no children and will
have at least one key. If the root node is a non-leaf node, then it will have at least 2
children and at least one key.
6. A non-leaf node with n-1 key values should have n non NULL children.
We can see in the above diagram that all the leaf nodes are at the same level and all non-leaf nodes have
no empty sub-tree and have number of keys one less than the number of their children.
Interesting Facts about B-Tree
• The minimum height of the B-Tree that can exist with n number of nodes and m is the
maximum number of children of a node can have is: hmin=⌈logm(n+1)⌉−1 hmin
=⌈logm(n+1)⌉−1
• The maximum height of the B-Tree that can exist with n number of nodes and t is the
The insert() operation in a B-Tree. A new key is always inserted into a leaf node. To insert a key k,
we start from the root and traverse down the tree until we reach the appropriate leaf node. Once there, the
key is added to the leaf.
Unlike Binary Search Trees (BSTs), nodes in a B-Tree have a predefined range for the number of keys
they can hold. Therefore, before inserting a key, we ensure the node has enough space. If the node is full,
an operation called splitChild() is performed to create space by splitting the node.
Insertion Operation
To insert a new key, we go down from root to leaf. Before traversing down to a node, we first check if the
node is full. If the node is full, we split it to create space. Following is the complete algorithm.
Insertion Algorithm
1: procedure B-Tree-Insert (Node x, Key k)
2: find i such that x:keys[i] > k or i >=numkeys(x)
3: if x is a leaf then
4: Insert k into [Link] at i
5: else
6: if x:child[i] is full then
7: Split x:child[i]
8: if k > x:key[i] then
9: i = i + 1
10: end if
11: end if
12: B-Tree-Insert(x:child[i]; k)
Example
The deletion procedure deletes the key k from the subtree rooted at x. This procedure guarantees that
whenever it calls itself recursively on a node x, the number of keys in x is at least the minimum degree t.
Note that this condition requires one more key than the minimum required by the usual B-tree conditions,
so sometimes a key may have to be moved into a child node before recursion descends to that child. This
strengthened condition allows us to delete a key from the tree in one downward pass without having to
"back up" (with one exception, which we’ll explain). You should interpret the following specification for
deletion from a B-tree with the understanding that if the root node x ever becomes an internal node having
no keys (this situation can occur in cases 2c and 3b then we delete x, and x’s only child x.c1 becomes the
new root of the tree, decreasing the height of the tree by one and preserving the property that the root of
the tree contains at least one key (unless the tree is empty).
Various Cases of Deletion
Case 1: If the key k is in node x and x is a leaf, delete the key k from x.
Case 2: If the key k is in node x and x is an internal node, do the following.
• If the child y that precedes k in node x has at least t keys, then find the predecessor k0 of k
in the sub-tree rooted at y. Recursively delete k0, and replace k with k0 in x. (We can find
k0 and delete it in a single downward pass.)
• If y has fewer than t keys, then, symmetrically, examine the child z that follows k in node x.
If z has at least t keys, then find the successor k0 of k in the subtree rooted at z. Recursively
delete k0, and replace k with k0 in x. (We can find k0 and delete it in a single downward
pass.)
• Otherwise, if both y and z have only t-1 keys, merge k and all of z into y, so that x loses both
k and the pointer to z, and y now contains 2t-1 keys. Then free z and recursively delete k
from y.
Case 3: If the key k is not present in internal node x, determine the root x.c(i) of the appropriate subtree
that must contain k, if k is in the tree at all. If x.c(i) has only t-1 keys, execute steps 3a or 3b as
necessary to guarantee that we descend to a node containing at least t keys. Then finish by recursing on
the appropriate child of x.
• If x.c(i) has only t-1 keys but has an immediate sibling with at least t keys, give x.c(i) an
extra key by moving a key from x down into x.c(i), moving a key from x.c(i) ’s immediate
left or right sibling up into x, and moving the appropriate child pointer from the sibling into
x.c(i).
• If x.c(i) and both of x.c(i)'s immediate siblings have t-1 keys, merge x.c(i) with one sibling,
which involves moving a key from x down into the new merged node to become the median
key for that node.
Since most of the keys in a B-tree are in the leaves, deletion operations are most often used to delete
keys from leaves. The recursive delete procedure then acts in one downward pass through the tree,
without having to back up. When deleting a key in an internal node, however, the procedure makes a
1
ADSA UNIT-II JSR
Representation of Heap Tree:
An array can be used to simulate a tree in the following way. If we are storing one element at
index i in array Ar, then its parent will be stored at index i/2 (unless its a root, as root has no
parent) and can be accessed by Ar[ i/2 ], and its left child can be accessed by Ar[ 2 * i ] and its
right child can be accessed by Ar[ 2 * i +1 ]. The index of root will be 1 in an array.
Heap Operations
1. Insertion
● Insert the new element at the end of the heap (bottom-most, right-most position).
● Perform heapify-up (bubble-up):
○ Compared with parent.
○ If heap property is violated → swap with parent.
○ Repeat until the root or property is satisfied.
3. Heapify
● Operation to fix heap property in a subtree.
● Compare node with its children:
○ If property is violated, swap with the appropriate child.
○ Recursively apply to the affected subtree.
● Used when building a heap or after deletion.
Types of Heaps
There are two main types of heaps:
[Link] Heap: The root node contains the maximum value, and the values
decrease as you move down the tree.
● In other words, the max heap can be defined as for every node i; the value
of node i is less than or equal to its parent value except the root node.
Mathematically, it can be defined as:
● A[Parent(i)] >= A[i]
2
ADSA UNIT-II JSR
The above tree is a max heap tree as it satisfies the property of the max heap. Now,
let's see the array representation of the max heap.
[Link] Heap: The root node contains the minimum value, and the values increase
as you move down the tree
● In other words, the min-heap can be defined as, for every node i, the value
of node i is greater than or equal to its parent value except the root node.
Mathematically, it can be defined as:
A[Parent(i)] <= A[i]
Let's understand the min-heap through an example.
● In the above figure, 11 is the root node, and the value of the root node is
less than the value of all the other nodes (left child or a right child).
Max Heap Construction Algorithm
We shall use the same example to demonstrate how a Max Heap is created. The procedure to
create Min Heap is similar but we go for min values instead of max values.
We are going to derive an algorithm for max heap by inserting one element at a time. At any
point of time, heap must maintain its property. While insertion, we also assume that we are
inserting a node in an already heapified tree.
Step 1 − Create a new node at the end of heap.
Step 2 − Assign new value to the node.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
Note − In Min Heap construction algorithm, we expect the value of the parent node
to be less than that of the child node.
Let’s understand the max heap through an example.
In the above figure, 55 is the parent node and it is greater than both of its child, and
11 is the parent of 9 and 8, so 11 is also greater than from both of its child.
Therefore, we can say that the above tree is a max heap tree.
Insertion in the Heap tree 44,33, 77, 11, 55, 88, 66
Suppose we want to create the max heap tree. To create the max heap tree, we need
to consider the following two cases:
• First, we have to insert the element in such a way that the property of the
complete binary tree must be maintained.
• Secondly, the value of the parent node should be greater than the either of
its child.
3
ADSA UNIT-II JSR
Step 1: First we add the 44 element in the tree as shown below:
Step 2: The next element is 33. As we know that insertion in the binary tree always starts from the
Step 3: The next element is 77 and it will be added to the right of the 44 as shown below:
As we can observe in the above tree that it does not satisfy the max heap property, i.e., parent node
44 is less than the child 77. So, we will swap these two values as shown below:
Step 4: The next element is 11. The node 11 is added to the left of 33 as shown below:
4
ADSA UNIT-II JSR
Step 5: The next element is 55. To make it a complete binary tree, we will add the node 55 to the
right of 33 as shown below:
As we can observe in the above figure that it does not satisfy the property of the max heap because
33<55, so we will swap these two values as shown below:
Step 6: The next element is 88. The left subtree is completed so we will add 88 to the left of 44 as
shown below:
As we can observe in the above figure that it does not satisfy the property of the max heap because
44<88, so we will swap these two values as shown below:
Again, it is violating the max heap property because 88>77 so we will swap these two values as
shown below:
Step 7: The next element is 66. To make a complete binary tree, we will add the 66 element to the
right side of 77 as shown below:
5
ADSA UNIT-II JSR
Max Heap Deletion Algorithm
Let us derive an algorithm to delete from max heap. Deletion in Max (or Min) Heap always happens at the root
to remove the Maximum (or minimum) value.
Step 1 − Remove root node.
Step 2 − Move the last element of the last level to root.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If the value of the parent is less than the child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
Array representation:
88, 55, 77, 11, 33, 44, 66
Tree view:
88
/ \
55 77
/ \ /\
11 33 44 66
6
ADSA UNIT-II JSR
Tree:
77
/ \
55 66
/ \ /
11 33 44
● Now, 66 is at index 2 (value = 66)
This walkthrough shows all steps (placement + every swap) while building a Min Heap by inserting the keys
[20, 15, 30, 10, 8, 25], then
Start
1. (empty)
2. Insert 20
20
3. Insert 15
20
/
15
Compare & swap with parent (15 < 20)
7
ADSA UNIT-II JSR
Array (after swap):** `[15, 20]`
15
/
20
4. Insert 30
15
/ \
20 30
5. Insert 10
15
/ \
20 30
/
10
Step 1 – swap with parent (10 < 20)
Array: [15, 10, 30, 20]
15
/ \
10 30
/
20
Step 2 – swap with parent (10 < 15)
Array (final after insert 10):
[10, 15, 30, 20]
10
/ \
15 30
/
20
6. Insert 8
10
/ \
15 30
/ \
20 8
Step 1 – swap with parent (8 < 15)
Array:[10, 8, 30, 20, 15]
10
/ \
8 30
/ \
20 15
Step 2 – swap with parent (8 < 10)
Array (final after insert 8):[8, 10, 30, 20, 15]
8
/ \
10 30
/ \
20 15
[Link] 25
8
ADSA UNIT-II JSR
8
/ \
10 30
/ \ /
20 15 25
Step 1 – swap with parent (25 < 30)
Array (final after insert 25):[8, 10, 25, 20, 15, 30]
8
/ \
10 25
/ \ /
20 15 30
Final Min Heap after all insertions
Steps:
1. Remove root element (minimum element).
Step 1 – Compare 30 with children (10, 25) → swap with smaller (10)
Array:[10, 30, 25, 20, 15]
10
/ \
30 25
/ \
20 15
9
ADSA UNIT-II JSR
Step 2 – Compare 30 with children (20, 15) → swap with smaller (15)
Array (final after delete-min): [10, 15, 25, 20, 30]
10
/ \
15 25
/ \
20 30
Applications
Heap Data Structure is generally taught with Heapsort. Heap sort algorithm has limited uses because Quicksort is
better in practice. Nevertheless, the Heap data structure itself is enormously used.
1. Priority Queues: Heaps are commonly used to implement priority queues, where elements with higher
priority are extracted first. This is useful in many applications such as scheduling tasks, handling
interruptions, and processing events.
2. Sorting Algorithms: Heapsort, a comparison-based sorting algorithm, is implemented using the Heap
data structure. It has a time complexity of O(n log n), making it efficient for large datasets.
3. Graph algorithms: Heaps are used in graph algorithms such as Prim's Algorithm, Dijkstra's algorithm., and
the A* search algorithm.
4. Lossless Compression: Heaps are used in data compression algorithms such as Huffman coding, which
uses a priority queue implemented as a min-heap to build a Huffman tree.
5. Medical Applications: In medical applications, heaps are used to store and manage patient information based
on priority, such as vital signs, treatments, and test results.
6. Load balancing: Heaps are used in load balancing algorithms to distribute tasks or requests to servers,
by processing elements with the lowest load first.
7. Order statistics: The Heap data structure can be used to efficiently find the kth smallest (or largest) element in
an array. See method 4 and 6 of this post for details.
8. Resource allocation: Heaps can be used to efficiently allocate resources in a system, such as memory blocks
or CPU time, by assigning a priority to each resource and processing requests in order of priority.
9. Job scheduling: The heap data structure is used in job scheduling algorithms, where tasks are scheduled based
on their priority or deadline. The heap data structure allows efficient access to the highest-priority task, making
it a useful data structure for job scheduling applications
GRAPHS
Introduction to Graphs
Graph is a non linear data structure; A map is a well-known example of a graph. In a map various connections are
made between the cities. The cities are connected via roads, railway lines and aerial network. We can assume that
the graph is the interconnection of cities by roads. Euler used graph theory to solve Seven Bridges of Königsberg
problem. Is there a possible way to traverse every bridge exactly once – Euler Tour
10
ADSA UNIT-II JSR
as all four vertices are of odd degree.
A graph contains a set of points known as nodes (or vertices) and set of links known as edges (or Arcs) which
connects the vertices.
Definition:
A graph is defined as Graph is a collection of vertices and arcs which connects vertices in the graph. A graph G is
represented as G = ( V , E ), where V is set of vertices and E is set of edges.
Example: graph G can be defined as G = ( V , E ) Where V = {A,B,C,D,E} and
E = {(A,B),(A,C)(A,D),(B,D),(C,D),(B,E),(E,D)}. This is a graph with 5 vertices and 6 edges.
Graph Terminology
[Link] : An individual data element of a graph is called as Vertex. Vertex is also known as node. In
above example graph, A, B, C, D & E are known as vertices.
[Link] : An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is represented
as (starting Vertex, ending Vertex).
In above graph, the link between vertices A and B is represented as (A,B).
Edges are three types:
[Link] Edge - An undirected edge is a bidirectional edge. If there is an undirected edge between vertices A
and B then edge (A , B) is equal to edge (B , A).
[Link] Edge - A directed edge is a unidirectional edge. If there is a directed edge between vertices A and B
then edge (A , B) is not equal to edge (B , A).
[Link] Graph
A graph with only directed edges is said to be directed graph.
11
ADSA UNIT-II JSR
[Link] Graph
A graph in which any V node is adjacent to all other nodes present in the graph is known as a complete graph. An
undirected graph contains the edges that are equal to edges = n(n-1)/2 where n is the number of vertices present in
the graph. The following figure shows a complete graph.
[Link] Graph
Regular graph is the graph in which nodes are adjacent to each other, i.e., each node is accessible from
any other node.
[Link] Graph
A graph having cycle is called cycle graph. In this case the first and last nodes are the same. A closed simple path
is a cycle.
12
ADSA UNIT-II JSR
[Link] Graph
A graph without cycle is called acyclic graphs.
7. Weighted Graph
A graph is said to be weighted if there are some non negative value assigned to each edges of the graph. The
value is equal to the length between two vertices. Weighted graph is also called a network.
Outgoing Edge
A directed edge is said to be outgoing edge on its orign vertex.
Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
Parallel edges or Multiple edges
If there are two undirected edges to have the same end vertices, and for two directed edges to have the same
origin and the same destination. Such edges are called parallel edges or multiple edges.
Self-loop
An edge (undirected or directed) is a self-loop if its two endpoints coincide.
Simple Graph
A graph is said to be simple if there are no parallel and self-loop edges.
Adjacent nodes
When there is an edge from one node to another then these nodes are called adjacent nodes.
Incidence
In an undirected graph the edge between v1 and v2 is incident on node v1 and v2.
13
ADSA UNIT-II JSR
Walk
A walk is defined as a finite alternating sequence of vertices and edges, beginning and ending with vertices, such
that each edge is incident with the vertices preceding and following it.
Closed walk
A walk which is to begin and end at the same vertex is called close walk. Otherwise it is an open walk.
If e1,e2,e3,and e4 be the edges of pair of vertices (v1,v2),(v2,v4),(v4,v3) and (v3,v1) respectively ,then v1 e1 v2
e2 v4 e3 v3 e4 v1 be its closed walk or circuit.
Path
A open walk in which no vertex appears more than once is called a path.
If e1 and e2 be the two edges between the pair of vertices (v1,v3) and (v1,v2) respectively, then v3 e1 v1 e2 v2 be
its path.
Length of a path
The number of edges in a path is called the length of that path. In the following, the length of the path is 3.
14
ADSA UNIT-II JSR
Sub Graph
A graph S is said to be a sub graph of a graph G if all the vertices and all the edges of S are in G, and each edge
of S has the same end vertices in S as in G. A subgraph of G is a graph G’ such that V(G’) ⊆ V(G) and E(G’) ⊆
E(G)
Connected Graph
A graph G is said to be connected if there is at least one path between every pair of vertices in G. Otherwise,G is
disconnected.
15
ADSA UNIT-II JSR
In the above graph,the indegree of vertices v1, v3 is 2, indegree of vertices v2, v5 is 1 and indegree of v4 is zero.
Outdegree
The outdegree of a node (or vertex) is the number of edges going outside from that node or in other words the
ADT of Graph:
Structure Graph is
objects: a nonempty set of vertices and a set of undirected edges, where each edge is a pair of vertices
functions: for all graph ∈ Graph, v, v1 and v2 ∈ Vertices
Graph Create()::=return an empty graph
Graph InsertVertex(graph, v)::= return a graph with v inserted. v has no edge.
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge between v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges incident to it are removed
Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge (v1, v2) is removed
Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE else return FALSE
List Adjacent(graph,v)::= return a list of all vertices that are adjacent to v
Graph Representations
Graph data structure is represented using following representations
1. Adjacency Matrix
2. Adjacency List
[Link] Matrix
In this representation, graph can be represented using a matrix of size total number of vertices by total number of
vertices; means if a graph with 4 vertices can be represented using a matrix of 4X4 size.
In this matrix, rows and columns both represent vertices.
This matrix is filled with either 1 or 0. Here, 1 represents there is an edge from row vertex to column vertex and 0
represents there is no edge from row vertex to column vertex.
Adjacency Matrix : let G = (V, E) with n vertices, n ≥ 1. The adjacency matrix of G is a 2-dimensional n × n
matrix, A, A(i, j) = 1 iff (vi, vj) ∈E(G) (〈vi, vj〉 for a diagraph), A(i, j) = 0 otherwise.
example : for undirected graph
+n
16
ADSA UNIT-II JSR
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a digraph need not be
symmetric.
Merits of Adjacency Matrix:
From the adjacency matrix, to determine the connection of vertices is easy
n−1
The degree of a vertex is ∑ adj _ mat[i][ j]
j =0
For a digraph, the row sum is the out_degree, while the column sum is the in_degree
n−1 n−1
ind (vi) = A[ j, i] outd(vi) = A[i, j]
j =0 j =0
The space needed to represent a graph using adjacency matrix is n2 bits. To identify the edges in a graph,
adjacency matrices will require at least O(n2) time.
2. Adjacency List
In this representation, every vertex of graph contains list of its adjacent vertices. The n rows of the adjacency
matrix are represented as n chains. The nodes in chain I represent the vertices that are adjacent to vertex i.
It can be represented in two forms. In one form, array is used to store n vertices and chain is used to store its
adjacencies. Example:
So that we can access the adjacency list for any vertex in O(1) time. Adjlist[i] is a pointer to to first node in the
17
ADSA UNIT-II JSR
This r0130epresentation can also be implemented using array
We begin by visiting the start vertex v. Next an unvisited vertex w adjacent to v is selected, and a depth-first
search from w is initiated. When a vertex u is reached such that all its adjacent vertices have been visited, we back
up to the last vertex visited that has an unvisited vertex w adjacent to it and initiate a depth-first search from w.
The search terminates when no unvisited vertex can be reached from any of the visited vertices.
DFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph without any loops.
We use Stack data structure with maximum size of total number of vertices in the graph to implement DFS
traversal of a graph.
18
ADSA UNIT-II JSR
node_pointer w;
visited[v]= TRUE;
printf(“%d”, v);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex])
dfs(w->vertex);
}
Consider the graph G of Figure 6.16(a), which is represented by its adjacency lists as in Figure 6.16(b). If a depth-
first search is initiated from vertex 0 then the vertices of G are visited in the following order: 0 1 3 7 4 5 2 6. Since
DFS(O) visits all vertices that can be reached from 0 the vertices visited, together with all edges in G incident to
these vertices form a connected component of G.
Figure: Graph and its adjacency list representation, DFS spanning tree
Analysis or DFS:
When G is represented by its adjacency lists, the vertices w adjacent to v can be determined by following a chain
of links. Since DFS examines each node in the adjacency lists at most once and there are 2e list nodes the time to
complete the search is O(e). If G is represented by its adjacency matrix then the time to determine all vertices
adjacent to v is O(n). Since at most n vertices are visited the total time is O(n2).
19
ADSA UNIT-II JSR
20
ADSA UNIT-II JSR
Breadth-First Search
In a breadth-first search, we begin by visiting the start vertex v. Next all unvisited vertices adjacent to v are visited.
Unvisited vertices adjacent to these newly visited vertices are then visited and so on. Algorithm BFS (Program 6.2)
gives the details.
typedef struct queue *queue_pointer;
typedef struct queue {
int vertex;
queue_pointer link;
};
void addq(queue_pointer *,
queue_pointer *, int);
int deleteq(queue_pointer *);
void bfs(int v)
{
node_pointer w;
queue_pointer front, rear;
front = rear = NULL;
printf(“%d”, v);
visited[v] = TRUE;
addq(&front, &rear, v);
while (front) {
v= deleteq(&front);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex]) {
printf(“%d”, w->vertex);
addq(&front, &rear, w->vertex);
visited[w->vertex] = TRUE;
}
}
}
Steps:
BFS traversal of a graph, produces a spanning tree as final result. Spanning Tree is a graph without any loops. We
use Queue data structure with maximum size of total number of vertices in the graph to implement BFS traversal
of a graph.
21
ADSA UNIT-II JSR
22
ADSA UNIT-II JSR
Difference between DFS and BFS:
Approach It works on the concept of FIFO It works on the concept of LIFO (Last In
used (First In First Out). First Out).
23
ADSA UNIT-II JSR
For example in the graph shown below, {0, 1, 2} form a connected component and {3, 4} form
another connected component.
Biconnected Components
Example-1
Example-2
Example-3
Example-4
24
ADSA UNIT-II JSR
Example-5
Example-6
25
ADSA UNIT-II JSR
Divide And Conquer(DAC)
Divide and Conquer is an algorithmic pattern. In algorithmic methods, the design is to take a dispute
on a huge input, break the input into minor pieces, decide the problem on each of the small pieces,
and then merge the piecewise solutions into a global solution. This mechanism of solving the
problem is called the Divide & Conquer Strategy.
Divide and Conquer algorithm consists of a dispute using the following three steps.
Divide the original problem into a set of subproblems. Break down the original problem into smaller
subproblems.
● Each subproblem should represent a part of the overall problem.
● The goal is to divide the problem until no further division is possible.
Conquer: Solve every subproblem individually, recursively. Solve each of the smaller subproblems
individually.
● If a subproblem is small enough (often referred to as the “base case”), we solve it directly
without further recursion.
● The goal is to find solutions for these subproblems independently.
Combine: Put together the solutions of the subproblems to get the solution to the whole problem.
Combine the sub-problems to get the final solution of the whole problem.
● Once the smaller subproblems are solved, we recursively combine their solutions to get the
solution of larger problem.
● The goal is to formulate a solution for the original problem by merging the results from the
subproblems.
26
ADSA UNIT-II JSR
exchange sort. It starts by selecting a pivot value from an array followed by dividing the rest
of the array elements into two sub-arrays. The partition is made by comparing each of the
elements with the pivot value. It compares whether the element holds a greater value or lesser
value than the pivot and then sort the arrays recursively.
3. Merge Sort: It is a sorting algorithm that sorts an array by making comparisons. It starts by
dividing an array into sub-array and then recursively sorts each of them. After the sorting is
done, it merges them back.
27
ADSA UNIT-II JSR
● Conquering Each Subproblem: Once divided, the subproblems are solved individually.
This may involve applying the same divide and conquer approach recursively until the
subproblems become simple enough to solve directly, or it may involve applying a different
algorithm or technique.
● Combining Solutions: After solving the subproblems, their solutions are combined to obtain
the solution to the original problem. This combination step should be relatively efficient and
straightforward, as the solutions to the subproblems should be designed to fit together
seamlessly.
Divide: Rearrange the elements and split arrays into two sub-arrays and an element in between
search that each element in left sub array is less than or equal to the average element and each
element in the right sub- array is larger than the middle element.
Conquer: Recursively, sort two sub arrays.
Combine: Combine the already sorted array.
Algorithm:
QuickSort function
void quicksort(int a[], int first, int last) {
int i, j, pivot, temp;
while (i < j) {
while (a[i] <= a[pivot] && i < last)
i++;
while (a[j] > a[pivot])
j--;
if (i < j) {
temp = a[i];
a[i] = a[j];
a[j] = temp;
}
}
temp = a[pivot];
a[pivot] = a[j];
a[j] = temp;
28
ADSA UNIT-II JSR
Figure: shows the execution trace partition algorithm
And these sublists are sorted under the same process as above done. These two sorted sublists side by side
29
ADSA UNIT-II JSR
Merging Sublists:
30
ADSA UNIT-II JSR
Step 2: Partitioning:
We reorder the array so that all elements less than the pivot come before it and all elements greater
than the pivot go after it. In the images, we can see that 70 and 40 are in place because they are less
than the pivot, and we are moving 50 to the position just after the pivot because it is the next smallest
element after the pivot.
The process of selecting a pivot and partitioning the array is repeated recursively on the subarrays,
and this recursion continues until all the subarrays are sorted.
Dry Run
31
ADSA UNIT-II JSR
Step Array i j Pivot Element Subarrays
3 [10, 40, 50, 70, 90] 2 2 50 [10, 40] (50) [70, 90]
6 [10, 40] 1 0 40 -
7 [10, 40) - - 40 -
9 [70, 90] 1 0 90 -
10 [70, 90] - - 90 -
Pseudo Code
Before we code, let's outline our strategy with pseudo-code:
function quickSort(array, low, high) if
low < high
pivotIndex = partition(array, low, high)
quickSort(array, low, pivotIndex - 1) // Before pivot
quickSort(array, pivotIndex + 1, high) // After pivot
Time Complexity
The speed of an algorithm is often captured by its time complexity, which expresses the
relationship between the input size and the time required to complete the task.
● Best and Average Case: In the best and average scenarios, where the pivot element is
chosen to divide the array into proportionate halves consistently, the time complexity is O(n
log n). This is because the array is halved with each pass (log n), and this operation is
performed for each n element.
● Worst Case: The worst case occurs when the pivot selection results in one partition
containing all but one of the elements, leading to O(n^2). This situation commonly happens
when the pivot is the smallest or largest element in the array, as it would be in an already
sorted or reverse-sorted array.
32
ADSA UNIT-II JSR
Space Complexity
Space complexity refers to the amount of memory space required by an algorithm in its life cycle.
● In-Place Sorting: Quick Sort is an in-place sorting algorithm. However, it requires space
for the stack frames of the recursive calls. In the best case, this leads to a space complexity
of O(log n), representing the height of the balanced partition tree.
● Worst Case: In the worst case, with unbalanced partitions, the space complexity can degrade
to O(n), where n is the depth of the recursive call stack, one for each array element
Advantages of Quick Sort
● Efficient average-case complexity: Quick Sort has an average-case time complexity of O(n
log n), making it efficient for large datasets.
● In-place sorting: It doesn't require additional memory beyond the existing array, reducing
space complexity.
● Parallelizable: The algorithm can be easily parallelized across multiple processors or cores,
further improving its speed.
● Cache-friendly: Quick Sort works well with modern processor caches due to its locality of
reference, enhancing performance.
● No recursion depth limit: Unlike some other recursive algorithms, Quick Sort doesn't have
a bound on its recursion depth, making it suitable for very large datasets.
Disadvantages of Quick Sort
● Worst-case complexity: In the worst-case scenario, Quick Sort's time complexity can
deteriorate to O(n^2), which can occur when the chosen pivot elements are poorly selected.
● Unstable sorting: Quick Sort does not preserve the relative order of equal elements, which
might be important for specific applications.
● High constant factor: Compared to other sorting algorithms like Merge Sort, Quick Sort
might have a higher constant factor in its time complexity, leading to slightly slower
performance in some cases.
● Not ideal for small datasets: For small datasets, the overhead of partitioning and recursion
might outweigh the benefits, making simpler algorithms like Bubble Sort or Insertion Sort
more efficient.
Merge Sort
Merge sort is yet another sorting algorithm that falls under the category of Divide and Conquer
technique. It is one of the best sorting techniques that successfully build a recursive algorithm.
In this technique, we segment a problem into two halves and solve them individually. After finding
the solution of each half, we merge them back to represent the solution of the main problem.
Suppose we have an array A, such that our main concern will be to sort the subsection, which starts
at index p and ends at index r, represented by A[p..r].
Divide
If assumed q to be the central point somewhere in between p and r, then we will fragment the
subarray A[p..r] into two arrays A[p..q] and A[q+1, r].
Conquer
After splitting the arrays into two halves, the next step is to conquer. In this step, we individually
sort both of the subarrays A[p..q] and A[q+1, r]. In case if we did not reach the base situation, then
we again follow the same procedure, i.e., we further segment these subarrays followed by sorting
them separately.
33
ADSA UNIT-II JSR
Combine
As when the base step is acquired by the conquer step, we successfully get our sorted subarrays
A[p..q] and A[q+1, r], after which we merge them back to form a new sorted array [p..r].
Merge Sort algorithm
The MergeSort function keeps on splitting an array into two halves until a condition is met where
we try to perform MergeSort on a subarray of size 1, i.e., p == r.
Algorithm
Step 1: Create two pointers, one for each sorted half.
Step 2: Initialize an empty temporary array to hold the merged result.
Step 3: Compare the elements at the pointers of the two halves:
Copy the smaller element into the temporary array.
Move the pointer of the sublist with the smaller element forward.
Step 4: Repeat step 3 until one of the sublists is empty.
Step 5: Copy the remaining elements from the non-empty sublist to the temporary array.
Step 6: Copy the elements back from the temporary array to the original list. Algorithm
Algorithm:
Merge function
void merge(int arr[], int low, int mid, int high) {
int i = low, j = mid + 1, k = low;
34
ADSA UNIT-II JSR
Now, let's see the working of merge sort Algorithm.
To understand the working of the merge sort algorithm, let's take an unsorted array. It will be
easier to understand the merge sort via an example.
Let the elements of array are -
According to the merge sort, first divide the given array into two equal halves. Merge sort keeps
dividing the list into equal parts until it cannot be further divided.
As there are eight elements in the given array, so it is divided into two arrays of size 4.
Now, again divide these two arrays into halves. As they are of size 4, so divide them into new arrays
of size 2.
Now, again divide these arrays to get the atomic value that cannot be further divided.
In the next iteration of combining, now compare the arrays with two data values and merge them
into an array of found values in sorted order.
35
ADSA UNIT-II JSR
Now, there is a final merging of the arrays. After the final merging of above arrays, the array will
look like -
36
ADSA UNIT-II JSR
Hence the array is sorted.
Advantages of Merge Sort:
● Stability : Merge sort is a stable sorting algorithm, which means it maintains the relative
order of equal elements in the input array.
● Guaranteed worst-case performance: Merge sort has a worst-case time complexity of O(N
logN) , which means it performs well even on large datasets.
● Simple to implement: The divide-and-conquer approach is straightforward.
● Naturally Parallel : We independently merge subarrays that makes it suitable for parallel
processing.
Disadvantages of Merge Sort:
● Space complexity: Merge sort requires additional memory to store the merged sub- arrays
during the sorting process.
● Not in-place: Merge sort is not an in-place sorting algorithm, which means it requires
additional memory to store the sorted data. This can be a disadvantage in applications where
memory usage is a concern.
● Slower than QuickSort in general. QuickSort is more cache friendly because it works in-
place.
Complexity Analysis of Merge Sort:
● Time Complexity:
o Best Case: O(n log n), When the array is already sorted or nearly sorted.
o Average Case: O(n log n), When the array is randomly ordered.
o Worst Case: O(n log n), When the array is sorted in reverse order.
● Auxiliary Space: O(n), Additional space is required for the temporary array used during
merging.
Applications of Merge Sort:
● Sorting large datasets
● External sorting (when the dataset is too large to fit in memory)
● Inversion counting
● Merge Sort and its variations are used in library methods of programming languages. For
example its variation TimSort is used in Python, Java Android and Swift. The main reason
why it is preferred to sort non-primitive types is stability which is not there in QuickSort.
For example [Link] in Java uses QuickSort while [Link] uses MergeSort.
● It is a preferred algorithm for sorting Linked lists.
● It can be easily parallelized as we can independently sort subarrays and then merge.
● The merge function of merge sort to efficiently solve the problems like union and
intersection of two sorted arrays.
Finding Minimum And Maximum (Application Of Divide And Conquer)
Finding a maximum and minimum element from a given array is the application of the Divide and
Conquer algorithm.
There are various ways to this problem, but the most traditional approach to to solve this problem is
the linear approach. In the linear approach, we traverse all elements once and find the minimum and
maximum element.
In this approach, the time complexity to solve this problem is θ(n).
Let's see the algorithm for the linear approach,
suppose A is the array of integers and n is the number of elements in the array A.
MinMax(A, n){ int
min = A[0]; int
max = A[0];
for(int i = 1; i < n; i++){
37
ADSA UNIT-II JSR
if(max < A[i])
max = A[i]; else
if(min > A[i])
min = A[i];
}
return (min, max);
}
Time Complexity for the above algorithm is T(n) = 2(n-1) + C ≈ θ(n).
Using Divide And Conquer Approach:
As we know in the divide and conquer approach, we first divide the problem into small problems
and combine the results of these small problems to solve the main problem.
In Divide and Conquer approach:
● Step 1: Find the mid of the array.
● Step 2: Find the maximum and minimum of the left subarray recursively.
● Step 3: Find the maximum and minimum of the right subarray recursively.
● Step 4: Compare the result of step 3 and step 4
● Step 5: Return the minimum and maximum.
Let's see the algorithm for the Divide and Conquer approach,
suppose A is the array of integers and n is the number of elements in the array A.
● i = 0 (Index of first element of array)
● j = n -1 (index of last element of array)
MinMaxDAC(A, i, j) {
if(i == j)
return (A[i], A[i]); if((j -
i) == 1)
if (A[i] < A[j])
return (A[i], A[j]) else
return (A[j], A[i]) else
{
int mid = (i + j) / 2;
LMin, LMax = MinMaxDAC(A, i, mid); RMin,
RMax = MinMaxDAC(A, mid + 1, j); if(LMax >
RMax)
max = LMax; else
max = RMax; if(LMin
< RMin)
38
ADSA UNIT-II JSR
Strassen’s Matrix Multiplication
1. Standard Multiplication
● Multiplying two n×nn \times n matrices → O(n³) time.
3. Reconstructing Result
4. Time Complexity
● Recurrence:
T(n)=7T(n2)+O(n2)T(n) = 7T\left(\frac{n}{2}\right) + O(n^2)
● Using Master Theorem:
T(n)=O(nlog27)≈O(n2.81)T(n) = O(n^{\log_2 7}) \approx O(n^{2.81})
Faster than O(n3)O(n^3), especially for large n.
5. Applications
● Large-scale scientific computing
● Linear algebra problems
● Basis for faster algorithms (Coppersmith–Winograd, etc.)\
● Not efficient for small matrices (overhead cost of extra additions).
39
ADSA UNIT-II JSR
UNIT – III
Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem, Minimum
cost spanning trees, Single Source Shortest Paths
Dynamic Programming: General Method, Multi Stage graphs, All pairs shortest paths, Single
Source Shortest Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search
Trees, 0/1 Knapsack, Travelling Salesperson problem
Greedy Method
General Method
Optimization Problem: An optimization problem is the problem of finding the best solution
(optimal solution) from all the feasible solutions (practicable of possible solutions). In an
optimization problem we are given a set of constraints and an optimization function. Solutions
that satisfy the constraints are called feasible solutions. A feasible solution for which the
optimization function has the best possible value is called optimal solution.
Optimization Problem Can be divided into three types
1. Greedy Method
2. Dynamic Programming
3. Branch and Bound
○ The problem must be broken into subproblems such that the optimal solution
can be formed from optimal solutions of subproblems.
○ Example: In job sequencing, the optimal profit depends on the optimal scheduling
of remaining jobs.
○ At each step, make the choice that looks best at the moment.
○ This local best choice should lead to the global optimum.
○ Example: Pick the job with the maximum profit first (if feasible).
○ Arrange the possible decisions (jobs, items, edges, etc.) in an order that allows
greedy selection.
○ Example: Sort items by value/weight ratio in fractional knapsack.
5. Check Feasibility
○ Ensure that including the current choice doesn’t break constraints.
○ Example: In Job Sequencing, a job is placed only if a free slot ≤ deadline is
available.
Advantages Disadvantages
Simple and easy to implement Not always optimal (may miss the best
solution)
Fast and efficient (often O(n log n)) Works only if problem satisfies Greedy
Choice Property + Optimal Substructure
Requires less memory May fail for complex problems (e.g., 0/1
Knapsack, TSP)
Gives optimal solutions for many standard Each problem needs a correctness proof
problems (MST, Huffman, Activity Selection) (cannot assume greedy works)
This problem consists of n jobs each associated with a deadline and profit and
our objective is to earn maximum profit. We will earn profit only when the job is
completed on or before the deadline. We assume that each job will take unit time to
complete.
Points to remember:
● In this problem we have n jobs j1, j2, … jn, each has an associated deadlines are d1,
d2, … dn and profits are p1, p2, ... pn.
● Profit will only be awarded or earned if the job is completed on or before the
deadline.
● We assume that each job takes unit time to complete.
● The objective is to earn maximum profit when only one job can be scheduled or
processed at any given time
Example: Consider the following 5 jobs and their associated deadline and profit.
index 1 2 3 4 5
JOB j1 j2 j3 j4 j5
2 1 3 2 1
DEADLINE
PROFIT 60 100 20 40 20
index 1 2 3 4 5
JOB j2 j1 j4 j3 j5
DEADLINE 1 2 2 3 1
PROFIT 100 60 40 20 20
Looking at the jobs we can say the max deadline value is 3. So, dmax = 3
As dmax = 3 so we will have THREE slots to keep track of free time slots. Set the time slot
status to EMPTY
time slot 1 2 3
Similarly, if we look at job j1 it has a deadline 2. This means we have to complete job j1
on or before time slot 2 in order to earn its profit.
Similarly, if we look at job j3 it has a deadline 3. This means we have to complete job j3
on or before time slot 3 in order to earn its profit.
time slot 1 2 3
Job J1 J2 J4
Profit 100 60 20
Pseudo Code:
JobSequencing(jobs):
if slot[s] is empty:
slot[s] = job
break
Knapsack Problem
Firstly, we have given a knapsack of the maximum capacity of m kg and n items with their weight and
profit. Fill in the knapsack with a subset of items such that the selected weight is less than or equal to the
capacity of the knapsack and the profit of items is maximum.
Example:-
Let us consider that the capacity of the knapsack M=60 and the list of provided items are shown in the
following table-
However, the provided items are not sorted based on (pi/wi). After sorting, the items are shown in the
following table.
Solution:-
Afterward, sorting all the items according to (pi/wi), first item B is chosen as the weight of B is less than
the capacity of the knapsack. Next, item A is chosen, as the available capacity of the knapsack is greater
than the weight of A. Then, C is chosen as the next item.
However, the whole item cannot be chosen as the remaining capacity of the knapsack is less than the
Chiefly, the capacity of the knapsack is equal to the selected items. Hence, no more items can be selected.
Hence, this is the optimal solution. Moreover, we cannot gain more profit by selecting any different
combination of items.
● The number of vertices (V) in the graph and the spanning tree is the same.
● There is a fixed number of edges in the spanning tree which is equal to one less than the total
number of vertices ( E = V-1 ).
● The spanning tree should not be disconnected, as in there should only be a single source of
component, not more than that.
● The spanning tree should be acyclic, which means there would not be any cycle in the tree.
● The total cost (or weight) of the spanning tree is defined as the sum of the edge weights of all
the edges of the spanning tree.
● There can be many possible spanning trees for a graph.
Prim’s Algorithm
● Idea: Grow MST from a starting vertex by adding the smallest edge that connects a vertex inside
the tree to one outside.
Steps:
1. Start with any vertex.
2. Choose the minimum weight edge from this vertex to a new vertex.
3. Keep adding the minimum edge that connects a visited vertex to an unvisited vertex.
4. Repeat until all vertices are included.
Time Complexity:
● Using adjacency matrix: O(V²)
Steps:
1. Sort all edges in non-decreasing order of weight.
Time Complexity:
● Sorting edges: O(E log E) ≈ O(E log V)
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9
- 1) = 8 edges.
● Greedy approach.
Steps:
1. Assign distance 0 to source and ∞ to all others.
2. Pick the unvisited vertex with the smallest distance.
3. Update distances of its neighbors.
4. Repeat until all vertices are visited.
Complexity:
Given a weighted undirected graph represented as an edge list and a source vertex src, find the
shortest path distances from the source vertex to all other vertices in the graph. The graph
contains V vertices, numbered from 0 to V - 1.
Note: The given graph does not contain any negative edge.
Examples:
Input: src = 0, V = 5, edges[][] = [[0, 1, 4], [0, 2, 8], [1, 4, 6], [2, 3, 2], [3, 4, 10]]
3. Pop the top element (node with the smallest distance) from the min heap.
1. For each adjacent neighbor of the current node:
2. Calculate the distance using the formula:
dist[v] = dist[u] + weight[u][v]
If this new distance is shorter than the current dist[v], update it.
Push the updated pair <dist[v], v> into the min heap
It is mainly used to solve optimization problems (e.g., shortest path, minimal cost, maximal
profit).
Characteristics
1. The graph is divided into k stages (from source to destination).
• Forward Approach
• Backward Approach
A multistage graph G = (V, E) is a directed graph where vertices are partitioned into k (where k
> 1) number of disjoint subsets S = {s1,s2,,sk} such that edge (u, v) is in E, then u Є si and v Є
s1 + 1 for some subsets in the partition and |s1| = |sk| = 1.
The vertex s Є s1 is called the source and the vertex t Є sk is called sink.
G is usually assumed to be a weighted graph. In this graph, the cost of an edge (i, j) is
represented by c(i, j). Hence, the cost of path from source s to sink t is the sum of costs of each
edge in this path.
The multistage graph problem is finding the path with minimum cost from source s to sink t.
Example
Find minimum path cost between vertex s and t for following multistage graph using dynamic
programming.
Solution:In the above graph, the cost of an edge is represented as c(i, j). We need to find the
minimum cost path from vertex 1 to vertex 12. Using the below formula we can find the shortest
cost path from source to destination:
cost(i,j)=min{c(j,l)+cost(i+1,l)}
Step 1:Stage 5
cost(5,12)=c(12,12)=0
We use a forward approach here therefore (cost(5,12) = 0 ). Here, 5 represents the stage number
and 12 represents a node in that stage. Since there are no outgoing edges from vertex 12, the cost
is 0 and D[12]=12
Step 2:Stage 4
cost(4,9)= c(9,12) = 4 D[9]=12
cost(4,10) = c(10,12) = 2 D[10]=12
● Stage 2: 2, 3
● Stage 3: 4, 5
● Stage 4: 6 (destination)
Edges:
● 1→2=2, 1→3=3
● 2→4=2, 2→5=3
● 3→4=1, 3→5=4
● 4→6=3, 5→6=2
Step-by-step Forward Computation
Initialization:
● dist[1] = 0
● dist[2..6] = ∞
Stage 1 → Stage 2
● From 1→2:
dist[2] = min(∞, 0+2) = 2
Now:
dist = [0, 2, 3, ∞, ∞, ∞]
Stage 2 → Stage 3
● From 2→4: dist[4] = min(∞, 2+2) = 4
Now:
dist = [0, 2, 3, 4, 5, ∞]
Stage 3 → Stage 4
● From 4→6: dist[6] = min(∞, 4+3) = 7
Now:
dist = [0, 2, 3, 4, 5, 7]
Final Result
● Shortest path cost from 1 → 6 = 7
○ 1→2→4→6
○ 1→3→4→6
○ 1→2→5→6
Applications
● Shortest path problems.
● Project scheduling.
● Network routing.
● Resource allocation problems.
All pairs shortest paths
Each edge has a number, called a weight, that represents the distance or cost to travel between two
nodes. The goal of the Floyd-Warshall algorithm is to find the shortest possible path between every
pair of nodes in the graph.
The algorithm works by checking every possible path between the nodes and updating the shortest
known distance between them. It repeats this process for each node in the graph until it has
considered all possible paths, ensuring that the shortest path is found for every pair of nodes.
● All-Pairs Shortest Paths: The Floyd-Warshall algorithm finds the shortest paths for all
pairs of nodes in the graph, not just from one node to all others. This means that after
running the algorithm, you know the shortest path between any two nodes in the graph.
● Matrix Representation: The algorithm uses a matrix (a table with rows and columns) to
represent the graph and store the distances between nodes. Each cell in the matrix
represents the distance between two nodes. As the algorithm runs, it updates the matrix to
reflect the shortest known distances between all pairs of nodes.
Example Walkthrough
Let’s walk through an example using a simple graph with 4 nodes (A, B, C, D) and the following
weighted edges:
Where "∞" represents that there is no direct path between the nodes.
Step 1: Initialization
Initialize the dist[][] matrix as the adjacency matrix of the graph.
Step 2: Iteration through all nodes (k)
● For each node k, consider it as an intermediate node, and update the shortest paths.
Iteration with k = A:
Update distances considering A as an intermediate node:
● No updates needed as there are no shorter paths through A.
Iteration with k = B:
Update distances considering B as an intermediate node:
● dist[A][C] = min(dist[A][C], dist[A][B] + dist[B][C]) = min(∞, 3 + 1) = 4
● dist[D][C] remains unchanged as there is no shorter path.
Updated matrix:
Iteration with k = C:
Update distances considering C as an intermediate node:
● dist[A][D] = min(dist[A][D], dist[A][C] + dist[C][D]) = min(7, 4 + 2) = 6
● dist[B][D] = min(dist[B][D], dist[B][C] + dist[C][D]) = min(∞, 1 + 2) = 3
Updated matrix:
This matrix represents the shortest paths between all pairs of nodes.
Time O(V^3) The algorithm uses three nested loops, each iterating
Complexity over the vertices (V), leading to O(V^3).
●
V represents the number of vertices in the graph.
● The time complexity is O(V^3) because the algorithm checks all pairs of vertices for each
possible intermediate vertex.
The space complexity is O(V^2) due to the use of a matrix that stores the shortest path between
every pair of vertices.
● Network Routing: Finds the shortest paths between all pairs of nodes in communication
networks to optimize routing protocols.
● Urban Planning: Helps in determining the most efficient routes between multiple
locations in a city for infrastructure development.
● Social Network Analysis: Used to determine the shortest connection paths between
individuals in social networks.
● Flight Scheduling: Determines the shortest flight routes between multiple airports for
airlines.
Purpose Finds the shortest Finds the shortest Finds the shortest path
paths between all pairs path from a single from a single source to
of nodes source to all other all other nodes
nodes
Best for Dense graphs or when Sparse graphs or Graphs with negative
all-pairs shortest paths when shortest path weights or where
are needed from a single source negative weight cycle
is needed detection is needed
Use Case Network routing with GPS navigation, Finding shortest paths in
Examples multiple connections, shortest path in road graphs with negative
traffic management networks weights, detecting
negative cycles
The search time can be improved in Optimal Cost Binary Search Tree, placing the most
frequently used data in the root and closer to the root element, while placing the least
frequently used data near leaves and in leaves.
Eg.
For our tiny example, we could find the optimal tree by generating all 14 binary search trees
with 4 keys. As a general algorithm, this exhaustive- search approach is unrealistic: the total
number of binary search trees with n keys is equal to the nth Catalan number,
If a binary search tree represents n identifiers, then there will be exactly n internal nodes and
n+1 external nodes. Every node internal node represents a point where a successful search
may terminate. Every external node represents a point where an unsuccessful search may
terminate.
The identifiers not in the binary search tree can be partitioned into n+1 equivalence classes Ei,
0 ≤ i ≤ n. If the failure node for Ei is at level l, then only l -1 comparison are needed.
Let q(i) be the probability that the identifier x being searched for is in then clearly
and the cost contribution for the failure node for Ei is q(i)*(level (Ei) -
1).
There fore, the cost of the optimal binary search tree is:
Time complexity: The computing time for above algorithm is O(n2). To construct obst from r[i,j] is
O(n). So total time to construct obst is O(n3).
Unsuccessful search cost of the tree = 4(3-1) + 2(3-1) + 4(3-1) + 1(4-1) + 1(4-1) = 26
Optimal Binary Search Tree with Successful search probabilities with suitable examples.
• OBST is a binary search tree which provides the smallest possible search time
(or expected search) for a given sequence of accesses (or access probabilities).
• There fore, the cost of the optimal binary search tree is:
Knapsack Problem
Given a set of items, each with a weight and a value, determine a subset of items to include
in a collection so that the total weight is less than or equal to a given limit and the total value
is as large as possible.
Fractional Knapsack
In this version of Knapsack problem, items can be broken into smaller pieces. So, the thief
may take only a fraction xi of ith item.
0 ≤ xi ≤ 1
0/1 Knapsack:
In this item cannot be broken which means we should take the item as a whole or should
leave it. That's why it is called 0/1 knapsack Problem.
bottom up: it fills a table with solutions to all smaller subproblems, but each of them is solved
only once. An unsatisfying aspect of this approach is that solutions to some of these smaller
subproblems are often not necessary for getting a solution to the problem given. Since this
drawback is not present in the top-down approach, it is natural to try to combine the strengths
of the top-down and bottom-up approaches.
The goal is to get a method that solves only subproblems that are necessary and does so only
once. Such a method exists; it is based on using memory functions.
This method solves a given problem in the top-down manner but, in addition, maintains a
table of the kind that would have been used by a bottom-up dynamic programming algorithm.
Time Complexity:
The time efficiency and space efficiency of this algorithm are both in θ(nW). The time
Given a set of cities and distance between every pair of cities, the problem is to find the
shortest possible route that visits every city exactly once and returns to the starting point.
Every tour consists of an edge (1,k) for some k ∈ V-{1} and a path from vertex k to vertex 1.
The path from vertex k to vertex 1 goes through each vertex in V-{1,k} exactly once. It is
easy to see that if the tour is optimal, then the path from k to 1 must be a shortest k to1 path
going through all vertices in V-{1,k}
Let g(i,S) be the length of a shortest path starting at vertex i, going through all vertices in S
and terminating at vertex 1. The function g(1, V-{1}) is the length of an optimal salesperson
tour.
From the principal of optimality, it follows that g(1,V-
2≤k≤n
Backtracking- General Method- 8 Queens Problem- Sum of subset problem- Graph coloring- Hamiltonian
Cycles- 0/1Knapsack Problem
1
2
Recursive Backtracking Algorithm:
3
Applications of Backtracking:
Backtracking method is applied to solve various problems like:
1. N Queens Problem
2. Sum of Subsets Problem
3. Graph Coloring
4. Hamiltonian Cycles
5. Knapsack Problem
4
4- Queens Problem –state space tree:
5
N-Queens Problem- algorithm1: Placing a new queen in kth row & ith column.
6
8-Queens Problem solution:
7
Sum of Subsets Problem-Algorithm
8
Sum of Subsets Problem-Example
9
Graph coloring:
Let G be a graph and m be a positive integer.
It is to find whether that nodes of G can be colored in such a way that no two adjacent nodes have the same
color
yet only m colors are used where m is a chromatic number.
If d is degree of a given graph G, then it is colored with d+ 1 colors.
Degree means number of edges connected to that node.
10
Graph coloring- m coloring algorithm
11
Graph coloring- generating color algorithm
12
Hamiltonian Cycles
13
Hamiltonian Cycles-Finding all Hamiltonian Cycles Algorithm
Knapsack Problem
14
Knapsack Problem- Bounding Function Algorithm
15
Knapsack Problem- Example
16
17
18
Branch and Bound: The method, Travelling salesperson, 0/1 Knapsack problem.
1
FIFO (First- In- First- Out) Search or BFS:
2
Least Count (LC) Search:
3
Travelling Sales Person using B & B
4
TSP Example:
5
6
7
8
9
10
11
12
0/1 Knapsack problem using B & B
13
14
15
16
UNIT-V
P, NP, NP-COMPLETE, NP-HARD PROBLEMS
Introduction to Problems:
Types of Algorithms
1
2. Non Deterministic Algorithm: It terminates unsuccessfully if and only if there exists no set of choices
leading to a success signal.
To specify such algorithms, we introduce 3 functions:
2
3
4
Cook’s Theorem:
5
6
Satifiability problem
7
3 CNF Satifiability problem
8
9
10
11
12
NP Hard Scheduling Problems: Scheduling Identical Processors, Job Shop Scheduling
13
14
15