Advanced Data Structures & Algorithms

The document discusses the role of algorithms in computing, focusing on their efficiency and complexity analysis. It covers various algorithm types, methods for writing and analyzing algorithms, and the importance of efficient algorithms in advanced data structures. Additionally, it explains time and space complexity, asymptotic notation, and performance measurement techniques.

Uploaded by

sharmimahiii


CS4101- ADVANCED DATA STRUCTURES AND ALGORITHMS

UNIT 1 - ROLE OF ALGORITHMS IN COMPUTING & COMPLEXITY ANALYSIS

Algorithms – Algorithms as a Technology -Time and Space complexity of algorithms-
Asymptotic analysis-Average and worst-case analysis-Asymptotic notation-Importance of
efficient algorithms- Program performance measurement - Recurrences: The Substitution
Method – The Recursion-Tree Method- Data structures and algorithms

Algorithm:
An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independent of
underlying languages, i.e. an algorithm can be implemented in more than one programming
language.
From the data structure point of view, the following are some important categories of
algorithms −
• Search − Algorithm to search for an item in a data structure.
• Sort − Algorithm to sort items in a certain order.
• Insert − Algorithm to insert an item in a data structure.
• Update − Algorithm to update an existing item in a data structure.
• Delete − Algorithm to delete an existing item from a data structure.

How to Write an Algorithm?
There are no well-defined standards for writing algorithms. Rather, it is problem and
resource dependent. Algorithms are never written to support a particular programming
language.
All programming languages share basic code constructs like loops (do, for, while),
flow-control (if-else), etc. These common constructs can be used to write an algorithm.
We usually write algorithms in a step-by-step manner, but this is not always the case.
Algorithm writing is a process that is carried out after the problem domain is well-defined.
That is, we should know the problem domain for which we are designing a solution.

Example
Let's try to learn algorithm-writing by using an example.
Problem − Design an algorithm to add two numbers and display the result.
Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP
Algorithms tell the programmers how to code the program. Alternatively, the algorithm can be
written as −
Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP
In design and analysis of algorithms, usually the second method is used to describe an
algorithm. It makes it easy for the analyst to analyze the algorithm, ignoring all unwanted
definitions. The analyst can observe what operations are being used and how the process is
flowing. Writing step numbers is optional.
We design an algorithm to get a solution to a given problem. A problem can be solved in more
than one way.

Hence, many solution algorithms can be derived for a given problem. The next step is to
analyze those proposed solution algorithms and implement the best suitable solution.

Algorithm Analysis

Efficiency of an algorithm can be analyzed at two different stages, before implementation and
after implementation. They are the following −
• A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an
algorithm is measured by assuming that all other factors, for example, processor speed,
are constant and have no effect on the implementation.
• A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected
algorithm is implemented in a programming language and then executed on the target
machine. In this analysis, actual statistics, like running time and space
required, are collected.
We shall learn about a priori algorithm analysis. Algorithm analysis deals with the execution
or running time of various operations involved. The running time of an operation can be
defined as the number of computer instructions executed per operation.
Algorithms as a Technology

Efficiency: Different algorithms devised to solve the same problem often differ dramatically in
their efficiency. These differences can be much more significant than differences due to
hardware and software.
Algorithms and other technologies: Such differences show that we should consider
algorithms, like computer hardware, as a technology. Total system performance depends on
choosing efficient algorithms as much as on choosing fast hardware. Just as rapid advances are
being made in other computer technologies, they are being made in algorithms as well. You
might wonder whether algorithms are truly that important on contemporary computers in light
of other advanced technologies, such as
• advanced computer architectures and fabrication technologies,
• easy-to-use, intuitive, graphical user interfaces (GUIs),
• object-oriented systems,
• integrated Web technologies, and
• fast networking, both wired and wireless.

Insertion sort
Our first algorithm, insertion sort, solves the sorting problem:
Input: A sequence of n numbers <a1, a2, ..., an>.
Output: A permutation (reordering) <a'1, a'2, ..., a'n> of the input sequence such that
a'1 <= a'2 <= ... <= a'n.
The numbers that we wish to sort are also known as the keys. Although conceptually we are
sorting a sequence, the input comes to us in the form of an array with n elements.
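The notes state the sorting problem but do not list the procedure itself. A minimal Python sketch of insertion sort is given here for reference. The function name insertion_sort and the comparison counter are illustrative additions, not part of the original notes.

```python
def insertion_sort(keys):
    """Sort the list `keys` in place and return the number of key comparisons."""
    comparisons = 0
    for j in range(1, len(keys)):
        key = keys[j]
        i = j - 1
        # Shift larger elements of the sorted prefix keys[0..j-1] one slot right.
        while i >= 0:
            comparisons += 1
            if keys[i] <= key:
                break
            keys[i + 1] = keys[i]
            i -= 1
        keys[i + 1] = key  # drop the key into the gap that opened up
    return comparisons


data = [5, 2, 4, 6, 1, 3]
count = insertion_sort(data)
print(data, count)
```

On an already sorted input of n keys the counter reaches n - 1, and on a reversed input it reaches n(n - 1)/2, which matches the linear best case and quadratic worst case discussed under asymptotic notation.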
Algorithm Complexity:

Suppose X is an algorithm and n is the size of the input data. The time and space used by the
algorithm X are the two main factors that decide the efficiency of X.
• Time Factor − Time is measured by counting the number of key operations, such as
comparisons in a sorting algorithm.
• Space Factor − Space is measured by counting the maximum memory space required
by the algorithm.
The complexity of an algorithm f(n) gives the running time and/or the storage space required
by the algorithm in terms of n as the size of input data.

Space Complexity

Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components −
• A fixed part, which is the space required to store data and variables that are
independent of the size of the problem. For example, simple variables and constants
used, program size, etc.
• A variable part, which is the space required by variables whose size depends on the size of
the problem. For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is S(P) = C + S_P(I), where C is the fixed part and
S_P(I) is the variable part of the algorithm, which depends on instance characteristic I. Following
is a simple example that tries to explain the concept −
Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - Stop
Here we have three variables A, B, and C and one constant. Hence S(P) = 1 + 3. Now, space
depends on data types of given variables and constant types and it will be multiplied
accordingly.

Time Complexity
Time complexity of an algorithm represents the amount of time required by the algorithm to
run to completion. Time requirements can be defined as a numerical function T(n), where T(n)
can be measured as the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps. Consequently, the total
computational time is T(n) = c ∗ n, where c is the time taken for the addition of two bits. Here,
we observe that T(n) grows linearly as the input size increases.
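The n-bit addition example can be made concrete with a small sketch. It assumes the two numbers are given as equal-length lists of bits with the least significant bit first. The function name add_bits and the step counter are hypothetical names chosen for illustration.

```python
def add_bits(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, steps), where steps counts the single-bit additions.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("both operands must have the same number of bits")
    result, carry, steps = [], 0, 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        result.append(total % 2)   # sum bit for this position
        carry = total // 2         # carry into the next position
        steps += 1                 # one constant-time step per bit position
    result.append(carry)           # final carry becomes the top bit
    return result, steps


# 13 + 11 = 24, written least significant bit first
bits, steps = add_bits([1, 0, 1, 1], [1, 1, 0, 1])
print(bits, steps)
```

The step counter equals the number of bit positions n, so doubling the operand length doubles the work, which is the linear growth T(n) = c * n described above.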

Asymptotic Notation:

The main idea of asymptotic analysis is to have a measure of efficiency of algorithms that
doesn’t depend on machine specific constants, and doesn’t require algorithms to be
implemented and time taken by programs to be compared. Asymptotic notations are
mathematical tools to represent time complexity of algorithms for asymptotic analysis. The
following 3 asymptotic notations are mostly used to represent time complexity of algorithms.

1) Θ Notation: The theta notation bounds a function from above and below, so it defines
exact asymptotic behavior.
A simple way to get the Theta notation of an expression is to drop low order terms and ignore
leading constants. For example, consider the following expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be an n0 after which a Θ(n^3)
function has higher values than a Θ(n^2) function, irrespective of the constants involved.
For a given function g(n), we denote by Θ(g(n)) the following set of functions.
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such
that 0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}
The above definition means that if f(n) is theta of g(n), then the value of f(n) is always between
c1*g(n) and c2*g(n) for large values of n (n >= n0). The definition of theta also requires that
f(n) must be non-negative for values of n greater than n0.
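As a sanity check of the definition, the sketch below tests the example expression against g(n) = n^3 numerically. The constants c1 = 3, c2 = 4 and n0 = 21 are one possible choice picked for this illustration, not values given in the notes, and a finite numeric check is only supporting evidence, not a proof.

```python
def f(n):
    return 3 * n**3 + 6 * n**2 + 6000


def g(n):
    return n**3


# Assumed witnesses for the Theta definition (one valid choice among many).
c1, c2, n0 = 3, 4, 21


def within_theta_bounds(n):
    """True if 0 <= c1*g(n) <= f(n) <= c2*g(n) holds at this n."""
    return 0 <= c1 * g(n) <= f(n) <= c2 * g(n)


holds = all(within_theta_bounds(n) for n in range(n0, 5000))
print(holds, within_theta_bounds(n0 - 1))
```

The upper bound needs 6n^2 + 6000 <= n^3, which first becomes true at n = 21. This is why the definition only asks for the inequality from some n0 onward.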
2) Big O Notation: The Big O notation defines an upper bound of an algorithm; it bounds a
function only from above. For example, consider the case of Insertion Sort. It takes linear time
in the best case and quadratic time in the worst case. We can safely say that the time complexity of
Insertion Sort is O(n^2). Note that O(n^2) also covers linear time.
If we use Θ notation to represent the time complexity of Insertion Sort, we have to use two
statements for best and worst cases:
1. The worst case time complexity of Insertion Sort is Θ(n^2).
2. The best case time complexity of Insertion Sort is Θ(n).
The Big O notation is useful when we only have an upper bound on the time complexity of an
algorithm. Many times we can easily find an upper bound by simply looking at the algorithm.
O(g(n)) = {f(n): there exist positive constants c and n0 such
that 0 <= f(n) <= c*g(n) for all n >= n0}

3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω
notation provides an asymptotic lower bound.
Ω notation can be useful when we have a lower bound on the time complexity of an algorithm.
Since the best case performance of an algorithm is generally not useful, the Omega notation
is the least used notation among all three.
For a given function g(n), we denote by Ω(g(n)) the following set of functions.
Ω(g(n)) = {f(n): there exist positive constants c and n0 such
that 0 <= c*g(n) <= f(n) for all n >= n0}
Let us consider the same Insertion Sort example here. The time complexity of Insertion Sort
can be written as Ω(n), but this is not very useful information about insertion sort, as we are
generally interested in the worst case and sometimes in the average case.

Importance of efficient algorithms:

Efficient algorithms are of paramount importance when working with advanced data structures.
Advanced data structures provide more specialized ways of organizing and storing data to
improve various aspects of algorithmic performance, such as time complexity, space
complexity, and overall efficiency. However, without efficient algorithms to operate on these
data structures, their potential benefits might not be fully realized. Here's why efficient
algorithms are crucial when working with advanced data structures:
Optimized Resource Usage: Advanced data structures are designed to optimize certain
operations, like insertion, deletion, search, etc. However, if inefficient algorithms are used, the
benefits of these data structures might be offset by poor performance. Efficient algorithms help
ensure that the advanced data structures are utilized optimally, leading to minimized resource
usage (CPU time, memory, etc.).
Time Complexity: Advanced data structures often offer improved time complexity for specific
operations compared to simpler data structures. For example, self-balancing binary search trees
like AVL trees or red-black trees have logarithmic height, which ensures efficient searching.
However, to maintain this time complexity, you need algorithms that take advantage of the
balanced properties. Inefficient algorithms might lead to worst-case behavior, degrading
performance.
Space Complexity: While advanced data structures might have more complex internal
arrangements, they often optimize space usage. For instance, a hash table with open addressing
can use far less space than a basic array indexed directly by key when the stored keys are
sparse. Efficient algorithms ensure that this
space optimization is maintained and not squandered due to poor algorithmic choices.
Consistency: Advanced data structures often come with certain properties and invariants that
need to be maintained to ensure their efficiency. Efficient algorithms help in performing
operations that respect these properties, keeping the data structure in a consistent and optimized
state.
Scalability: As datasets grow larger, the impact of algorithmic inefficiencies becomes more
pronounced. Advanced data structures are often chosen to handle large-scale applications
precisely because of their efficiency improvements. Efficient algorithms are crucial for scaling
these structures to handle massive datasets.
Real-world Applications: Many real-world applications, such as databases, networking
protocols, compilers, and simulations, rely on advanced data structures to optimize their
operations. Without efficient algorithms, these applications might suffer from poor performance
and sluggish behavior.
Algorithmic Innovation: The study of advanced data structures and the development of
efficient algorithms often go hand in hand. The design and analysis of algorithms for these
structures can lead to new insights and innovations in computer science, driving progress in
various fields.
Competitive Advantage: In industries where performance is crucial, such as finance, gaming,
and machine learning, having efficient algorithms operating on advanced data structures can
provide a competitive advantage. Faster processing and response times can lead to better user
experiences and more efficient business operations.

Program performance measurement:

Comparison of running times: For each function f(n) and time t in the following table,
determine the largest size n of a problem that can be solved in time t, assuming that the
algorithm to solve the problem takes f(n) microseconds.
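The table itself is not reproduced in these notes. As a rough sketch of how its entries could be computed, the code below finds the largest n with f(n) <= t for a non-decreasing f, using doubling followed by binary search. The helper name largest_n, the chosen functions and the one-second budget are assumptions made for illustration.

```python
import math


def largest_n(f, budget_us):
    """Largest integer n >= 1 with f(n) <= budget_us, assuming f is non-decreasing."""
    if f(1) > budget_us:
        return 0
    hi = 1
    while f(hi * 2) <= budget_us:   # grow until the budget is exceeded
        hi *= 2
    lo, hi = hi, hi * 2             # invariant: f(lo) <= budget_us < f(hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) <= budget_us:
            lo = mid
        else:
            hi = mid
    return lo


ONE_SECOND = 10**6  # one second expressed in microseconds

functions = {
    "n": lambda n: n,
    "n log2 n": lambda n: n * math.log2(n),
    "n^2": lambda n: n * n,
    "2^n": lambda n: 2**n,
}

for name, func in functions.items():
    print(name, largest_n(func, ONE_SECOND))
```

Even with a budget of a full second, the exponential function only reaches n = 19 while the linear one reaches a million, which illustrates why the growth rate matters more than the constant factor.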

Recurrences:
Many algorithms are recursive in nature. When we analyze them, we get a recurrence relation
for time complexity. We get running time on an input of size n as a function of n and the
running time on inputs of smaller sizes. For example in merge sort, to sort a given array, we
divide it in two halves and recursively repeat the process for the two halves. Finally we merge
the results. Time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn.
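The merge sort description can be sketched as follows. This is a plain top-down version written for illustration. The names merge_sort and merge are not taken from the notes.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two tails is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of items: two half-size subproblems plus a linear merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
```

The two recursive calls correspond to the 2T(n/2) term and the call to merge corresponds to the cn term of the recurrence.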

1) Substitution Method: We make a guess for the solution and then we use mathematical
induction to prove the guess is correct or incorrect.
For example consider the recurrence T(n) = 2T(n/2) + n

We guess the solution as T(n) = O(nLogn). Now we use induction to prove our guess.

We need to prove that T(n) <= cnLogn for some constant c > 0. We can assume that it is true
for values smaller than n.

T(n) = 2T(n/2) + n
     <= 2(c(n/2)Log(n/2)) + n
     = cnLog(n/2) + n
     = cnLogn - cnLog2 + n
     = cnLogn - cn + n
     <= cnLogn        (for c >= 1, with Log taken to base 2)
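The guessed bound can also be checked numerically. The sketch below evaluates the recurrence on powers of two and compares it with c * n * Log n. The base case T(1) = 1 and the constant c = 2 are assumptions chosen for this illustration, since the notes do not fix them.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2T(n/2) + n for n a power of two, with assumed base case T(1) = 1."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


c = 2  # assumed constant; the inductive step alone only needs c >= 1

# Start at n = 2: the bound c*n*Log n is 0 at n = 1, so it cannot cover T(1) = 1.
bound_holds = all(T(2**k) <= c * 2**k * math.log2(2**k) for k in range(1, 21))
print(bound_holds)
```

With this base case the recurrence solves to n Log n + n, so c = 1 would fail at every n even though the inductive step goes through. The base case of the induction is what forces the larger constant.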

2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the
time taken by every level of the tree. Finally, we sum the work done at all levels. To draw the
recurrence tree, we start from the given recurrence and keep drawing till we find a pattern
among levels. The pattern is typically an arithmetic or geometric series.
For example, consider the recurrence relation
T(n) = T(n/4) + T(n/2) + cn^2

          cn^2
         /    \
    T(n/4)    T(n/2)

If we further break down the expressions T(n/4) and T(n/2), we get the following recursion tree.

               cn^2
             /      \
      c(n^2)/16      c(n^2)/4
       /    \         /    \
 T(n/16)  T(n/8)  T(n/8)  T(n/4)

Breaking down further gives us the following.

                      cn^2
                 /            \
         c(n^2)/16            c(n^2)/4
          /     \              /     \
 c(n^2)/256  c(n^2)/64  c(n^2)/64  c(n^2)/16
   /  \        /  \       /  \       /  \

To know the value of T(n), we need to calculate the sum of the tree nodes level by level. If we
sum the above tree level by level, we get the following series.
T(n) = c(n^2 + 5(n^2)/16 + 25(n^2)/256 + ....)
The above series is a geometric progression with ratio 5/16.

To get an upper bound, we can sum the infinite series.

We get the sum as c(n^2)/(1 - 5/16) = (16/11)c(n^2), which is O(n^2).
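The bound from the recursion tree can be checked by evaluating the recurrence directly. The sketch below assumes c = 1, integer division for n/4 and n/2, and a base case T(0) = 0. None of these are fixed by the notes. It compares T(n) with the geometric-series bound (16/11) n^2 using integer arithmetic.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(n/4) + T(n/2) + n^2 with c = 1, floor division, assumed T(0) = 0."""
    if n == 0:
        return 0
    return T(n // 4) + T(n // 2) + n * n


# T(n) <= (16/11) * n^2 rewritten without fractions: 11 * T(n) <= 16 * n^2
within_bound = all(11 * T(n) <= 16 * n * n for n in range(1, 2001))

# The root alone costs n^2, so T(n) is also at least n^2.
at_least_root = all(T(n) >= n * n for n in range(1, 2001))

print(within_bound, at_least_root)
```

Together the two checks sandwich T(n) between n^2 and (16/11) n^2 on the tested range, consistent with the root level dominating the total work.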
3) Master Method:
Master Method is a direct way to get the solution. The master method works only for
following type of recurrences or for recurrences that can be transformed to following type.
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1
There are following three cases:
1. If f(n) = Θ(n^c) where c < Log_b(a) then T(n) = Θ(n^(Log_b(a)))
2. If f(n) = Θ(n^c) where c = Log_b(a) then T(n) = Θ(n^c Log n)
3. If f(n) = Θ(n^c) where c > Log_b(a) then T(n) = Θ(f(n))

How does this work?

Master method is mainly derived from the recurrence tree method. If we draw the recurrence tree of
T(n) = aT(n/b) + f(n), we can see that the work done at the root is f(n) and the work done at all leaves
is Θ(n^c) where c is Log_b(a). The height of the recurrence tree is Log_b(n).

In recurrence tree method, we calculate total work done. If the work done at leaves is
polynomially more, then leaves are the dominant part, and our result becomes the work done
at leaves (Case 1). If work done at leaves and root is asymptotically same, then our result
becomes height multiplied by work done at any level (Case 2). If work done at root is
asymptotically more, then our result becomes work done at root (Case 3).

Examples of some standard algorithms whose time complexity can be evaluated using
Master Method
Merge Sort: T(n) = 2T(n/2) + Θ(n). It falls in case 2 as c is 1 and Log_b(a) is also 1. So the
solution is Θ(n Logn)
Binary Search: T(n) = T(n/2) + Θ(1). It also falls in case 2 as c is 0 and Log_b(a) is also 0. So the
solution is Θ(Logn)
Notes:
1) It is not necessary that a recurrence of the form T(n) = aT(n/b) + f(n) can be solved using
Master Theorem. The given three cases have some gaps between them. For example, the
recurrence T(n) = 2T(n/2) + n/Logn cannot be solved using master method.
2) Case 2 can be extended for f(n) = Θ(n^c Log^k(n))
If f(n) = Θ(n^c Log^k(n)) for some constant k >= 0 and c = Log_b(a), then T(n) = Θ(n^c Log^(k+1)(n))
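The three cases can be turned into a small classifier for recurrences whose driving function is a plain polynomial Θ(n^c). This is only a sketch of the simplified form stated above. The function name master_case and its return format are hypothetical, and it does not handle the extended case 2 or the gaps mentioned in the notes.

```python
import math


def master_case(a, b, c):
    """Classify T(n) = a*T(n/b) + Θ(n^c) using the three cases listed above.

    Returns (case_number, solution) where solution is a descriptive string.
    """
    if a < 1 or b <= 1:
        raise ValueError("the master method requires a >= 1 and b > 1")
    critical = math.log(a, b)  # Log_b(a), the exponent of the leaf work
    if math.isclose(c, critical, abs_tol=1e-9):
        return 2, f"Θ(n^{c:g} Log n)"
    if c < critical:
        return 1, f"Θ(n^{critical:g})"
    return 3, f"Θ(n^{c:g})"


print(master_case(2, 2, 1))  # merge sort
print(master_case(1, 2, 0))  # binary search
print(master_case(8, 2, 2))  # leaves dominate
print(master_case(2, 2, 2))  # root dominates
```

The comparison uses math.isclose because Log_b(a) is computed in floating point, so an exact equality test could misclassify a case 2 recurrence.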
Applications of Data Structure and Algorithms
The following computer problems can be solved using Data Structures −
• Fibonacci number series
• Knapsack problem
• Tower of Hanoi
• All pair shortest path by Floyd-Warshall
• Shortest path by Dijkstra
• Project scheduling

Data Structure and Algorithms - Tree

A tree is a non-linear abstract data type with a hierarchy-based structure. It consists of nodes
(where the data is stored) that are connected via links. The tree data structure stems from a single
node called a root node and has subtrees connected to the root.

Important Terms
Following are the important terms with respect to tree.
• Path − Path refers to the sequence of nodes along the edges of a tree.
• Root − The node at the top of the tree is called root. There is only one root per tree and
one path from the root node to any node.
• Parent − Any node except the root node has one edge upward to a node called parent.
• Child − The node below a given node connected by its edge downward is called its child
node.
• Leaf − The node which does not have any child node is called the leaf node.
• Subtree − Subtree represents the descendants of a node.
• Visiting − Visiting refers to checking the value of a node when control is on the node.
• Traversing − Traversing means passing through nodes in a specific order.
• Levels − Level of a node represents the generation of a node. If the root node is at level
0, then its next child node is at level 1, its grandchild is at level 2, and so on.
• Keys − Key represents a value of a node based on which a search operation is to be
carried out for a node.
Types of Trees
There are three types of trees −
• General Trees
• Binary Trees
• Binary Search Trees
General Trees
General trees are unordered tree data structures where the root node has minimum 0 or
maximum ‘n’ subtrees.
The General trees have no constraint placed on their hierarchy. The root node thus acts like the
superset of all the other subtrees.
UNIT II- HIERARCHICAL DATA STRUCTURES
Binary Search Trees: Basics – Querying a Binary search tree – Insertion and Deletion- Red
Black trees: Properties of Red-Black Trees – Rotations – Insertion – Deletion -B-Trees:
Definition of B -trees – Basic operations on B-Trees – Deleting a key from a B-Tree- Heap –
Heap Implementation – Disjoint Sets - Fibonacci Heaps: structure – Mergeable-heap
operations- Decreasing a key and deleting a node-Bounding the maximum degree.

Binary Search Trees:

A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned
properties −
• Every key in the left sub-tree of a node is less than or equal to the node's key.
• Every key in the right sub-tree of a node is greater than or equal to the node's key.
Thus, BST divides all its sub-trees into two segments; the left sub-tree and the right sub-tree
and can be defined as −
left_subtree (keys) ≤ node (key) ≤ right_subtree (keys)

Representation
BST is a collection of nodes arranged in a way where they maintain BST properties. Each node
has a key and an associated value. While searching, the desired key is compared to the keys in
BST and if found, the associated value is retrieved.
Following is a pictorial representation of BST −

We observe that the root node key (27) has all less-valued keys on the left sub-tree and the
higher valued keys on the right sub-tree.
Basic Operations
Following are the basic operations of a tree −
• Search − Searches an element in a tree.
• Insert − Inserts an element in a tree.
• Pre-order Traversal − Traverses a tree in a pre-order manner.
• In-order Traversal − Traverses a tree in an in-order manner.
• Post-order Traversal − Traverses a tree in a post-order manner.
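The three traversal orders listed above can be sketched as short recursive functions. The Node class and the sample tree are illustrative. Only the root key 27 comes from the notes, and the remaining keys are made up so that the tree satisfies the BST property.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def preorder(node):
    """Visit the node first, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


def inorder(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def postorder(node):
    """Visit both subtrees first and the node last."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.key]


# Hypothetical sample tree with root key 27.
root = Node(27, Node(14, Node(10), Node(19)), Node(35, Node(31), Node(42)))

print(preorder(root))
print(inorder(root))
print(postorder(root))
```

On a binary search tree the in-order traversal returns the keys in sorted order, which is a convenient way to check that a tree respects the BST property.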

Advantages of Binary search tree

o Searching an element in the Binary search tree is easy as we always have a hint that
which subtree has the desired element.
o As compared to array and linked lists, insertion and deletion operations are faster in
BST.

Insertion and Deletion:

Example of creating a binary search tree

Now, let's see the creation of binary search tree using an example.

Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node, insert it as the root of the
left subtree, and move to the next element.
o Otherwise, if the element is larger than the root node, then insert it as the root of the
right subtree.
Now, let's see the process of creating the Binary search tree using the given data element. The
process of creating the BST is shown below -
Step 1 - Insert 45.
Step 2 - Insert 15.
As 15 is smaller than 45, insert it as the root node of the left subtree.
Step 3 - Insert 79.
As 79 is greater than 45, insert it as the root node of the right subtree.
Step 4 - Insert 90.
90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.
Step 5 - Insert 10.
10 is smaller than 45 and 15, so it will be inserted as a left subtree of 15.
Step 6 - Insert 55.
55 is larger than 45 and smaller than 79, so it will be inserted as the left subtree of 79.
Step 7 - Insert 12.
12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the right subtree of 10.
Step 8 - Insert 20.
20 is smaller than 45 but greater than 15, so it will be inserted as the right subtree of 15.
Step 9 - Insert 50.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a left subtree of 55.
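The step-by-step construction above can be reproduced with a short recursive insert. This is a minimal sketch. The Node class and function names are illustrative, and keys equal to an existing key are sent to the right subtree as an arbitrary choice.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the subtree rooted at root and return the subtree root."""
    if root is None:
        return Node(key)            # empty spot found: the new key becomes a leaf
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)

print(inorder(root))
```

Walking the resulting links gives the same shape as the nine steps: 15 and 79 under 45, then 10 and 20 under 15, 55 and 90 under 79, 12 to the right of 10 and 50 to the left of 55.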

Now, the creation of binary search tree is completed. After that, let's move towards the
operations that can be performed on Binary search tree.

We can perform insert, delete and search operations on the binary search tree.

Let's understand how a search is performed on a binary search tree.

Searching in Binary search tree

Searching means to find or locate a specific element or node in a data structure. In Binary search
tree, searching a node is easy because elements in BST are stored in a specific order. The steps
of searching a node in Binary Search tree are listed as follows -
1. First, compare the element to be searched with the root element of the tree.
2. If root is matched with the target element, then return the node's location.
3. If it is not matched, then check whether the item is less than the root element, if it is
smaller than the root element, then move to the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return NULL.
Now, let's understand the searching in binary tree using an example. We are taking the binary
search tree formed above. Suppose we have to find node 20 from the below tree.
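Step 1: Compare 20 with the root 45. As 20 is smaller than 45, move to the left subtree.
Step 2: Compare 20 with 15. As 20 is greater than 15, move to the right subtree.
Step 3: Compare 20 with 20. The element matches, so return the location of this node.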

Now, let's see the algorithm to search an element in the Binary search tree.
Algorithm to search an element in Binary search tree
Search (root, item)
Step 1 - if (root = NULL) or (item = root → data)
             return root
         else if (item < root → data)
             return Search(root → left, item)
         else
             return Search(root → right, item)
         END if
Step 2 - END
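The search procedure above translates almost line for line into Python. In this sketch the empty-tree check comes first so that a missing child is never dereferenced. The Node class and the hand-built tree (the one created from 45, 15, 79, 90, 10, 55, 12, 20, 50) are written out only to keep the example self-contained.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def search(root, item):
    """Return the node holding item, or None if item is not in the tree."""
    if root is None or item == root.key:
        return root
    if item < root.key:
        return search(root.left, item)
    return search(root.right, item)


# The tree built in the creation example, written out by hand.
root = Node(
    45,
    Node(15, Node(10, None, Node(12)), Node(20)),
    Node(79, Node(55, Node(50)), Node(90)),
)

found = search(root, 20)
missing = search(root, 99)
print(found.key, missing)
```

Each call discards one subtree, so the number of comparisons is bounded by the height of the tree rather than by the number of nodes.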
Now let's understand how the deletion is performed on a binary search tree. We will also see an
example to delete an element from the given tree.

Deletion in Binary Search tree

In a binary search tree, we must delete a node in such a way that the property
of BST is not violated. To delete a node from BST, three situations can occur -
o The node to be deleted is a leaf node, or
o The node to be deleted has only one child, or
o The node to be deleted has two children.
We will understand the situations listed above in detail.
When the node to be deleted is the leaf node
It is the simplest case of deleting a node in BST. Here, we have to replace the leaf node with
NULL and simply free the allocated space.
We can see the process of deleting a leaf node from BST in the below image. In the below image,
suppose we have to delete node 90. As the node to be deleted is a leaf node, it will be replaced
with NULL, and the allocated space will be freed.
When the node to be deleted has only one child
In this case, we have to replace the target node with its child: the link from the target's
parent is redirected to the child, so the child moves up together with its own subtree. The
target node is then detached from the tree, and we simply free up the allocated space.
We can see the process of deleting a node with one child from BST in the below image. In the
below image, suppose we have to delete the node 79. As the node to be deleted has only one
child, it will be replaced with its child 55.

When the node to be deleted has two children

This case of deleting a node in BST is a bit more complex than the other two cases. In such a case, the
steps to be followed are listed as follows -
o First, find the inorder successor of the node to be deleted.
o After that, swap the value of the node with the value of its inorder successor, so that the
value to be deleted moves down to the successor's position.
o And at last, delete the node at that position. It has no left child, so it is either a leaf or
has only one child, and the earlier cases apply. Then free up the allocated space.
The inorder successor is required when the right child of the node is not empty. We can obtain
the inorder successor by finding the minimum element in the right subtree of the node.
We can see the process of deleting a node with two children from BST in the below image. In
the below image, suppose we have to delete node 45, which is the root node. As the node to be
deleted has two children, it will be replaced with its inorder successor. Now, the value 45 sits
at the successor's old position near the bottom of the tree, so it can be deleted easily.
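The three deletion cases can be sketched in one recursive function. This is a common formulation rather than the only one: a node with at most one child is replaced by linking its parent to that child, and a node with two children takes over its inorder successor's key, after which the successor is removed from the right subtree. The names Node, insert, delete and inorder are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def delete(root, key):
    """Delete key from the subtree rooted at root and return the new subtree root."""
    if root is None:
        return None                       # key not present
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:
        return root.right                 # leaf (None) or only a right child
    elif root.right is None:
        return root.left                  # only a left child
    else:
        successor = root.right            # minimum of the right subtree
        while successor.left is not None:
            successor = successor.left
        root.key = successor.key          # take over the successor's key
        root.right = delete(root.right, successor.key)
    return root


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, key)

for key in [90, 79, 45]:                  # leaf, one child, two children
    root = delete(root, key)

print(root.key, inorder(root))
```

Deleting 90 removes a leaf, deleting 79 then promotes its only child 55, and deleting the root 45 replaces it with its inorder successor 50. The in-order listing stays sorted after every step.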
Now let's understand how insertion is performed on a binary search tree.
Insertion in Binary Search tree
A new key in BST is always inserted at the leaf. To insert an element in BST, we have to start
searching from the root node; if the node to be inserted is less than the root node, then search
for an empty location in the left subtree. Else, search for the empty location in the right subtree
and insert the data. Insert in BST is similar to searching, as we always have to maintain the rule
that the left subtree is smaller than the root, and right subtree is larger than the root.
Now, let's see the process of inserting a node into BST using an example.

The complexity of the Binary Search tree

Let's see the time and space complexity of the Binary search tree. We will see the time
complexity for insertion, deletion, and searching operations in best case, average case, and worst
case.
1. Time Complexity
Operations    Best case time complexity    Average case time complexity    Worst case time complexity
Insertion     O(log n)                     O(log n)                        O(n)
Deletion      O(log n)                     O(log n)                        O(n)
Search        O(log n)                     O(log n)                        O(n)

Where 'n' is the number of nodes in the given tree.

2. Space Complexity
Operations    Space complexity
Insertion     O(n)
Deletion      O(n)
Search        O(n)
o The space complexity of all operations of Binary search tree is O(n).

Red Black trees:

A Red-Black tree is a binary search tree in which every node is colored either red or
black. It is a type of self-balancing binary search tree, with an efficient worst-case
running time of O(log n) for search, insertion and deletion.

Properties of Red Black Tree:

The Red-Black tree satisfies all the properties of binary search tree in addition to that it
satisfies following additional properties –
1. Root property: The root is black.
2. External property: Every leaf (Leaf is a NULL child of a node) is black in Red-Black
tree.
3. Internal property: The children of a red node are black. Hence possible parent of red
node is a black node.
4. Depth property: All the leaves have the same black depth.
5. Path property: Every simple path from root to descendant leaf node contains same
number of black nodes.
The result of all these above-mentioned properties is that the Red-Black tree is roughly
balanced.
Rules That Every Red-Black Tree Follows:
1. Every node has a color either red or black.
2. The root of the tree is always black.
3. There are no two adjacent red nodes (A red node cannot have a red parent or red child).
4. Every path from a node (including root) to any of its descendant NULL nodes has the
same number of black nodes.
5. Every leaf (i.e. NULL node) must be colored BLACK.
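The rules above can be verified mechanically on a given tree. The sketch below checks the coloring rules only and does not test the ordering of keys. The class RBNode, the color constants and the function names are assumptions made for this illustration.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Black nodes on every path from node down to a NULL leaf (leaf included).

    Raises ValueError if any coloring rule is broken below node.
    """
    if node is None:
        return 1                                   # rule 5: NULL leaves are black
    if node.color not in (RED, BLACK):
        raise ValueError("rule 1: node is neither red nor black")
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("rule 3: red node with a red child")
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("rule 4: paths with different black counts")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    """True if the tree rooted at root satisfies rules 1 to 5."""
    if root is not None and root.color != BLACK:
        return False                               # rule 2: the root is black
    try:
        black_height(root)
    except ValueError:
        return False
    return True


valid = RBNode(10, BLACK, RBNode(5, RED), RBNode(15, RED))
lopsided = RBNode(10, BLACK, RBNode(5, BLACK))     # left path has one extra black node
print(is_red_black(valid), is_red_black(lopsided))
```

Because the checker returns the black count of each subtree, rule 4 is verified in a single bottom-up pass instead of enumerating every root-to-leaf path.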
Most of the BST operations (e.g., search, max, min, insert, delete.. etc) take O(h) time where
h is the height of the BST. The cost of these operations may become O(n) for a skewed
Binary tree. If we make sure that the height of the tree remains O(log n) after every insertion
and deletion, then we can guarantee an upper bound of O(log n) for all these operations. The
height of a Red-Black tree is always O(log n) where n is the number of nodes in the tree.
Sr. No.    Algorithm    Time Complexity
1.         Search       O(log n)
2.         Insert       O(log n)
3.         Delete       O(log n)

"n" is the total number of elements in the red-black tree.

Interesting points about Red-Black Tree:
1. The black height of the red-black tree is the number of black nodes on a path from the
root node to a leaf node. Leaf nodes are also counted as black nodes. So, a red-black tree
of height h has black height >= h/2.
2. Height of a red-black tree with n nodes is h <= 2 log2(n + 1).
3. All leaves (NIL) are black.
4. The black depth of a node is defined as the number of black nodes from the root to that
node, i.e., the number of black ancestors.
5. Every red-black tree is a special case of a binary tree.
Black Height of a Red-Black Tree:
Black height is the number of black nodes on a path from the root to a leaf. Leaf nodes are
also counted as black nodes. From rules 3 and 4 above, we can derive that a Red-Black
tree of height h has black-height >= h/2.
Every Red-Black tree with n nodes has height <= 2 log2(n+1).
This can be proved using the following facts:
1. For a general binary tree, let k be the minimum number of nodes on all root to NULL
paths, then n >= 2^k - 1 (e.g., if k is 3, then n is at least 7). This expression can also be
written as k <= log2(n+1).
2. From rule 4 of Red-Black trees and the above claim, we can say that in a Red-Black tree
with n nodes, every root to leaf path has at most log2(n+1) black nodes.
3. From rule 3 of Red-Black trees (a red node cannot have a red child) and the black root,
at least half of the nodes on any root to leaf path are black. Hence no root to leaf path
has more than 2 log2(n+1) nodes.
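The height bound can be written as a single chain of inequalities. The block below is a sketch of that derivation; the symbols b (black nodes on a root to NULL path) and k (nodes on the shortest root to NULL path) are introduced here only for illustration, and h counts the nodes on the longest root to leaf path.

```latex
% b = number of black nodes on any root-to-NULL path (the same for all paths, rule 4)
% k = number of nodes on the shortest root-to-NULL path
% h = number of nodes on the longest root-to-leaf path
\begin{align*}
  n &\ge 2^{k} - 1 \;\Longrightarrow\; k \le \log_2(n+1)
      && \text{(the top $k$ levels are completely filled)} \\
  b &\le k
      && \text{(the shortest path already contains $b$ black nodes)} \\
  h &\le 2b
      && \text{(black root, and no two consecutive red nodes)} \\
  h &\le 2b \le 2k \le 2\log_2(n+1)
\end{align*}
```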

Insertion in Red Black tree

The following are some rules used to create the Red-Black tree:
1. If the tree is empty, then we create a new node as a root node with the color black.
2. If the tree is not empty, then we create a new node as a leaf node with a color red.
3. If the parent of a new node is black, then exit.
4. If the parent of a new node is Red, then we have to check the color of the parent's sibling
of the new node.
4a) If the color is Black (or the parent's sibling does not exist), then we perform rotations
and recoloring.

4b) If the color is Red, then we recolor the parent and the parent's sibling to Black. We also
check whether the parent's parent of the new node is the root node; if it is not the root node,
we recolor it to Red and recheck it for a Red-red conflict.

Let's understand the insertion in the Red-Black tree by inserting the following keys in order:

10, 18, 7, 15, 16, 30, 25, 40, 60

Step 1: Initially, the tree is empty, so we create a new node having value 10. This is the first
node of the tree, so it would be the root node of the tree. As we already discussed, that root node
must be black in color, which is shown below:

Step 2: The next node is 18. As 18 is greater than 10 so it will come at the right of 10 as shown
below.

We know the second rule of the Red Black tree that if the tree is not empty then the newly
created node will have the Red color. Therefore, node 18 has a Red color, as shown in the below
figure:

Now we verify the third rule of the Red-Black tree, i.e., the parent of the new node is black or
not. In the above figure, the parent of the node is black in color; therefore, it is a Red-Black tree.

Step 3: Now, we create the new node having value 7 with Red color. As 7 is less than 10, so it
will come at the left of 10 as shown below.
Now we verify the third rule of the Red-Black tree, i.e., the parent of the new node is black or
not. As we can observe, the parent of the node 7 is black in color, and it obeys the Red-Black
tree's properties.

Step 4: The next element is 15, and 15 is greater than 10, but less than 18, so the new node will
be created at the left of node 18. The node 15 would be Red in color as the tree is not empty.

The above tree violates the property of the Red-Black tree as it has a Red-red parent-child
relationship. Now we have to apply a rule to restore the Red-Black tree. Rule 4 says that if
the new node's parent is Red, then we have to check the color of the parent's sibling of the new
node. The new node is node 15; the parent of the new node is node 18 and the sibling of the
parent node is node 7. As the color of the parent's sibling is Red, we apply rule 4b.
Rule 4b says that we have to recolor both the parent and the parent's sibling node. So, both
the nodes, i.e., 7 and 18, would be recolored to Black as shown in the below figure.

We also have to check whether the parent's parent of the new node is the root node or not. As
we can observe in the above figure, the parent's parent of a new node is the root node, so we do
not need to recolor it.

Step 5: The next element is 16. As 16 is greater than 10 but less than 18 and greater than 15, so
node 16 will come at the right of node 15. The tree is not empty; node 16 would be Red in color,
as shown in the below figure:
In the above figure, we can observe that it violates the property of the parent-child relationship
as it has a red-red parent-child relationship. We have to apply some rules to make a Red-Black
tree. Since the new node's parent is Red color, and the parent of the new node has no sibling, so
rule 4a will be applied. The rule 4a says that some rotations and recoloring would be performed
on the tree.

Node 16 is the right child of node 15, and node 15 is the left child of its parent, node 18.
Here we have an LR relationship, so we are required to perform two rotations. First, we
will perform the left rotation, and then the right rotation. The left rotation would be
performed on nodes 15 and 16, where node 16 will move upward, and node 15 will move
downward. Once the left rotation is performed, the tree looks as shown in the below figure:

In the above figure, we can observe that there is an LL relationship. The above tree has a Red-
red conflict, so we perform the right rotation. When we perform the right rotation, the median
element becomes the root of the rotated subtree. Once the right rotation is performed, node 16
would become the root of this subtree (the right child of node 10), and nodes 15 and 18 would
be its left child and right child, respectively, as shown in the below figure.
After rotation, node 16 and node 18 would be recolored; the color of node 16 is red, so it will
change to black, and the color of node 18 is black, so it will change to red as shown in
the below figure:

Step 6: The next element is 30. Node 30 is inserted at the right of node 18. As the tree is not
empty, so the color of node 30 would be red.

The color of the parent and parent's sibling of a new node is Red, so rule 4b is applied. In rule
4b, we have to do only recoloring, i.e., no rotations are required. The color of both the parent
(node 18) and parent's sibling (node 15) would become black, as shown in the below image.
We also have to check whether the parent's parent of the new node is the root node or not. The
parent's parent of the new node (node 30) is node 16, and node 16 is not the root node, so we
recolor node 16 to Red. The parent of node 16 is node 10, and it
is not Red in color, so there is no Red-red conflict.

Step 7: The next element is 25, which we have to insert in the tree. Since 25 is greater than 10,
16, 18 but less than 30, it will come at the left of node 30. As the tree is not empty, node 25
would be Red in color. Here a Red-red conflict occurs as the parent of the newly created node
is Red in color.

Since there is no parent's sibling, rule 4a is applied, in which rotation as well as recoloring
are performed. First, we will perform rotations. As the newly created node is at the left of its
parent and the parent node is at the right of its parent, an RL relationship is formed. Firstly,
the right rotation is performed, in which node 25 goes upwards, whereas node 30 goes
downwards, as shown in the below figure.
After the first rotation, there is an RR relationship, so the left rotation is performed. After
the left rotation, the median element, i.e., 25, would be the root of this subtree; node 30 would
be at the right of node 25 and node 18 would be at the left of node 25.

Now recoloring would be performed on nodes 25 and 18; node 25 becomes black in color, and
node 18 becomes red in color.
Step 8: The next element is 40. Since 40 is greater than 10, 16, 18, 25, and 30, so node 40 will
come at the right of node 30. As the tree is not empty, node 40 would be Red in color. There is
a Red-red conflict between nodes 40 and 30, so rule 4b will be applied.

As the color of parent and parent's sibling node of a new node is Red so recoloring would be
performed. The color of both the nodes would become black, as shown in the below image.

After recoloring, we also have to check the parent's parent of a new node, i.e., 25, which is not
a root node, so recoloring would be performed, and the color of node 25 changes to Red.

After recoloring, red-red conflict occurs between nodes 25 and 16. Now node 25 would be
considered as the new node. Since the parent of node 25 is red in color, and the parent's sibling
is black in color, rule 4a would be applied. Since 25 is at the right of the node 16 and 16 is at
the right of its parent, so there is an RR relationship. In the RR relationship, left rotation is
performed. After left rotation, the median element 16 would be the root node, as shown in the
below figure.

After rotation, recoloring is performed on nodes 16 and 10. The color of node 10 and node 16
changes to Red and Black, respectively as shown in the below figure.
Step 9: The next element is 60. Since 60 is greater than 16, 25, 30, and 40, so node 60 will come
at the right of node 40. As the tree is not empty, the color of node 60 would be Red.

As we can observe in the above tree, a Red-red conflict occurs. The parent node is
Red in color, and no parent's sibling exists in the tree, so rule 4a would be applied.
First, the rotation would be performed. The RR relationship exists between the nodes, so left
rotation would be performed.

When left rotation is performed, node 40 will come upwards, and node 30 will come
downwards, as shown in the below figure:
After rotation, the recoloring is performed on nodes 30 and 40. The color of node 30 would
become Red, while the color of node 40 would become black.

The above tree is a Red-Black tree as it follows all the Red-Black tree properties.
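The nine insertion steps above can be replayed in code. The following is a minimal Python sketch, assuming NULL leaves are represented by `None` (treated as black) and that duplicate keys go to the right; the class and method names (`RedBlackTree`, `_rotate`, `_fix`, `preorder`) are illustrative, not taken from the notes. The comments refer to rules 4a and 4b listed at the start of this section.

```python
RED, BLACK = "R", "B"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED  # rule 2: a new node is created red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, left):
        """Move x down to the left (left=True) or right; its child y moves up."""
        y = x.right if left else x.left
        inner = y.left if left else y.right  # subtree that switches parent
        if left:
            x.right, y.left = inner, x
        else:
            x.left, y.right = inner, x
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x.parent.left is x:
            x.parent.left = y
        else:
            x.parent.right = y
        x.parent = y

    def insert(self, key):
        node, parent, cur = Node(key), None, self.root
        while cur is not None:  # ordinary BST descent
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix(node)

    def _fix(self, z):
        while z.parent is not None and z.parent.color == RED:
            p, g = z.parent, z.parent.parent
            parent_is_left = g.left is p
            uncle = g.right if parent_is_left else g.left  # the parent's sibling
            if uncle is not None and uncle.color == RED:
                # rule 4b: recolor, then recheck from the grandparent
                p.color = uncle.color = BLACK
                g.color = RED
                z = g
            else:
                # rule 4a: rotate and recolor
                if (p.left is z) != parent_is_left:  # LR or RL: straighten first
                    self._rotate(p, left=parent_is_left)
                    z, p = p, z
                self._rotate(g, left=not parent_is_left)  # LL or RR rotation
                p.color, g.color = BLACK, RED
        self.root.color = BLACK  # the root is always black


def preorder(node):
    if node is None:
        return []
    return [(node.key, node.color)] + preorder(node.left) + preorder(node.right)


tree = RedBlackTree()
for key in [10, 18, 7, 15, 16, 30, 25, 40, 60]:
    tree.insert(key)
print(preorder(tree.root))
# [(16, 'B'), (10, 'R'), (7, 'B'), (15, 'B'), (25, 'R'), (18, 'B'), (40, 'B'), (30, 'R'), (60, 'R')]
```

The preorder listing matches the tree reached in Step 9: node 16 is the black root, nodes 10 and 25 are red, and node 40 is black with red children 30 and 60.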

Deletion in Red-Black tree

Let's understand how we can delete the particular node from the Red-Black tree. The following
are the rules used to delete the particular node from the tree:
Step 1: First, we perform BST rules for the deletion.
Step 2:
Case 1: if the node is Red, which is to be deleted, we simply delete it.
Let's understand case 1 through an example.
Suppose we want to delete node 30 from the tree, which is given below.

Initially, we have the address of the root node. First, we apply the BST search to locate the node.
Since 30 is greater than 10 and 20, 30 is the right child of node 20. Node 30
is a leaf node and Red in color, so it is simply deleted from the tree.
If we want to delete an internal node that has one child, we first replace the value of the internal
node with the value of the child node and then simply delete the child node.

Let's take another example in which we want to delete the internal node, i.e., node 20.

We cannot delete the internal node directly; we can only replace the value of that node with
another value. Node 20 is at the right of the root node, and it has only one child, node 30. So,
the value 20 is replaced with the value 30, but the color of the node remains the same, i.e.,
Black. In the end, the leaf node (which originally held 30) is deleted from the tree.

If we want to delete an internal node that has two child nodes, we have to decide
from which subtree we take the replacement value (either the left subtree or the right subtree).
We have two ways:
o Inorder predecessor: We will replace with the largest value that exists in the left
subtree.
o Inorder successor: We will replace with the smallest value that exists in the right
subtree.
Suppose we want to delete node 30 from the tree, which is shown below:

Node 30 is at the right of the root node. In this case, we will use the inorder successor. The
value 38 is the smallest value in the right subtree, so we will replace the value 30 with 38, but
the color of the node would remain the same, i.e., Red. After replacement, the leaf node from
which 38 was taken would be deleted from the tree. Since this leaf node is Red in color, we
simply delete it (we do not have to perform any rotations or any recoloring).

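Case 2: When a Black node is deleted and replaced by a nil node, that nil node becomes double
black (DB), i.e., it carries one extra black. If the root node is double black, then simply
remove the double black and make it a single black.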

Case 3: If the double black's sibling is black and both its children are black.
o Remove the double black node.
o Add the color of the node to the parent (P) node.
1. If the color of P is red then it becomes black.
2. If the color of P is black, then it becomes double black.
o The color of double black's sibling changes to red.
o If still double black situation arises, then we will apply other cases.

Case 4: If double black's sibling is Red.

o Swap the color of its parent and its sibling.
o Rotate the parent node in the double black's direction.
o Reapply cases.

Let's understand this case through an example.

Suppose we want to delete node 15.

Initially, the 15 is replaced with a nil value. After replacement, the node becomes double black.
Since double black's sibling is Red so color of the node 20 changes to Red and the color of the
node 30 changes to Black.

Once the swapping of the color is completed, the rotation towards the double black would be
performed. The node 30 will move upwards and the node 20 will move downwards as shown
in the below figure.

In the above tree, we can observe that double black situation still exists in the tree. It satisfies
the case 3 in which double black's sibling is black as well as both its children are black. First,
we remove the double black from the node and add the black color to its parent node. At the
end, the color of the double black's sibling, i.e., node 25 changes to Red as shown in the below
figure.
In the above tree, we can observe that the double black situation has been resolved. It also
satisfies the properties of the Red Black tree.

Case 5: If double black's sibling is black, sibling's child who is far from the double black is
black, but near child to double black is red.
o Swap the color of double black's sibling and the sibling child which is nearer to the
double black node.
o Rotate the sibling in the opposite direction of the double black.
o Apply case 6

Suppose we want to delete the node 1 in the below tree.

First, we replace the value 1 with the nil value. The node becomes double black as both the
nodes, i.e., 1 and nil, are black. It satisfies case 3, which applies when the double black's (DB's)
sibling is black and both of the sibling's children are black. First, we remove the double black
of the nil node. Since the parent of DB is Black, when the black color is added to the parent
node, it becomes double black. After adding the color, the double black's sibling changes to
Red as shown below.

We can observe in the above figure that the double black problem still exists in the tree.
So, we will reapply the cases. We will apply case 5 because the sibling of node 5 is node 30,
which is black in color, the child of node 30, which is far from node 5 is black, and the child of
the node 30 which is near to node 5 is Red. In this case, first we will swap the color of node 30
and node 25 so the color of node 30 changes to Red and the color of node 25 changes to Black
as shown below.
Once the swapping of the color between the nodes is completed, we need to rotate the sibling
in the opposite direction of the double black node. In this rotation, the node 30 moves
downwards while the node 25 moves upwards as shown below.

As we can observe in the above tree, the double black situation still exists. So, we need to
apply case 6. Let's first see what case 6 is.

Case 6: If double black's sibling is black, far child is Red

o Swap the color of Parent and its sibling node.
o Rotate the parent towards the Double black's direction
o Remove Double black
o Change the Red color to black.

Now we will apply case 6 in the above example to solve the double black's situation.

In the above example, the double black is node 5, and the sibling of node 5 is node 25, which is
black in color. The far child of the sibling (the child farther from the double black node) is
node 30, which is Red in color as shown in the below figure:
First, we will swap the colors of the parent and the sibling. The parent of node 5 is node 10, and
the sibling node is node 25. The colors of both the nodes are black, so no swapping would
occur.

In the second step, we need to rotate the parent in the double black's direction. After rotation,
node 25 will move upwards, whereas node 10 will move downwards. Once the rotation is
performed, the tree would look as shown in the below figure:

In the next step, we will remove double black from node 5 and node 5 will give its black color
to the far child, i.e., node 30. Therefore, the color of node 30 changes to black as shown in the
below figure.
B Tree:
B Tree is a specialized m-way tree that is widely used for disk access. A node in a B-Tree of
order m can have at most m-1 keys and m children. One of the main reasons for using a B tree
is its capability to store a large number of keys in a single node and large key values while
keeping the height of the tree relatively small.

A B tree of order m contains all the properties of an M way tree. In addition, it contains the
following properties.
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree except the root node and the leaf nodes contains at least ⌈m/2⌉
children.
3. The root node must have at least 2 children, unless it is a leaf.
4. All leaf nodes must be at the same level.

It is not necessary that all the nodes contain the same number of children, but each internal
node other than the root must have at least ⌈m/2⌉ children.

A B tree of order 4 is shown in the following image.

Operations
Searching :

Searching in B Trees is similar to that in a Binary search tree. For example, suppose we search
for an item 49 in the following B Tree. The process will be something like the following:
1. Compare item 49 with root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. 49 > 45, so move to the right. Compare 49.
4. Match found, return.

Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n)
time to search any element in a B tree.
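The search steps above can be sketched in Python. The figure is not reproduced in these notes, so the tree below is an assumption: only the keys 78, 40, 56, 45 and 49 come from the walkthrough, and every other key, as well as the names `BTreeNode` and `search`, is hypothetical.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys stored in this node
        self.children = children or []   # empty list for a leaf


def search(node, key):
    """Return (found, path), where path lists the keys of each visited node."""
    path = []
    while node is not None:
        path.append(node.keys)
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, path
        # descend into the child between keys[i-1] and keys[i]; stop at a leaf
        node = node.children[i] if node.children else None
    return False, path


# Hypothetical B tree of order 4 (at most 3 keys per node, all leaves on one level).
left = BTreeNode([40, 56], [BTreeNode([20, 30]), BTreeNode([45, 49]), BTreeNode([60, 70])])
right = BTreeNode([90], [BTreeNode([80, 85]), BTreeNode([95, 99])])
root = BTreeNode([78], [left, right])

print(search(root, 49))  # (True, [[78], [40, 56], [45, 49]])
print(search(root, 50))  # (False, [[78], [40, 56], [45, 49]])
```

One node is visited per level, which is why the running time depends on the height of the tree.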

Insertion in B-Tree:

Insertions are done at the leaf node level. The following algorithm needs to be followed in order
to insert an item into a B Tree.
1. Traverse the B Tree in order to find the appropriate leaf node at which the key can be
inserted.
2. If the leaf node contains fewer than m-1 keys, then insert the element in increasing order.
3. Else, if the leaf node contains m-1 keys, then follow the following steps.
o Insert the new element in the increasing order of elements.
o Split the node into two nodes at the median.
o Push the median element up to its parent node.
o If the parent node also contains m-1 keys, then split it too by following
the same steps.

Example:

Insert the node 8 into the B Tree of order 5 shown in the following image.

8 will be inserted to the right of 5, therefore insert 8.

The node now contains 5 keys, which is greater than the maximum of (5 - 1 = 4) keys. Therefore,
split the node at the median, i.e. 8, and push it up to its parent node as shown below.
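The leaf-level part of this insertion can be sketched as follows. This is a minimal Python sketch that handles a single leaf only (propagating the median into the parent is left out), and the leaf contents `[2, 5, 10, 12]` are an assumption, since the figure is not reproduced here; only the inserted key 8, the order 5 and the median 8 come from the example. The function name `insert_into_leaf` is illustrative.

```python
from bisect import insort


def insert_into_leaf(keys, key, order):
    """Insert key into a sorted leaf of a B tree of the given order.

    Returns (left_keys, median, right_keys). When the leaf does not overflow,
    median and right_keys are None and left_keys is the updated leaf.
    """
    keys = list(keys)
    insort(keys, key)              # keep the keys in increasing order
    if len(keys) <= order - 1:     # at most m-1 keys: no split needed
        return keys, None, None
    mid = len(keys) // 2           # position of the median key
    return keys[:mid], keys[mid], keys[mid + 1:]


# Assumed leaf [2, 5, 10, 12] in a B tree of order 5; inserting 8 overflows it.
print(insert_into_leaf([2, 5, 10, 12], 8, order=5))  # ([2, 5], 8, [10, 12])
print(insert_into_leaf([2, 5], 8, order=5))          # ([2, 5, 8], None, None)
```

The returned median is the key that would be pushed up to the parent node; if the parent then also holds more than m-1 keys, the same split would be applied to it.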

Deletion in B-Tree:

Deletion is also performed at the leaf nodes. The key which is to be deleted can either be in a
leaf node or in an internal node. The following algorithm needs to be followed in order to delete
a key from a B tree. Here the minimum number of keys in a node other than the root is ⌈m/2⌉ - 1.
1. Locate the leaf node.
2. If there are more than the minimum number of keys in the leaf node, then delete the
desired key from the node.
3. If the leaf node is left with fewer than the minimum number of keys, then complete the
keys by taking an element from the right or left sibling.
o If the left sibling contains more than the minimum number of keys, then push its
largest element up to its parent and move the intervening element down to the
node where the key is deleted.
o If the right sibling contains more than the minimum number of keys, then push its
smallest element up to the parent and move the intervening element down to the
node where the key is deleted.
4. If neither of the siblings contains more than the minimum number of keys, then create a
new leaf node by joining the two leaf nodes and the intervening element of the parent node.
5. If the parent is left with fewer than the minimum number of keys, then apply the above
process on the parent too.

If the key which is to be deleted is in an internal node, then replace it with its in-order
successor or predecessor. Since the successor or predecessor will always be in a leaf node,
the process will be similar to deleting the key from the leaf node.

Example 1

Delete the node 53 from the B Tree of order 5 shown in the following figure.

53 is present in the right child of element 49. Delete it.

Now, 57 is the only element left in the node, but the minimum number of elements that
must be present in a node of a B tree of order 5 is 2. The node has fewer than that, and its left
and right siblings do not have elements to spare either; therefore, merge it with the left sibling
and the intervening element of the parent, i.e. 49.
The final B tree is shown as follows.

Application of B tree

B tree is used to index the data and provides fast access to the actual data stored on the disks,
since accessing a value stored in a large database on a disk is a very time-consuming
process.

Searching an un-indexed and unsorted database containing n key values needs O(n) running
time in worst case. However, if we use B Tree to index this database, it will be searched in
O(log n) time in worst case.

Heap:
A Heap is a special Tree-based data structure in which the tree is a complete binary tree.
Operations of Heap Data Structure:
o Heapify: the process of creating a heap from an array.
o Insertion: the process of inserting an element into an existing heap, with time complexity
O(log N).
o Deletion: deleting the top element of the heap (the highest priority element), reorganizing
the heap, and returning the element, with time complexity O(log N).
o Peek: to check or find the first (top) element of the heap.

Types of Heap Data Structure

Generally, Heaps can be of two types:

1. Max-Heap: In a Max-Heap the key present at the root node must be the greatest among the
keys present at all of its children. The same property must be recursively true for all
sub-trees in that Binary Tree.
2. Min-Heap: In a Min-Heap the key present at the root node must be the minimum among the
keys present at all of its children. The same property must be recursively true for all
sub-trees in that Binary Tree.

Insertion in the Heap tree

44, 33, 77, 11, 55, 88, 66

Suppose we want to create the max heap tree from the elements above. To create the max heap
tree, we need to consider the following two cases:
o First, we have to insert the element in such a way that the property of the
complete binary tree is maintained.
o Secondly, the value of the parent node should be greater than either of its
children.

Step 1: First we add the 44 element in the tree as shown below:

Step 2: The next element is 33. As we know, insertion in a complete binary tree always
starts from the left side, so 33 will be added at the left of 44 as shown below:

Step 3: The next element is 77 and it will be added to the right of the 44 as shown
below:

As we can observe in the above tree that it does not satisfy the max heap property, i.e.,
parent node 44 is less than the child 77. So, we will swap these two values as shown
below:
Step 4: The next element is 11. The node 11 is added to the left of 33 as shown below:

Step 5: The next element is 55. To make it a complete binary tree, we will add the node
55 to the right of 33 as shown below:

As we can observe in the above figure that it does not satisfy the property of the max
heap because 33<55, so we will swap these two values as shown below:

Step 6: The next element is 88. The left subtree is completed so we will add 88 to the
left of 44 as shown below:

As we can observe in the above figure that it does not satisfy the property of the max
heap because 44<88, so we will swap these two values as shown below:
Again, it is violating the max heap property because 88>77 so we will swap these two
values as shown below:

Step 7: The next element is 66. To make a complete binary tree, we will add the 66
element to the right side of 77 as shown below:

In the above figure, we can observe that the tree satisfies the property of max heap;
therefore, it is a heap tree.
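The seven insertion steps above can be reproduced with an array-based max heap. This is a minimal Python sketch, assuming the usual array layout in which the parent of index i is at index (i - 1) // 2; the function name `heap_insert` is illustrative.

```python
def heap_insert(heap, value):
    """Append value at the next free position, then sift it up."""
    heap.append(value)                # keeps the tree a complete binary tree
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:   # max heap property holds, stop
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


heap = []
for value in [44, 33, 77, 11, 55, 88, 66]:
    heap_insert(heap, value)
print(heap)  # [88, 55, 77, 11, 33, 44, 66]
```

Read level by level, the array gives 88 at the root, 55 and 77 as its children, and 11, 33, 44, 66 on the last level, which is the tree reached in Step 7.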

Deletion in Heap Tree

In Deletion in the heap tree, the root node is always deleted and it is replaced with the
last element.

Let's understand the deletion through an example.

Step 1: In the above tree, the root node 30 is first deleted from the tree and it is replaced
with the last element, 15, as shown below:

Now we will heapify the tree. We check whether 15 is greater than both of its
children. 15 is less than 20, so we swap these two values as shown below:

Again, we compare 15 with its child. Since 15 is greater than 10, no swapping will
occur.
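Deletion can be sketched in the same array layout. The figure is not reproduced in these notes, so the heap `[30, 20, 12, 10, 15]` below is an assumption chosen to be consistent with the walkthrough (root 30, last element 15, children 20 and 10 along the sift-down path); the key 12 and the function name `heap_extract_max` are hypothetical.

```python
def heap_extract_max(heap):
    """Remove and return the root of a non-empty max heap stored in a list."""
    top = heap[0]
    last = heap.pop()                 # remove the last element of the tree
    if heap:
        heap[0] = last                # the last element replaces the root
        i, n = 0, len(heap)
        while True:                   # sift down: swap with the larger child
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and heap[child] > heap[largest]:
                    largest = child
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top


heap = [30, 20, 12, 10, 15]           # assumed example heap
print(heap_extract_max(heap))         # 30
print(heap)                           # [20, 15, 12, 10]
```

After 15 replaces the root it is swapped with 20, and it then stays above 10, as in the two comparisons described above.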

Disjoint set
The disjoint set data structure is also known as the union-find data structure and merge-find
set. It is a data structure that contains a collection of disjoint or non-overlapping
sets. Disjoint sets arise when a set is partitioned into disjoint subsets.
Various operations can be performed on the disjoint subsets: we can
add new sets, we can merge sets, and we can find the representative member
of a set. It also allows us to find out efficiently whether two elements are in the same
set or not.

Disjoint sets can be defined as subsets that have no element in common.
Let's understand disjoint sets through an example.
s1 = {1, 2, 3, 4}

s2 = {5, 6, 7, 8}

When the union operation is applied, the set would be represented as:

s1 ∪ s2 = {1, 2, 3, 4, 5, 6, 7, 8}

Suppose we add one more edge between 1 and 5. Now the final set can be represented
as:

s3 = {1, 2, 3, 4, 5, 6, 7, 8}

How can we detect a cycle in a graph?

We will understand this concept through an example. Consider the below example to
detect a cycle with the help of using disjoint sets.
U = {1, 2, 3, 4, 5, 6, 7, 8}

Each vertex is labelled with some weight. There is a universal set with 8 vertices. We will
consider each edge one by one and form the sets.

First, we consider vertices 1 and 2. Both belong to the universal set; we perform the
union operation between elements 1 and 2. We will add the elements 1 and 2 in a set
s1 and remove these two elements from the universal set shown below:

s1 = {1, 2}

The vertices that we consider now are 3 and 4. Both the vertices belong to the universal
set; we perform the union operation between elements 3 and 4. We will form the set
s2 having elements 3 and 4 and remove the elements from the universal set as shown
below:

s2 = {3, 4}
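The edge-by-edge procedure can be sketched with a parent map. This is a minimal Python sketch of cycle detection in an undirected graph using union-find; the edge list is an assumption, since only the first two edges, (1, 2) and (3, 4), appear in the text, and the names `find` and `first_cycle_edge` are illustrative.

```python
def find(parent, x):
    """Return the representative of the set containing x (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def first_cycle_edge(vertices, edges):
    """Return the first edge whose endpoints are already in the same set, else None."""
    parent = {v: v for v in vertices}     # every vertex starts in its own set
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru == rv:                      # same set: this edge closes a cycle
            return (u, v)
        parent[rv] = ru                   # union of the two sets
    return None


vertices = range(1, 9)                    # U = {1, ..., 8}
# Assumed edge order; only (1, 2) and (3, 4) are given in the text.
edges = [(1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (2, 5), (1, 3), (6, 8), (5, 7)]
print(first_cycle_edge(vertices, edges))  # (1, 3)
```

With this edge order, vertices 1 and 3 are already in the same set when the edge (1, 3) is considered, so that edge is reported as closing a cycle.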

Fibonacci Heap
Heaps are an abstract data type defined by an ordering relationship between parents and
children. Heaps are categorized into Min-Heap and Max-Heap:
1. A min-heap is a tree in which, for all the nodes, the key value of the parent must be
smaller than the key value of the children.
2. A max-heap is a tree in which, for all the nodes, the key value of the parent must be
greater than the key value of the children.
Define Fibonacci Heap:

Fibonacci Heap - A Fibonacci heap is defined as a collection of rooted trees in which all the
trees must hold the property of Min-heap. That is, for all the nodes, the key value of the parent
node should be smaller than or equal to the key value of its child nodes.

Structure
A Fibonacci heap is a collection of trees satisfying the minimum-heap property, that is, the key
of a child is always greater than or equal to the key of the parent. This implies that the minimum
key is always at the root of one of the trees.

Implementation of operations
Operation find minimum is now trivial because we keep the pointer to the node containing it.
It does not change the potential of the heap, therefore both actual and amortized cost are
constant.
As mentioned above, merge is implemented simply by concatenating the lists of tree roots of
the two heaps. This can be done in constant time and the potential does not change, leading
again to constant amortized time.
Operation insert works by creating a new heap with one element and doing merge. This takes
constant time, and the potential increases by one, because the number of trees increases. The
amortized cost is thus still constant.
Operation extract minimum (same as delete minimum) operates in three phases. First we take
the root containing the minimum element and remove it. Its children will become roots of new
trees. If the number of children was d, it takes time O(d) to process all new roots and the
potential increases by d−1. Therefore, the amortized running time of this phase is O(d)
= O(log n). In the second phase, roots of equal degree are linked together until every root has
a distinct degree, and in the third phase the remaining roots are scanned to find the new
minimum.
Operation decrease key will take the node, decrease the key and if the heap property becomes
violated (the new key is smaller than the key of the parent), the node is cut from its parent. If
the parent is not a root, it is marked. If it has been marked already, it is cut as well and its
parent is marked. We continue upwards until we reach either the root or an unmarked node.
Now we set the minimum pointer to the decreased value if it is the new minimum. In the
process we create some number, say k, of new trees. Each of these new trees except possibly
the first one was marked originally but as a root it will become unmarked. One node can
become marked. Therefore, the number of marked nodes changes by −(k − 1) + 1 = − k + 2.
Combining these 2 changes, the potential changes by 2(−k + 2) + k = −k + 4. The actual time
to perform the cutting was O(k), therefore (again with a sufficiently large choice of c) the
amortized running time is constant.
Finally, operation delete can be implemented simply by decreasing the key of the element to
be deleted to minus infinity, thus turning it into the minimum of the whole heap. Then we call
extract minimum to remove it. The amortized running time of this operation is O(log n).

Proof of degree bounds

The amortized performance of a Fibonacci heap depends on the degree (number of children)
of any tree root being O(log n), where n is the size of the heap. Here we show that the
(sub)tree rooted at any node x of degree d in the heap must have size at least $F_{d+2}$,
where $F_k$ is the kth Fibonacci number. The degree bound follows from this and the fact (easily
proved by induction) that $F_{d+2} \ge \varphi^d$ for all integers $d \ge 0$, where
$\varphi = (1 + \sqrt{5})/2 \approx 1.618$. (We then have $n \ge F_{d+2} \ge \varphi^d$, and
taking the log to base $\varphi$ of both sides gives $d \le \log_\varphi n$ as required.)

Consider any node x somewhere in the heap (x need not be the root of one of the main trees).
Define size(x) to be the size of the tree rooted at x (the number of descendants of x,
including x itself). We prove by induction on the height of x (the length of a longest simple
path from x to a descendant leaf), that size(x) ≥ $F_{d+2}$, where d is the degree of x.
Base case: If x has height 0, then d = 0, and size(x) = 1 = $F_2$.
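The two facts used in the degree bound can be checked numerically. The following is a small Python sketch, not part of the proof: it checks that the Fibonacci number with index d+2 is at least the golden ratio raised to the power d for small d, and computes the largest degree d allowed for a heap of n nodes. The function names `fib` and `max_degree` are illustrative, and the Fibonacci numbers are indexed so that fib(1) = fib(2) = 1.

```python
import math

PHI = (1 + math.sqrt(5)) / 2          # golden ratio, about 1.618


def fib(k):
    """k-th Fibonacci number with fib(0) = 0, fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def max_degree(n):
    """Largest d with fib(d + 2) <= n, i.e. the largest degree a node can
    have in a heap of n >= 1 nodes if size(x) >= fib(d + 2)."""
    d = 0
    while fib(d + 3) <= n:
        d += 1
    return d


# fib(d + 2) >= PHI ** d for small d (checked here, proved by induction in general)
print(all(fib(d + 2) >= PHI ** d for d in range(40)))   # True

n = 1000
print(max_degree(n), math.log(n, PHI))  # 14, and a logarithm of about 14.35
```

For n = 1000 the largest possible degree is 14, which is below the logarithm of n to base φ (about 14.35), in line with the O(log n) bound.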
Properties of Fibonacci Heap:

1. It can have multiple trees of equal degrees, and each tree doesn't need to have 2^k nodes.
2. All the trees in the Fibonacci Heap are rooted but not ordered.
3. All the roots and siblings are stored in a separate circular doubly linked list.
4. The degree of a node is the number of its children. Node X -> degree = Number of X's
children.
5. Each node has a mark-attribute in which it is marked TRUE or FALSE. FALSE
indicates that the node has not lost any of its children. TRUE represents that the node has
lost one child. A newly created node is marked FALSE.
6. The potential function of the Fibonacci heap is F(FH) = t[FH] + 2 * m[FH]
7. The Fibonacci Heap (FH) has some important technicalities listed below:
1. min[FH] - Pointer points to the minimum node in the Fibonacci Heap
2. n[FH] - Determines the number of nodes
3. t[FH] - Determines the number of rooted trees
4. m[FH] - Determines the number of marked nodes
5. F(FH) - Potential Function.
UNIT III – GRAPHS

Elementary Graph Algorithms: Representations of Graphs – Breadth-First Search – Depth-First
Search – Topological Sort – Strongly Connected Components- Minimum Spanning Trees:
Growing a Minimum Spanning Tree – Kruskal and Prim- Single-Source Shortest Paths: The
Bellman-Ford algorithm – Single-Source Shortest paths in Directed Acyclic Graphs –
Dijkstra’s Algorithm; Dynamic Programming - All-Pairs Shortest Paths: Shortest Paths and
Matrix Multiplication – The Floyd-Warshall Algorithm
Elementary Graph Algorithms:
This section presents methods for representing a graph and for searching a graph. Searching a graph means
systematically following the edges of the graph so as to visit the vertices of the graph. A
graph-searching algorithm can discover much about the structure of a graph. Many
algorithms begin by searching their input graph to obtain this structural information. Other
graph algorithms are organized as simple elaborations of basic graph-searching algorithms.
Techniques for searching a graph are at the heart of the field of graph algorithms.
Representations of graphs
There are two standard ways to represent a graph G = (V, E): as a collection of adjacency
lists or as an adjacency matrix.
The adjacency-list representation of a graph G = (V, E) consists of an array Adj of |V| lists,
one for each vertex in V. For each u ∈ V, the adjacency list Adj[u] contains (pointers to) all
the vertices v such that there is an edge (u,v) ∈ E. That is, Adj[u] consists of all the vertices
adjacent to u in G.

Figure 23.1 Two representations of an undirected graph. (a) An undirected graph
G having five vertices and seven edges. (b) An adjacency-list representation of G.
(c) The adjacency-matrix representation of G.
Figure 23.2 Two representations of a directed graph. (a) A directed graph G
having six vertices and eight edges. (b) An adjacency-list representation of G. (c)
The adjacency-matrix representation of G.

Adjacency lists can readily be adapted to represent weighted graphs, that is, graphs for which
each edge has an associated weight, typically given by a weight function w : E → R. For
example, let G = (V, E) be a weighted graph with weight function w. The weight w(u,v) of the
edge (u,v) ∈ E is simply stored with vertex v in u's adjacency list. The adjacency-list
representation is quite robust in that it can be modified to support many other graph variants.

A potential disadvantage of the adjacency-list representation is that there is no quicker way to
determine if a given edge (u,v) is present in the graph than to search for v in the adjacency
list Adj[u]. This disadvantage can be remedied by an adjacency-matrix representation of the
graph, at the cost of using asymptotically more memory.

For the adjacency-matrix representation of a graph G = (V, E), we assume that the vertices are
numbered 1, 2, . . . , |V| in some arbitrary manner. The adjacency-matrix representation of a
graph G then consists of a |V| × |V| matrix A = (a_ij) such that

a_ij = 1 if (i, j) ∈ E,
a_ij = 0 otherwise.
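The two representations can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and vertices are numbered from 0 rather than from 1 to match Python indexing.

```python
def build_adjacency_list(num_vertices, edges, directed=False):
    """Return adj, where adj[u] lists the vertices adjacent to u."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)  # an undirected edge appears in both lists
    return adj


def build_adjacency_matrix(num_vertices, edges, directed=False):
    """Return a, where a[i][j] is 1 if edge (i, j) exists and 0 otherwise."""
    a = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        a[u][v] = 1
        if not directed:
            a[v][u] = 1  # the matrix of an undirected graph is symmetric
    return a


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = build_adjacency_list(4, edges)
mat = build_adjacency_matrix(4, edges)
print(adj)        # edge lookup means scanning adj[u]
print(mat[2][3])  # edge lookup is a single array access
```

The list form uses memory proportional to |V| + |E|, while the matrix always uses |V| × |V| entries, which is the trade-off described above.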

Breadth-first search
Breadth-first search is one of the simplest algorithms for searching a graph and the archetype
for many important graph algorithms. Dijkstra's single-source shortest-paths algorithm
(Chapter 25) and Prim's minimum-spanning-tree algorithm (Section 24.2) use ideas similar
to those in breadth-first search.
BFS(G,s)
1 for each vertex u ∈ V[G] - {s}
2     do color[u] ← WHITE
3        d[u] ← ∞
4        π[u] ← NIL
5 color[s] ← GRAY
6 d[s] ← 0
7 π[s] ← NIL
8 Q ← {s}
9 while Q ≠ ∅
10     do u ← head[Q]
11        for each v ∈ Adj[u]
12            do if color[v] = WHITE
13                  then color[v] ← GRAY
14                       d[v] ← d[u] + 1
15                       π[v] ← u
16                       ENQUEUE(Q,v)
17        DEQUEUE(Q)
18        color[u] ← BLACK

Figure 23.3 illustrates the progress of BFS on a sample graph.

The procedure BFS works as follows. Lines 1-4 paint every vertex white, set d[u]
to be infinity for every vertex u, and set the parent of every vertex to be NIL. Line 5
paints the source vertex s gray, since it is considered to be discovered when the
procedure begins. Line 6 initializes d[s] to 0, and line 7 sets the predecessor of the
source to be NIL. Line 8 initializes Q to the queue containing just the vertex s;
thereafter, Q always contains the set of gray vertices.

The main loop of the program is contained in lines 9-18. The loop iterates as long
as there remain gray vertices, which are discovered vertices that have not yet had
their adjacency lists fully examined. Line 10 determines the gray vertex u at the
head of the queue Q. The for loop of lines 11-16 considers each vertex v in the
adjacency list of u. If v is white, then it has not yet been discovered, and the
algorithm discovers it by executing lines 13-16. It is first grayed, and its
distance d[v] is set to d[u] + 1. Then, u is recorded as its parent. Finally, it is placed
at the tail of the queue Q. When all the vertices on u's adjacency list have been
examined, u is removed from Q and blackened in lines 17-18.

Figure 23.3 The operation of BFS on an undirected graph.

Analysis

Before proving all the various properties of breadth-first search, we take on the somewhat
easier job of analyzing its running time on an input graph G = (V,E). After initialization, no
vertex is ever whitened, and thus the test in line 12 ensures that each vertex is enqueued at most
once, and hence dequeued at most once. The operations of enqueuing and dequeuing take O(1)
time, so the total time devoted to queue operations is O(V). Because the adjacency list of each
vertex is scanned only when the vertex is dequeued, the adjacency list of each vertex is scanned
at most once. Since the sum of the lengths of all the adjacency lists is Θ(E), at most O(E) time
is spent in total scanning adjacency lists. The overhead for initialization is O(V), and thus the
total running time of BFS is O(V + E). Thus, breadth-first search runs in time linear in the size
of the adjacency-list representation of G.
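A compact Python version of the procedure may help make the O(V + E) behaviour concrete. This is a sketch under a few assumptions: the graph is given as an adjacency list indexed from 0, the function name bfs is hypothetical, and instead of three colours it uses a distance of None to mean "not yet discovered" (white).

```python
from collections import deque


def bfs(adj, s):
    """Return (dist, parent) for a breadth-first search of adj from s."""
    n = len(adj)
    dist = [None] * n    # None plays the role of infinity / WHITE
    parent = [None] * n  # None plays the role of NIL
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()          # each vertex is dequeued at most once
        for v in adj[u]:             # each adjacency list is scanned once
            if dist[v] is None:      # v has not been discovered yet
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent


# Vertex 5 is isolated, so it is never discovered from source 0.
graph = [[1, 2], [0, 3], [0, 3], [1, 2, 4], [3], []]
print(bfs(graph, 0))
```

A vertex whose distance stays None is unreachable from the source, which corresponds to d[u] remaining infinity in the pseudocode.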

Depth-first search
As in breadth-first search, whenever a vertex v is discovered during a scan of the adjacency list
of an already discovered vertex u, depth-first search records this event by setting v's
predecessor field π[v] to u. Unlike breadth-first search, whose predecessor subgraph forms a
tree, the predecessor subgraph produced by a depth-first search may be composed of several
trees, because the search may be repeated from multiple sources. The predecessor subgraph of
a depth-first search is therefore defined slightly differently from that of a breadth-first search.

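DFS(G)
1 for each vertex u ∈ V[G]
2     do color[u] ← WHITE
3        π[u] ← NIL
4 time ← 0
5 for each vertex u ∈ V[G]
6     do if color[u] = WHITE
7           then DFS-VISIT(u)

DFS-VISIT(u)
1 color[u] ← GRAY
2 d[u] ← time ← time + 1
3 for each v ∈ Adj[u]
4     do if color[v] = WHITE
5           then π[v] ← u
6                DFS-VISIT(v)
7 color[u] ← BLACK
8 f[u] ← time ← time + 1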

Figure 23.4 illustrates the progress of DFS on the graph shown in Figure 23.2.

Procedure DFS works as follows. Lines 1-3 paint all vertices white and initialize their π fields
to NIL. Line 4 resets the global time counter. Lines 5-7 check each vertex in V in turn and,
when a white vertex is found, visit it using DFS-VISIT. Every time DFS-VISIT(u) is called in
line 7, vertex u becomes the root of a new tree in the depth-first forest. When DFS returns,
every vertex u has been assigned a discovery time d[u] and a finishing time f[u].

In each call DFS-VISIT(u), vertex u is initially white. Line 1 paints u gray, and line 2 records
the discovery time d[u] by incrementing and saving the global variable time. Lines 3-6 examine
each vertex v adjacent to u and recursively visit v if it is white. As each vertex v ∈ Adj[u] is
considered in line 3, we say that edge (u, v) is explored by the depth-first search. Finally, after
every edge leaving u has been explored, lines 7-8 paint u black and record the finishing time
in f[u].
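The description of DFS and DFS-VISIT above can be sketched in Python as follows. The function name dfs and the 0-based adjacency-list input are assumptions made for illustration; the recursive helper also inherits Python's recursion limit, so very deep graphs would need an explicit stack instead.

```python
def dfs(adj):
    """Return (d, f, parent): discovery times, finishing times, predecessors."""
    n = len(adj)
    color = ["WHITE"] * n
    parent = [None] * n
    d = [0] * n
    f = [0] * n
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"            # u has just been discovered
        time += 1
        d[u] = time
        for v in adj[u]:             # explore every edge (u, v)
            if color[v] == "WHITE":
                parent[v] = u
                visit(v)
        color[u] = "BLACK"           # every edge leaving u has been explored
        time += 1
        f[u] = time

    for u in range(n):
        if color[u] == "WHITE":      # u becomes the root of a new tree
            visit(u)
    return d, f, parent


# Directed graph with edges 0->1, 0->2, 1->2 and 3->2.
print(dfs([[1, 2], [2], [], [2]]))
```

Vertex 3 is not reachable from vertex 0 in this example, so it becomes the root of a second tree in the depth-first forest.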

Figure 23.4 The progress of the depth-first-search algorithm DFS on a directed graph.

Topological sort
A topological sort of a dag G = (V, E) is a linear ordering of all its vertices such
that if G contains an edge (u, v), then u appears before v in the ordering. (If
the graph is not acyclic, then no linear ordering is possible.) A topological sort
of a graph can be viewed as an ordering of its vertices along a horizontal line
so that all directed edges go from left to right. Topological sorting is thus
different from the usual kind of "sorting".

The following simple algorithm topologically sorts a dag.

TOPOLOGICAL-SORT(G)
1 call DFS(G) to compute finishing times f[v] for each vertex v
2 as each vertex is finished, insert it onto the front of a linked list
3 return the linked list of vertices
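A minimal Python sketch of this procedure is shown below. It assumes the input is a dag given as a 0-based adjacency list (it does not detect cycles), the name topological_sort is hypothetical, and instead of inserting at the front of a linked list it appends each finished vertex to a Python list and reverses that list at the end, which yields the same order.

```python
def topological_sort(adj):
    """Return the vertices of a dag in topologically sorted order."""
    n = len(adj)
    visited = [False] * n
    finished = []                    # vertices in order of increasing finishing time

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        finished.append(u)           # u is finished once all its edges are explored

    for u in range(n):
        if not visited[u]:
            visit(u)
    finished.reverse()               # decreasing finishing time
    return finished


dag = [[1, 2], [3], [3], []]         # edges 0->1, 0->2, 1->3, 2->3
order = topological_sort(dag)
print(order)
```

Every edge (u, v) of the example goes from an earlier position to a later position in the returned order.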

A directed graph G is acyclic if and only if a depth-first search of G yields no back edges.
Proof: Suppose that there is a back edge (u, v). Then, vertex v is an ancestor of
vertex u in the depth-first forest. There is thus a path from v to u in G, and the back
edge (u, v) completes a cycle.

Figure 23.8 A dag for topological sorting.

Strongly connected components

A strongly connected component of a directed graph G = (V, E) is a maximal set of
vertices U ⊆ V such that for every pair of vertices u and v in U, we have
both u ⇝ v and v ⇝ u; that is, vertices u and v are reachable from each other.
Our algorithm for finding strongly connected components of a graph G = (V, E) uses the
transpose of G, which is defined in Exercise 23.1-3 to be the graph G^T = (V, E^T), where E^T =
{(u, v): (v, u) ∈ E}. That is, E^T consists of the edges of G with their directions reversed. Given
an adjacency-list representation of G, the time to create G^T is O(V + E).

STRONGLY-CONNECTED-COMPONENTS(G)
1 call DFS(G) to compute finishing times f[u] for each vertex u
2 compute G^T
3 call DFS(G^T), but in the main loop of DFS, consider the vertices
  in order of decreasing f[u] (as computed in line 1)
4 output the vertices of each tree in the depth-first forest of step 3 as a
  separate strongly connected component
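The four steps can be sketched in Python as below. The function name and the 0-based adjacency-list input are assumptions for illustration, and the recursion is subject to Python's recursion limit on large graphs.

```python
def strongly_connected_components(adj):
    """Return a list of components, each a list of vertices."""
    n = len(adj)

    # Step 1: DFS on G, recording vertices in order of increasing finishing time.
    visited = [False] * n
    finish_order = []

    def visit(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                visit(v)
        finish_order.append(u)

    for u in range(n):
        if not visited[u]:
            visit(u)

    # Step 2: build the transpose graph (every edge reversed).
    transpose = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            transpose[v].append(u)

    # Steps 3 and 4: DFS on the transpose in order of decreasing finishing time;
    # each tree of this second search is one strongly connected component.
    seen = [False] * n
    components = []

    def collect(u, component):
        seen[u] = True
        component.append(u)
        for v in transpose[u]:
            if not seen[v]:
                collect(v, component)

    for u in reversed(finish_order):
        if not seen[u]:
            component = []
            collect(u, component)
            components.append(component)
    return components


# Edges 0->1, 1->2, 2->0, 2->3, 3->4, 4->3.
sccs = strongly_connected_components([[1], [2], [0, 3], [4], [3]])
print(sccs)
```

In the example, vertices 0, 1 and 2 lie on one cycle and vertices 3 and 4 on another, so two components are reported.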

Minimum Spanning Tree:

What is a Minimum Spanning Tree?

A spanning tree of a connected, undirected graph is a subgraph that is a tree and connects all the vertices.
The cost of a spanning tree is the sum of the weights of all the edges in the tree. A graph can have many
spanning trees; a minimum spanning tree is a spanning tree whose cost is minimum among all the
spanning trees. There can also be more than one minimum spanning tree.
Minimum spanning tree has direct application in the design of networks. It is used in algorithms
approximating the travelling salesman problem, multi-terminal minimum cut problem and minimum-cost
weighted perfect matching. Other practical applications are:
Cluster Analysis
Handwriting recognition
Image segmentation

There are two famous algorithms for finding the Minimum Spanning Tree:

Kruskal’s Algorithm

Kruskal’s Algorithm builds the spanning tree by adding edges one by one to a growing spanning tree.
It follows the greedy approach: in each iteration it finds the edge with the least weight and
adds it to the growing spanning tree.
Algorithm Steps:
Sort the graph edges with respect to their weights.
Add edges to the MST in order, from the edge with the smallest weight to the edge with the largest weight.
Only add edges that do not form a cycle, that is, edges that connect two disconnected components.
So now the question is how to check whether 2 vertices are connected or not.
This could be done using DFS, which starts from the first vertex and then checks whether the second vertex
is visited or not. But DFS makes the time complexity large, as each check costs O(V + E), where V is the
number of vertices and E is the number of edges. So the best solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, which means that they don't have any element in
common.
Consider the following example:
In Kruskal’s algorithm, at each iteration we will select the edge with the lowest weight. So, we will start
with the lowest weighted edge first i.e., the edge with weight 1. After that we will select the second lowest
weighted edge i.e., edge with weight 2. Notice these two edges are totally disjoint. Now, the next edge will
be the third lowest weighted edge i.e., edge with weight 3, which connects the two disjoint pieces of the
graph. Now, we are not allowed to pick the edge with weight 4, that will create a cycle and we can’t have
any cycles. So we will select the fifth lowest weighted edge i.e., edge with weight 5. Now the other two
edges will create cycles so we will ignore them. In the end, we end up with a minimum spanning tree with
total cost 11 ( = 1 + 2 + 3 + 5).
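The procedure can be sketched in Python with a small disjoint-set (union-find) structure. The function name kruskal is hypothetical, and since the original figure is not available, the example graph below is a made-up one chosen only so that its edge weights 1 to 7 behave as in the walkthrough above.

```python
def kruskal(num_vertices, edges):
    """edges is a list of (weight, u, v); return (total_cost, chosen_edges)."""
    parent = list(range(num_vertices))   # each vertex starts in its own set

    def find(x):
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    chosen = []
    for w, u, v in sorted(edges):        # smallest weight first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:             # different sets: no cycle is formed
            parent[root_u] = root_v      # union the two sets
            chosen.append((u, v, w))
            total += w
    return total, chosen


# Hypothetical 5-vertex graph: weights 4, 6 and 7 would each close a cycle.
example_edges = [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 2),
                 (5, 3, 4), (6, 1, 3), (7, 2, 4)]
cost, tree = kruskal(5, example_edges)
print(cost, tree)
```

With the disjoint-set structure, each connectivity check is a pair of find calls rather than a full DFS, so the running time is dominated by sorting the edges.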

Prim’s Algorithm
Prim’s Algorithm also uses the greedy approach to find the minimum spanning tree. In Prim’s
Algorithm we grow the spanning tree from a starting position. Whereas Kruskal’s algorithm
adds an edge in each step, Prim’s algorithm adds a vertex to the growing spanning tree.

Algorithm Steps:

 Maintain two disjoint sets of vertices. One containing vertices that are in the growing
spanning tree and other that are not in the growing spanning tree.
 Select the cheapest vertex that is connected to the growing spanning tree and is not in
the growing spanning tree and add it into the growing spanning tree. This can be done
using Priority Queues. Insert the vertices, that are connected to growing spanning tree,
into the Priority Queue.
 Check for cycles. To do that, mark the nodes which have been already selected and
insert only those nodes in the Priority Queue that are not marked.

Consider the example below:

In Prim’s Algorithm, we will start with an arbitrary node (it doesn’t matter which one) and
mark it. In each iteration we will mark a new vertex that is adjacent to the one that we have
already marked. As a greedy algorithm, Prim’s algorithm will select the cheapest edge and
mark the vertex. So we will simply choose the edge with weight 1. In the next iteration we have
three options, edges with weight 2, 3 and 4. So, we will select the edge with weight 2 and mark
the vertex. Now again we have three options, edges with weight 3, 4 and 5. But we can’t choose
edge with weight 3 as it is creating a cycle. So we will select the edge with weight 4 and we
end up with the minimum spanning tree of total cost 7 ( = 1 + 2 +4).

Time Complexity:
The time complexity of the Prim’s Algorithm is O((V + E) log V) when a binary-heap priority
queue is used, because each edge causes at most a constant number of insertions into the
priority queue and each insertion or removal takes logarithmic time.
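A short Python sketch of Prim's algorithm with a priority queue follows. It uses the standard heapq module; the function name prim_mst_cost is hypothetical, and because the original figure is missing, the 4-vertex graph is a made-up one whose weights follow the walkthrough above (1, then 2, then 4, with the weight-3 edge rejected as a cycle).

```python
import heapq


def prim_mst_cost(adj, start=0):
    """adj[u] is a list of (v, weight); return the total MST cost."""
    visited = [False] * len(adj)     # marks vertices already in the tree
    heap = [(0, start)]              # (edge weight, vertex) candidates
    total = 0
    while heap:
        weight, u = heapq.heappop(heap)
        if visited[u]:
            continue                 # this edge would create a cycle
        visited[u] = True
        total += weight
        for v, w in adj[u]:
            if not visited[v]:       # only unmarked vertices are inserted
                heapq.heappush(heap, (w, v))
    return total


# Hypothetical graph: (u, v, weight) triples.
example_edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (0, 3, 4), (2, 3, 5)]
adj = [[] for _ in range(4)]
for u, v, w in example_edges:
    adj[u].append((v, w))
    adj[v].append((u, w))
print(prim_mst_cost(adj))  # 7
```

Stale heap entries for already marked vertices are simply skipped when popped, which keeps the code short at the cost of a slightly larger heap.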

Single-Source Shortest Paths:

The problem of finding a shortest path between two given vertices is sometimes called the single-pair
shortest path problem, to distinguish it from the following variations:

 The single-source shortest path problem, in which we have to find shortest paths from a source
vertex v to all other vertices in the graph.
 The single-destination shortest path problem, in which we have to find shortest paths from all
vertices in the directed graph to a single destination vertex v. This can be reduced to the single-source
shortest path problem by reversing the arcs in the directed graph.
 The all-pairs shortest path problem, in which we have to find shortest paths between every pair of
vertices v, v' in the graph.

The Bellman-Ford algorithm

The single source shortest path algorithm for arbitrary edge weights (positive or negative) is
known as the Bellman-Ford algorithm. It is used to find the minimum distance from the source
vertex to every other vertex. The main difference between this algorithm and Dijkstra’s algorithm
is that Dijkstra’s algorithm cannot handle negative weights, whereas Bellman-Ford can handle them
and can also report whether the graph contains a negative cycle.

Bellman-Ford algorithm finds the distance in bottom up manner. At first it finds those
distances which have only one edge in the path. After that increase the path length to find all
possible solutions.
Input − The cost matrix of the graph:
0 6 ∞ 7 ∞
∞ 0 5 8 -4
∞ -2 0 ∞ ∞
∞ ∞ -3 0 9
2 ∞ 7 ∞ 0
Output −
Source Vertex: 2
Vert: 0 1 2 3 4
Dist: -4 -2 0 3 -6
Pred: 4 2 -1 0 1
The graph has no negative edge cycle
Algorithm
bellmanFord(dist, pred, source)
Input − Distance list, predecessor list and the source vertex.
Output − True, when a negative cycle is found.
Begin
   iCount := 1
   for all vertices v of the graph, do
      dist[v] := ∞
      pred[v] := ϕ
   done
   dist[source] := 0
   eCount := number of edges present in the graph
   create edge list named edgeList
   while iCount < n, do     //n is number of vertices
      for i := 0 to eCount - 1, do
         if dist[edgeList[i].v] > dist[edgeList[i].u] + (cost[u,v] for edge i), then
            dist[edgeList[i].v] := dist[edgeList[i].u] + (cost[u,v] for edge i)
            pred[edgeList[i].v] := edgeList[i].u
      done
      iCount := iCount + 1
   done
   for all edges i in the graph, do
      if dist[edgeList[i].v] > dist[edgeList[i].u] + (cost[u,v] for edge i), then
         return true
   done
   return false
End
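The pseudocode can be turned into a short Python sketch and run on the cost matrix given above. The function name bellman_ford, the use of float("inf") for ∞ and the use of -1 for "no predecessor" are choices made for this illustration.

```python
INF = float("inf")


def bellman_ford(cost, source):
    """Return (dist, pred, has_negative_cycle) for a cost matrix."""
    n = len(cost)
    # Edge list: every finite off-diagonal entry of the matrix is an edge.
    edges = [(u, v, cost[u][v])
             for u in range(n) for v in range(n)
             if u != v and cost[u][v] != INF]
    dist = [INF] * n
    pred = [-1] * n
    dist[source] = 0
    for _ in range(n - 1):               # n - 1 rounds of relaxation
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
    # If any edge can still be relaxed, a negative cycle is reachable.
    has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, pred, has_negative_cycle


cost_matrix = [
    [0, 6, INF, 7, INF],
    [INF, 0, 5, 8, -4],
    [INF, -2, 0, INF, INF],
    [INF, INF, -3, 0, 9],
    [2, INF, 7, INF, 0],
]
print(bellman_ford(cost_matrix, 2))
```

For source vertex 2 this reproduces the distances and predecessors listed in the output above and reports no negative cycle.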
Single-Source Shortest paths in Directed Acyclic Graphs
Dijkstra’s Algorithm – Single Source Shortest Path Algorithm
Dijkstra’s Algorithm solves the Single Source Shortest Path (SSSP) problem. It is used
to find the shortest path from a source node to the other nodes of a graph, and in particular
to a given destination node.

A graph is a widely accepted data structure for representing a distance map. The distances between
cities, for example, can be represented effectively using a graph.
 Dijkstra proposed an efficient way to find the single source shortest path from the
weighted graph. For a given source vertex s, the algorithm finds the shortest path to
every other vertex v in the graph.
 Assumption : Weight of all edges is non-negative.
 Steps of the Dijkstra’s algorithm are explained here:
1. Initialize the distance of the source vertex to zero and the distance of all other vertices to
infinity.

2. Set the source node as the current node and put all remaining nodes in the list of unvisited
vertices. Compute the tentative distance of every immediate neighbour of the current
node.

3. If the newly computed value is smaller than the old value, then update it.

For example, C is the current node, whose distance from source S is dist (S, C) = 5.

 Consider N, a neighbour of C, where the weight of edge (C, N) is 3. The distance of N
from the source via C would then be 8.
 If the distance of N from the source was already computed and it is greater than 8, then
relax edge (C, N) and update the distance of N to 8; otherwise don’t update it.

Case d(S, N) = 11: d(S, C) + d(C, N) < d(S, N) ⇒ relax the edge and update d(S, N) = 8.
Case d(S, N) = 7: d(S, C) + d(C, N) > d(S, N) ⇒ don’t update d(S, N).
Figure: Weight updating in Dijkstra’s algorithm
4. When all the neighbours of the current node have been explored, mark it as visited and remove it
from the unvisited vertex list. Select the vertex with the minimum tentative distance from the
unvisited vertex list as the new current node and repeat the procedure.

5. Stop when the destination node has been visited or when the unvisited vertex list becomes empty.
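The steps above can be sketched in Python using a priority queue to pick the unvisited vertex with the minimum tentative distance. The function name dijkstra and the dictionary-based graph are assumptions for illustration; the tiny graph reuses the S, C, N example from the text, with a direct edge from S to N of weight 11.

```python
import heapq


def dijkstra(adj, source):
    """adj maps each vertex to a list of (neighbour, weight); weights >= 0."""
    dist = {v: float("inf") for v in adj}   # step 1: everything at infinity
    dist[source] = 0
    heap = [(0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)          # unvisited vertex with minimum distance
        if u in visited:
            continue                        # stale entry, already finalized
        visited.add(u)
        for v, w in adj[u]:
            if d + w < dist[v]:             # step 3: relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


graph = {"S": [("C", 5), ("N", 11)], "C": [("N", 3)], "N": []}
print(dijkstra(graph, "S"))  # N ends at distance 8, reached via C
```

This version runs until the queue is empty and so computes distances to all vertices; stopping as soon as the destination is popped would implement the early exit of step 5.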
Dynamic programming
Dynamic Programming (DP) is a technique that solves certain types of problems in
polynomial time. Dynamic programming solutions are faster than the exponential
brute-force method, and their correctness can be proved easily.
Characteristics of Dynamic Programming Algorithm:
 In general, dynamic programming (DP) is one of the most powerful techniques for
solving a certain class of problems.
 There is an elegant way to formulate the approach and a very simple thinking process,
and the coding part is very easy.
 Essentially, it is a simple idea: after solving a problem with a given input, save the
result as a reference for future use, so you won’t have to re-solve it. In short, ‘Remember
your Past’.
 It is a big hint for DP if the given problem can be broken up into smaller sub-problems,
and these smaller subproblems can be divided into still smaller ones, and in this
process, you see some overlapping subproblems.
 Additionally, the optimal solutions to the subproblems contribute to the optimal
solution of the given problem (referred to as the Optimal Substructure Property).
 The solutions to the subproblems are stored in a table or array (memoization) or in a
bottom-up manner (tabulation) to avoid redundant computation.
 The solution to the problem can be constructed from the solutions to the subproblems.
 Dynamic programming can be implemented using a recursive algorithm, where the
solutions to subproblems are found recursively, or using an iterative algorithm, where
the solutions are found by working through the subproblems in a specific order.
Dynamic programming works on following principles:
 Characterize structure of optimal solution, i.e. build a mathematical model of the
solution.
 Recursively define the value of the optimal solution.
 Using bottom-up approach, compute the value of the optimal solution for each possible
subproblems.
 Construct optimal solution for the original problem using information computed in the
previous step.

Applications:
Dynamic programming is used to solve optimization problems. It is used to solve many real-
life problems such as,
(i) Make a change problem
(ii) Knapsack problem
(iii) Optimal binary search tree

Floyd–Warshall algorithm
The Floyd–Warshall algorithm compares many possible paths through the graph between each
pair of vertices. It is guaranteed to find all shortest paths and is able to do this
with Θ(|V|^3) comparisons in a graph, even though there may be Θ(|V|^2) edges in the graph. It does
so by incrementally improving an estimate on the shortest path between two vertices, until the
estimate is optimal.

Example

Algorithm
Step 1 − Construct an adjacency matrix A with all the costs of edges present in the graph. If
there is no edge between two vertices, mark the value as ∞.
Step 2 − Derive another adjacency matrix A1 from A keeping the first row and first column of
the original adjacency matrix intact in A1. And for the remaining values, say A1[i,j],
if A[i,j]>A[i,k]+A[k,j] then replace A1[i,j] with A[i,k]+A[k,j]. Otherwise, do not change the
values. Here, in this step, k = 1 (first vertex acting as pivot).
Step 3 − Repeat Step 2 for all the vertices in the graph by changing the k value for every pivot
vertex, each time starting from the matrix produced by the previous pivot, until the final matrix
is achieved.
Step 4 − The final adjacency matrix obtained is the final solution with all the shortest paths.
Analysis
From the steps above, the Floyd-Warshall algorithm operates using three nested for loops to
find the shortest distance between all pairs of vertices within a graph. Therefore, the time
complexity of the Floyd-Warshall algorithm is O(n^3), where ‘n’ is the number of vertices in
the graph. The space complexity of the algorithm is O(n^2).
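The three nested loops can be seen directly in a Python sketch. The function name floyd_warshall and the 4-vertex cost matrix are made up for illustration, and the sketch updates a single copy of the matrix in place instead of deriving a separate matrix A1, A2, ... for every pivot, which gives the same final result.

```python
INF = float("inf")


def floyd_warshall(cost):
    """Return the matrix of shortest distances between all pairs of vertices."""
    n = len(cost)
    dist = [row[:] for row in cost]          # copy so the input is not modified
    for k in range(n):                       # pivot vertex
        for i in range(n):
            for j in range(n):
                # Is the path i -> k -> j shorter than the best known i -> j?
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


example = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
for row in floyd_warshall(example):
    print(row)
```

The only storage is the n × n distance matrix, which matches the O(n^2) space bound stated above.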
UNIT IV - ALGORITHM DESIGN TECHNIQUES
Dynamic Programming: Matrix-Chain Multiplication – Elements of Dynamic Programming –
Longest Common Subsequence- Greedy Algorithms: – Elements of the Greedy Strategy- An
Activity-Selection Problem - Huffman Coding.

Dynamic programming
Dynamic programming approach is similar to divide and conquer in breaking down the
problem into smaller and yet smaller possible sub-problems. But unlike divide and conquer,
these sub-problems are not solved independently. Rather, results of these smaller sub-problems
are remembered and used for similar or overlapping sub-problems.
Mostly, dynamic programming algorithms are used for solving optimization problems. Before
solving the in-hand sub-problem, dynamic algorithm will try to examine the results of the
previously solved sub-problems. The solutions of sub-problems are combined in order to
achieve the best optimal final solution. This paradigm is thus said to be using Bottom-up
approach.
So we can conclude that −
 The problem should be able to be divided into smaller overlapping sub-problem.
 Final optimum solution can be achieved by using an optimum solution of smaller sub-
problems.
 Dynamic algorithms use memoization.
Steps of Dynamic Programming Approach
Dynamic Programming algorithm is designed using the following four steps −
 Characterize the structure of an optimal solution.
 Recursively define the value of an optimal solution.
 Compute the value of an optimal solution, typically in a bottom-up fashion.
 Construct an optimal solution from the computed information.

Examples
 Fibonacci number series
 Knapsack problem
 Tower of Hanoi
 All pair shortest path by Floyd-Warshall and Bellman Ford
 Shortest path by Dijkstra
 Project scheduling
 Matrix Chain Multiplication

Matrix Chain Multiplication

Matrix Chain Multiplication is an algorithm that is applied to determine the lowest cost way
for multiplying matrices. The actual multiplication is done using the standard way of
multiplying the matrices, i.e., it follows the basic rule that the number of columns in the first
matrix must be equal to the number of rows in the second matrix. Hence, multiple scalar
multiplications must be done to achieve the product.
To brief it further, consider matrices A, B, C, and D, to be multiplied; hence, the multiplication
is done using the standard matrix multiplication. There are multiple combinations of the
matrices found while using the standard approach since matrix multiplication is associative.
For instance, there are five ways to multiply the four matrices given above −
 (A(B(CD)))
 (A((BC)D))
 ((AB)(CD))
 ((A(BC))D)
 (((AB)C)D)
Input: arr[] = {40, 20, 30, 10, 30}
Output: 26000
Explanation: There are 4 matrices of dimensions 40×20, 20×30, 30×10, 10×30.
Let the input 4 matrices be A, B, C and D.
The minimum number of multiplications are obtained by
putting parenthesis in following way (A(BC))D.
The minimum is 20*30*10 + 40*20*10 + 40*10*30 = 26000
Input: arr[] = {1, 2, 3, 4, 3}
Output: 30
Explanation: There are 4 matrices of dimensions 1×2, 2×3, 3×4, 4×3.
Let the input 4 matrices be A, B, C and D.
The minimum number of multiplications are obtained by
putting parenthesis in following way ((AB)C)D.
The minimum number is 1*2*3 + 1*3*4 + 1*4*3 = 30

Elements of Dynamic Programming

 Optimal Substructure
 Overlapping Sub-problems
 Variant: Memoization
Optimal Substructure: Optimal substructure (OS) holds if an optimal solution contains within it
optimal solutions to sub-problems. In matrix-chain multiplication, optimally multiplying
A1, A2, A3, ..., An requires the sub-chains A1...Ak and Ak+1...An to be multiplied optimally as
well. It is often easy to show the optimal sub-problem property as follows:
Split problem into sub-problems
Sub-problems must be optimal, otherwise the optimal splitting would not have been optimal.
There is usually a suitable "space" of sub-problems. Some spaces are more "natural" than
others.
For matrix-chain multiply we chose sub-problems as sub chains. We could have chosen all
arbitrary products, but that would have been much larger than necessary! DP based on that
would have to solve too many sub-problems.
A general way to investigate optimal substructure of a problem in DP is to look at optimal
sub-, sub-sub, etc. problems for structure. When we noticed that sub problems of A1, A2, A3,
...,An consisted of sub-chains, it made sense to use sub-chains of the form Ai, ..., Aj as the
"natural" space for sub-problems.
Overlapping Sub-problems: Space of sub-problems must be small: recursive solution re-
solves the same sub-problem many times. Usually there are polynomially many sub-
problems, and we revisit the same ones over and over again: overlapping sub-problems.
Dynamic Programming Solution for Matrix Chain Multiplication using Memoization:
Below is the recursion tree for the 2nd example of the above recursive approach:

If observed carefully you can find the following two properties:

1) Optimal Substructure: In the above case, we are breaking the bigger groups into
smaller subgroups and solving them to finally find the minimum number of multiplications.
Therefore, it can be said that the problem has optimal substructure property.
2) Overlapping Subproblems: We can see in the recursion tree that the same subproblems
are called again and again and this problem has the Overlapping Subproblems property.
So the Matrix Chain Multiplication problem has both properties of a dynamic
programming problem, and recomputation of the same subproblems can be avoided by
storing their results in a temporary array dp[][].
Follow the below steps to solve the problem:
 Build a matrix dp[][] of size N*N for memoization purposes.
 Use the same recursive call as done in the above approach:
 When we find a range (i, j) for which the value is already calculated, return
the minimum value for that range (i.e., dp[i][j]).
 Otherwise, perform the recursive calls as mentioned earlier.
 The value stored at dp[0][N-1] is the required answer.
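The memoized approach can be sketched in Python as follows. The function name matrix_chain_order is hypothetical, and the indexing is an assumption that differs slightly from the steps above: matrices are numbered from 0, matrix i has dimensions arr[i] × arr[i+1], and the table has one row and column per matrix, so the answer is the value computed for the range (0, n-1) with n = len(arr) - 1.

```python
def matrix_chain_order(arr):
    """Minimum number of scalar multiplications for the chain described by arr."""
    n = len(arr) - 1                        # number of matrices (assumed >= 1)
    dp = [[-1] * n for _ in range(n)]       # -1 means "not computed yet"

    def solve(i, j):
        if i == j:
            return 0                        # a single matrix needs no multiplication
        if dp[i][j] != -1:
            return dp[i][j]                 # value for range (i, j) already known
        best = min(
            solve(i, k) + solve(k + 1, j) + arr[i] * arr[k + 1] * arr[j + 1]
            for k in range(i, j)            # split between matrix k and k + 1
        )
        dp[i][j] = best
        return best

    return solve(0, n - 1)


print(matrix_chain_order([40, 20, 30, 10, 30]))  # 26000
print(matrix_chain_order([1, 2, 3, 4, 3]))       # 30
```

Each range (i, j) is solved once and then read from dp, so the number of distinct sub-problems is quadratic in the number of matrices.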

Longest Common Subsequence (LCS)

A longest common subsequence (LCS) is defined as the longest subsequence which is
common in all given input sequences.
Examples:
Input: S1 = “AGGTAB”, S2 = “GXTXAYB”
Output: 4
Explanation: The longest subsequence which is present in both strings is “GTAB”.
Input: S1 = “BD”, S2 = “ABCD”
Output: 2
Explanation: The longest subsequence which is present in both strings is “BD”.
Recursive Approach for LCS:
Generate all the possible subsequences and find the longest among them that is present in
both strings using recursion.
Follow the below steps to implement the idea:
 Create a recursive function [say lcs()].
 Check the relation between the First characters of the strings that are not yet processed.
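A minimal recursive sketch in Python is given below. The function name lcs_length and the index parameters i and j (the positions of the first unprocessed characters) are assumptions for illustration. If the two characters match, they are part of a common subsequence and both indices advance; otherwise one character is skipped from either string and the better result is kept.

```python
def lcs_length(s1, s2, i=0, j=0):
    """Length of the longest common subsequence of s1[i:] and s2[j:]."""
    if i == len(s1) or j == len(s2):
        return 0                                  # one string is exhausted
    if s1[i] == s2[j]:
        return 1 + lcs_length(s1, s2, i + 1, j + 1)
    return max(lcs_length(s1, s2, i + 1, j),      # skip a character of s1
               lcs_length(s1, s2, i, j + 1))      # skip a character of s2


print(lcs_length("AGGTAB", "GXTXAYB"))  # 4
print(lcs_length("BD", "ABCD"))         # 2
```

This plain recursion takes exponential time in the worst case because the same (i, j) pairs are solved repeatedly; storing results per pair would turn it into a dynamic programming solution.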

Greedy Algorithms

A greedy algorithm, as the name suggests, always makes the choice that seems to be the best
at that moment. This means that it makes a locally-optimal choice in the hope that this choice
will lead to a globally-optimal solution.

Assume that you have an objective function that needs to be optimized (either maximized or
minimized) at a given point. A Greedy algorithm makes greedy choices at each step to ensure
that the objective function is optimized. The Greedy algorithm has only one shot to compute
the optimal solution so that it never goes back and reverses the decision.

Greedy algorithms have some advantages and disadvantages:

1. It is quite easy to come up with a greedy algorithm (or even multiple greedy
algorithms) for a problem.
2. Analyzing the run time for greedy algorithms will generally be much easier than
for other techniques (like Divide and conquer). For the Divide and conquer technique,
it is not immediately clear whether the technique is fast or slow, because at each level of
recursion the size of the sub-problems gets smaller while the number of sub-problems increases.
3. The difficult part is that for greedy algorithms you have to work much harder to
understand correctness issues. Even with the correct algorithm, it is hard to prove
why it is correct. Proving that a greedy algorithm is correct is more of an art than a
science. It involves a lot of creativity.

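As a simple greedy-algorithm problem, suppose you are given an array A, in which each element is
the time that a to-do item takes to complete, and a time limit T, and you want to complete as many
items as possible within T. In each iteration, you have to greedily select the
item which will take the minimum amount of time to complete while maintaining two
variables currentTime and numberOfThings. To complete the calculation, you must: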

1. Sort the array A in a non-decreasing order.

2. Select each to-do item one-by-one.
3. Add the time that it will take to complete that to-do item into currentTime.
4. Add one to numberOfThings.

Repeat this as long as the currentTime is less than or equal to T.

Let A = {5, 3, 4, 2, 1} and T = 6

After sorting, A = {1, 2, 3, 4, 5}

After the 1st iteration:

 currentTime = 1
 numberOfThings = 1

After the 2nd iteration:

 currentTime is 1 + 2 = 3
 numberOfThings = 2

After the 3rd iteration:

 currentTime is 3 + 3 = 6
 numberOfThings = 3

After the 4th iteration, currentTime is 6 + 4 = 10, which is greater than T. Therefore, the
answer is 3.
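The walkthrough above corresponds to the following Python sketch. The function name max_things is hypothetical; the variable names mirror currentTime and numberOfThings from the text.

```python
def max_things(times, limit):
    """Greatest number of items that can be completed within the time limit."""
    current_time = 0
    number_of_things = 0
    for t in sorted(times):              # cheapest item first
        if current_time + t > limit:
            break                        # the next item no longer fits
        current_time += t
        number_of_things += 1
    return number_of_things


print(max_things([5, 3, 4, 2, 1], 6))  # 3
```

Checking the limit before adding an item avoids counting the item that would push currentTime past T.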

Applications of Greedy Approach:

Greedy algorithms are used to find an optimal or near optimal solution to many real-life
problems. Few of them are listed below:
(1) Make a change problem
(2) Knapsack problem
(3) Minimum spanning tree
(4) Single source shortest path
(5) Activity selection problem
(6) Job sequencing problem
(7) Huffman code generation.
(8) Dijkstra’s algorithm
(9) Greedy coloring
(10) Minimum cost spanning tree
(11) Job scheduling
(12) Interval scheduling
(13) Greedy set cover
(14) Knapsack with fractions
Advantages of the Greedy Approach:
 The greedy approach is easy to implement.
 Greedy algorithms typically have lower time complexity.
 Greedy algorithms can be used for optimization purposes or for finding close-to-optimal
solutions in the case of hard problems.
 Greedy algorithms can produce efficient solutions in many cases, especially when the
problem has a substructure that exhibits the greedy choice property.
 Greedy algorithms are often faster than other optimization algorithms, such as dynamic
programming or branch and bound, because they require less computation and memory.
 The greedy approach is often used as a heuristic or approximation algorithm when an
exact solution is not feasible or when finding an exact solution would be too time-
consuming.
 The greedy approach can be applied to a wide range of problems, including problems in
computer science, operations research, economics, and other fields.
 The greedy approach can be used to solve problems in real-time, such as scheduling
problems or resource allocation problems, because it does not require the solution to be
computed in advance.
 Greedy algorithms are often used as a first step in solving optimization problems,
because they provide a good starting point for more complex optimization algorithms.
 Greedy algorithms can be used in conjunction with other optimization algorithms, such
as local search or simulated annealing, to improve the quality of the solution.
Disadvantages of the Greedy Approach:
 The local optimal solution may not always be globally optimal.
 Greedy algorithms do not always guarantee to find the optimal solution, and may
produce suboptimal solutions in some cases.
 The greedy approach relies heavily on the problem structure and the choice of criteria
used to make the local optimal choice. If the criteria are not chosen carefully, the
solution produced may be far from optimal.
 Greedy algorithms may require a lot of preprocessing to transform the problem into a
form that can be solved by the greedy approach.
 Greedy algorithms may not be applicable to problems where the optimal solution
depends on the order in which the inputs are processed.
 Greedy algorithms may not be suitable for problems where the optimal solution depends
on the size or composition of the input, such as the bin packing problem.
 Greedy algorithms may not be able to handle constraints on the solution space, such as
constraints on the total weight or capacity of the solution.
 Greedy algorithms may be sensitive to small changes in the input, which can result in
large changes in the output. This can make the algorithm unstable and unpredictable in
some cases.
Standard Greedy Algorithms :
 Prim’s Algorithm
 Kruskal’s Algorithm
 Dijkstra’s Algorithm

Elements of the Greedy Strategy

Optimal Substructure:
An optimal solution to the problem contains within it optimal solutions to sub-problems. A'
= A - {1} (after the greedy choice of item 1), and A' can again be found with the greedy
algorithm on the reduced problem S' = { i ∈ S : si ≥ f1 }.
When do you use DP versus a greedy approach? Which should be faster?
The 0 - 1 knapsack problem:
A thief has a knapsack that holds at most W pounds. Item i has value vi and weight wi. The
thief must choose items to maximize the value stolen and still fit into the knapsack. Each
item must be taken or left ( 0 - 1 ).
Fractional knapsack problem:
The thief may take parts of items, as well as wholes.
Both the 0 - 1 and fractional problems have the optimal substructure property. Fractional: vi /
wi is the value per pound. Clearly you take as much as possible of the item with the greatest
value per pound, then of the next best, and so on. This continues until you fill the knapsack.
The optimal (greedy) algorithm takes O ( n lg n ), as we must sort on vi / wi = di.
Consider the same strategy for the 0 - 1 problem:
W = 50 lbs. (maximum knapsack capacity)
w1 = 10 v1 = 60 d1 = 6
w2 = 20 v2 = 100 d2 = 5
w3 = 30 v3 = 120 d3 = 4
where d is the value density
Greedy approach: take all of 1 and all of 2: v1 + v2 = 160. The optimal solution is to take all
of 2 and 3: v2 + v3 = 220. The other solution is to take all of 1 and 3: v1 + v3 = 180. All are
within the 50 lb limit.
When solving the 0 - 1 knapsack problem, empty space lowers the effective d of the load.
Thus each time an item is chosen for inclusion we must consider both
i included
i excluded
These are clearly overlapping sub-problems for different i's and so best solved by DP!
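
To make the comparison concrete, here is a small Python sketch run on the three items above. The function names are illustrative assumptions, and the DP shown is the common one-dimensional table over capacities, which is one of several ways to organise the computation.

```python
def fractional_knapsack(items, W):
    """Greedy by value density; fractions of an item are allowed."""
    total = 0.0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        take = min(w, W)            # take as much of this item as still fits
        total += v * take / w
        W -= take
        if W == 0:
            break
    return total


def greedy_01(items, W):
    """The same density rule, but items must be taken whole (not optimal)."""
    total = 0
    for w, v in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if w <= W:
            W -= w
            total += v
    return total


def knapsack_01(items, W):
    """DP: best[c] is the maximum value achievable with capacity c."""
    best = [0] * (W + 1)
    for w, v in items:
        for c in range(W, w - 1, -1):   # descending, so each item is used once
            best[c] = max(best[c], best[c - w] + v)
    return best[W]


items = [(10, 60), (20, 100), (30, 120)]    # (weight, value)
print(fractional_knapsack(items, 50))       # 240.0
print(greedy_01(items, 50))                 # 160
print(knapsack_01(items, 50))               # 220
```

The DP costs O(nW) time, against O(n lg n) for the greedy rule, which is the price paid for considering both the "included" and "excluded" case for every item.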
Huffman Coding.
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-
length codes to input characters, lengths of the assigned codes are based on the frequencies
of corresponding characters.
The variable-length codes assigned to input characters are prefix codes, meaning the codes
(bit sequences) are assigned in such a way that the code assigned to one character is not the
prefix of the code assigned to any other character. This is how Huffman Coding makes sure
that there is no ambiguity when decoding the generated bitstream.
Let us understand prefix codes with a counter example. Let there be four characters a, b, c
and d, and their corresponding variable length codes be 00, 01, 0 and 1. This coding leads
to ambiguity because code assigned to c is the prefix of codes assigned to a and b. If the
compressed bit stream is 0001, the de-compressed output may be “cccd” or “ccb” or “acd”
or “ab”.
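
The ambiguity can be checked mechanically. The sketch below (the helper name decodings is an assumption) lists every way to split a bit string into code words of a given code. For the code above it finds the four decodings mentioned and one more, "cad".

```python
def decodings(bits, code):
    """Return every string whose encoding under `code` equals `bits`."""
    if not bits:
        return [""]
    results = []
    for ch, word in code.items():
        if bits.startswith(word):
            # Consume one code word, then decode the remainder recursively.
            results += [ch + rest for rest in decodings(bits[len(word):], code)]
    return results


code = {"a": "00", "b": "01", "c": "0", "d": "1"}
print(sorted(decodings("0001", code)))
# ['ab', 'acd', 'cad', 'ccb', 'cccd']
```

With a prefix code the same function returns at most one decoding for any bit string.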
There are mainly two major parts in Huffman Coding
1. Build a Huffman Tree from input characters.
2. Traverse the Huffman Tree and assign codes to characters.

Algorithm:

The method which is used to construct optimal prefix code is called Huffman coding.
This algorithm builds a tree in bottom up manner. We can denote this tree by T
Let |c| be the number of leaves.
|c| - 1 merge operations are required to combine the nodes. Q is the priority queue, which
can be implemented as a binary min-heap.
Algorithm Huffman (c)
{
n = |c|
Q = c
for i <- 1 to n-1
do
{
temp <- get_node ()
left [temp] <- Get_min (Q)
right [temp] <- Get_min (Q)
a = left [temp]
b = right [temp]
f [temp] <- f [a] + f [b]
insert (Q, temp)
}
return Get_min (Q)
}
Steps to build Huffman Tree
Input is an array of unique characters along with their frequency of occurrences and output
is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (Min
Heap is used as a priority queue. The value of frequency field is used to compare two
nodes in min heap. Initially, the least frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.

3. Create a new internal node with a frequency equal to the sum of the two nodes
frequencies. Make the first extracted node as its left child and the other extracted node
as its right child. Add this node to the min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the
root node and the tree is complete.
Let us understand the algorithm with an example:
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree
with single node.
Step 2 Extract two minimum frequency nodes from min heap. Add a new internal node
with frequency 5 + 9 = 14.
Illustration of step 2

Now the min heap contains 5 nodes, where 4 nodes are roots of trees with a single element each,
and one heap node is the root of a tree with 3 elements
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with
frequency 12 + 13 = 25

Illustration of step 3

Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each,
and two heap nodes are roots of trees with more than one node
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14
+ 16 = 30
Illustration of step 4

Now min heap contains 3 nodes.

character Frequency
Internal Node 25
Internal Node 30
f 45
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25
+ 30 = 55

Illustration of step 5

Now min heap contains 2 nodes.

character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45
+ 55 = 100
Illustration of step 6

Now min heap contains only one node.

character Frequency
Internal Node 100
Since the heap contains only one node, the algorithm stops here.
Steps to print codes from Huffman Tree:
Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving
to the left child, write 0 to the array. While moving to the right child, write 1 to the array.
Print the array when a leaf node is encountered.

Steps to print code from HuffmanTree

The codes are as follows:

character code-word
f 0
c 100
d 101
a 1100
b 1101
e 111
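
The construction and the code assignment can be sketched in Python with the standard heapq module as the min heap. This is an illustrative sketch under stated assumptions: the function name huffman_codes and the tuple representation of internal nodes are not from the notes, and the input is assumed to be non-empty.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a Huffman tree from {character: frequency} and return the codes."""
    tiebreak = count()   # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # first extracted: left child
        f2, _, right = heapq.heappop(heap)   # second extracted: right child
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a single character
            codes[node] = prefix or "0"      # lone-symbol input gets code "0"

    walk(root, "")
    return codes


freq = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
print(huffman_codes(freq))
```

On the example frequencies this reproduces the table above (f = 0, c = 100, d = 101, a = 1100, b = 1101, e = 111). With n characters the loop performs n - 1 merges, each costing O(lg n) heap work, for O(n lg n) in total.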
UNIT V - NP COMPLETE AND NP HARD
NP-Completeness: Polynomial Time – Polynomial-Time Verification – NP- Completeness and
Reducibility – NP-Completeness Proofs – NP-Complete Problems.

NP-Completeness:
NP-completeness theory shows that many of the problems with no known polynomial time
algorithms are computationally related.
There are two classes of problems for which no polynomial time algorithm is known
1. NP-Hard
2. NP-Complete
NP Hard and NP-Complete
A problem is in the class NPC if it is in NP and is as hard as any problem in NP. A problem
is NP-hard if all problems in NP are polynomial time reducible to it, even though it may not
be in NP itself.

If a polynomial time algorithm exists for any of these problems, all problems in NP would be
polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-
completeness is important for both theoretical and practical reasons.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions
 B is in NP
 Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the language B is
known as NP-Hard. Informally, a search problem B is NP-Hard if there exists some NP-
Complete problem A that Turing reduces to B.
A problem in NP-Hard cannot be solved in polynomial time unless P = NP. If a problem is
proved to be NPC, an efficient exact algorithm for it is unlikely to be found.
Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial time algorithm is
known.

 Determining whether a graph has a Hamiltonian cycle

 Determining whether a Boolean formula is satisfiable, etc.
NP-Hard Problems
The following problems are NP-Hard

 The circuit-satisfiability problem

 Set Cover
 Vertex Cover
 Travelling Salesman Problem
In this context, we will now show that TSP is NP-Complete
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities. The salesman has to
visit each one of the cities starting from a certain one and returning to the same city. The
challenge of the problem is that the traveling salesman wants to minimize the total length of
the trip
Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. We use the
decision version of TSP: is there a tour of total cost at most k? Given a candidate tour, we
check that the tour contains each vertex exactly once. Then the total cost of the edges of the
tour is calculated. Finally, we check that the cost is at most k. This can be completed in
polynomial time. Thus TSP belongs to NP.
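
As a rough illustration of why checking a tour is cheap, the following sketch verifies a candidate tour for the decision version of TSP (is there a tour of cost at most k?). The function name, the cost-matrix representation and the bound k are assumptions made for the example.

```python
def verify_tsp_tour(cost, tour, k):
    """Polynomial-time check of a claimed tour.

    cost: n x n matrix of edge costs, tour: list of city indices,
    k: the cost bound of the decision problem.
    """
    n = len(cost)
    if sorted(tour) != list(range(n)):      # every city exactly once
        return False
    # Sum the edges along the tour, including the edge back to the start.
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= k


cost = [
    [0, 1, 4, 2],
    [1, 0, 1, 5],
    [4, 1, 0, 1],
    [2, 5, 1, 0],
]
print(verify_tsp_tour(cost, [0, 1, 2, 3], 5))  # True
print(verify_tsp_tour(cost, [0, 1, 2, 3], 4))  # False
```

The check takes O(n lg n) time because of the sort, while no polynomial-time method is known for finding such a tour.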
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where

NP Completeness

P, NP, NP-Complete and NP-Hard are sets of problems, defined as follows:

 P: problems that can be solved in time polynomial in the size of their inputs.

 NP: problems whose solutions can be verified in polynomial time. (NP stands
for non-deterministic polynomial time).
 NP-Complete: A collection of problems in NP for which it is not known whether they can
be solved in polynomial time. However, if we can prove that one of them can be
solved in polynomial time, then all of them can.
 NP-Hard: A collection of problems that do not have to be in NP, whose solutions are
at least as hard as the NP-Complete problems. If a problem is in NP, and it's NP hard,
then it is also NP-Complete.
In this lecture, we are going to see what it takes to prove that problems belong to these sets.
Suppose you have a problem to solve, and you want to know its complexity class. This takes
two steps:

1. Prove that it is in NP. Typically the problem is couched as a yes or no problem
involving a data structure, such as "does there exist a simple cycle through a given
directed graph that visits all the nodes?" To prove it is in NP, you need to show that
a yes solution can be checked in polynomial time. In the above example, you can
check in linear time whether a given path through the graph is indeed a simple cycle.
Therefore, the problem is in NP. You don't have to prove anything about
the no solutions, and you don't have to prove anything about how you'd calculate a
solution.
2. Transform a known NP-Complete problem to this one in polynomial time. Suppose
the problem in question is Q, and that L is a well-known NP-Complete problem like
the 3-satisfiability problem. You need to show that if you have any instance of
problem L, you can transform it into an instance of problem Q in polynomial time.
Thus, if you could solve problem Q in polynomial time, you could solve problem L in
polynomial time.

If you can do both of these things, then you have proved that a problem is NP-Complete. If
you can prove that either of these things cannot be done, then you have proved that a problem
is not NP-Complete. Sometimes you can't come up with good proofs, and you just don't
know.

The complexity classes P and NP-Hard may be put in terms of the above:

 P: If we can prove that the solution to a problem may be calculated in polynomial
time, then the problem is in P. All of the problems solved by the algorithms that we have
studied in this class, with the exception of enumeration, are in P.
 NP-Hard: These are problems that are at least as hard to solve as NP-Complete
problems. If they are in NP, then they are NP-complete problems. We prove that a
problem is NP-Hard by performing the transformation in step 2 from a known NP-
Complete problem to the problem at hand. That is how we demonstrate that they are "at
least as hard to solve as NP-Complete problems."

3-SAT - A Canonical NP-Complete Problem

3-SAT is a very simple NP-Complete problem. You are given a boolean expression, which is
a big AND (∧) of clauses:

E = C0 ∧ C1 ∧ ... ∧ Cm-1

Each clause Ci is the OR (∨) of three literals, where a literal is either a variable xi or the
negation of a variable ¬ xi (sometimes the negation of a is instead denoted ā). Here is an
example with three clauses and three variables. To make it easier to read, I'm simply calling
the variables a, b and c.

E=(a∨b∨c)∧(a∨b∨c)∧(a∨b∨c)
Given this definition, 3-SAT is simple -- is there an assignment of the variables so that E is
true? In the above example, it's easy to find such an assignment. For example, set a and c to
TRUE and b to FALSE (I'm coloring the true statements red -- you can see that there is
always at least one TRUE in each clause).

E=(a∨b∨c)∧(a∨b∨c)∧(a∨b∨c)

In general, 3-SAT can be a very difficult problem to solve. Here's a harder example with
seven clauses and four variables.

E=(a∨b∨c)∧(a∨b∨d)∧(a∨c∨d)∧(b∨c∨d)∧(a∨b∨c)∧(b∨c∨d)∧(
b∨c∨d)

One correct assignment is setting a and c to FALSE, and b and d to TRUE:

E=(a∨b∨c)∧(a∨b∨d)∧(a∨c∨d)∧(b∨c∨d)∧(a∨b∨c)∧(b∨c∨d)∧
(b∨c∨d)

From our lecture notes on enumeration, we can answer whether an instance of 3-SAT is
satisfiable with a simple power set enumeration. That enumerates all possible true/false settings
of the variables, and for each setting, you can test to see whether the expression is true. Of
course, if there are n variables, the power set enumeration will enumerate 2^n settings, so this is
definitely not polynomial time.
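
A minimal sketch of that enumeration is given below. The encoding is an assumption made for the example: a literal is a string such as "a" or "~a" (the tilde marks negation), and the sample formula is hypothetical rather than one of the expressions above.

```python
from itertools import product


def literal_is_true(lit, assignment):
    """Evaluate one literal, where a leading '~' means negation."""
    value = assignment[lit.lstrip("~")]
    return not value if lit.startswith("~") else value


def satisfies(assignment, clauses):
    """Polynomial-time check: every clause contains at least one true literal."""
    return all(any(literal_is_true(l, assignment) for l in c) for c in clauses)


def brute_force_sat(clauses):
    """Try all 2^n settings of the n variables; return one that works, or None."""
    variables = sorted({l.lstrip("~") for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            return assignment
    return None


# Hypothetical instance: (a v b v c) ^ (~a v ~b v c) ^ (~a v b v ~c)
clauses = [("a", "b", "c"), ("~a", "~b", "c"), ("~a", "b", "~c")]
print(brute_force_sat(clauses))
```

Note the asymmetry: satisfies runs in time linear in the size of the formula, while brute_force_sat may call it 2^n times.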

Is there a polynomial time solution? No one knows.

It is an easy matter to prove that 3-SAT is in NP. How many different clauses can there be?
(4/3) * n * (n-1) * (n-2) -- we'll go over that in class. That's a polynomial of n. If we have a
solution, we can test its validity by simply setting the variables and seeing if E is true. That
test is polynomial time, so 3-SAT is in NP.

As for proving that 3-SAT is NP-Complete, that is well beyond the scope of this class.
However, 3-SAT is a very popular problem for proving that other problems are NP-
Complete.

The yellow nodes are an independent set of size 5. There is no independent set of size 6.

Here's how we use 3-SAT to prove that ISDP (the Independent Set Decision Problem: does a
given graph contain an independent set of size k?) is NP-Complete.

First, prove it's in NP: If you give me a set of k vertices, I can easily check that there
are no edges between any two nodes in the set. That will be O(|E|) in the worst case, which is
most definitely polynomial in |V|.

Next, I need to figure out how to take an instance of 3-SAT, and convert it into an instance of
ISDP, so that if you can solve the ISDP instance in polynomial time, then you can solve the
instance of 3-SAT in polynomial time. Here's one way:

 Turn each clause into three nodes, and label the nodes with their literals (including the
not). Add an edge between each of these nodes.
 For every pair of nodes with the same, but negated, literals, add an edge between that
pair of nodes.
 Any independent set of size k, where k is the number of clauses, will correspond to an
assignment of the variables for which the 3-SAT expression is true.
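
The construction in the list above can be sketched as follows. The literal encoding ("a" and "~a"), the function names and the sample clauses are assumptions for illustration. The exhaustive independent-set search is included only to check the construction on tiny inputs and is itself exponential.

```python
from itertools import combinations


def sat_to_independent_set(clauses):
    """Build the graph for a 3-SAT instance; return (nodes, edges, k)."""
    # One node per literal occurrence: (clause index, position, literal).
    nodes = [(i, j, lit) for i, c in enumerate(clauses) for j, lit in enumerate(c)]
    edges = set()
    for u, v in combinations(nodes, 2):
        same_clause = u[0] == v[0]
        negated_pair = u[2] == "~" + v[2] or v[2] == "~" + u[2]
        if same_clause or negated_pair:
            edges.add((u, v))
    return nodes, edges, len(clauses)      # k = number of clauses


def has_independent_set(nodes, edges, k):
    """Exhaustive check, usable only on very small graphs."""
    return any(
        all((u, v) not in edges for u, v in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )


# Hypothetical satisfiable instance
clauses = [("a", "b", "c"), ("~a", "~b", "c"), ("~a", "b", "~c")]
print(has_independent_set(*sat_to_independent_set(clauses)))        # True

# Hypothetical unsatisfiable instance: a must be both true and false
unsat = [("a", "a", "a"), ("~a", "~a", "~a")]
print(has_independent_set(*sat_to_independent_set(unsat)))          # False
```

Building the graph takes time quadratic in the number of literal occurrences, so the transformation itself is polynomial.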

Here's the simple three-clause 3-SAT problem above, converted to a graph, with an example
3-node independent set colored magenta. You'll note that the set corresponds to a setting of
the variables that makes the 3-SAT equation true:

Below, I also convert the more complicated 7-clause expression to a graph for the ISDP
problem. I have the clauses clumped together going clockwise around the graph, starting at
roughly 1:00. I also have colored inter-clause edges according to the literals that they
connect:
I've colored the nodes in the Independent Set gray. You should be able to verify that:

 The set is indeed independent.

 The assignment of literals makes the expression true.

Polynomial-Time Verification:

Before talking about the class of NP-complete problems, it is essential to introduce the notion
of a verification algorithm.

Many problems are hard to solve, but they have the property that it is easy to verify the
solution if one is provided.

Hamiltonian cycle problem:-

Consider the Hamiltonian cycle problem. Given an undirected graph G, does G have a cycle
that visits each vertex exactly once? There is no known polynomial time algorithm for this
problem.

Fig: Hamiltonian Cycle

Suppose that a graph did have a Hamiltonian cycle. It would be easy for someone to
convince us of this. They would simply say: "the cycle is ⟨v3, v7, v1, ..., v13⟩".

We could then inspect the graph and check that this is indeed a legal cycle and that it visits all
of the vertices of the graph exactly once. Thus, even though we know of no efficient way to
solve the Hamiltonian cycle problem, there is an efficient way to verify that a given cycle is
indeed a Hamiltonian cycle.
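
Such a verification algorithm might look like the sketch below, where the claimed cycle is passed as a list of vertices in visiting order. The function name and the graph representation are assumptions, and the graph is assumed to have at least three vertices.

```python
def is_hamiltonian_cycle(vertices, edges, cycle):
    """Check a claimed Hamiltonian cycle in an undirected graph."""
    n = len(vertices)
    if n < 3 or len(cycle) != n or set(cycle) != set(vertices):
        return False                        # must visit every vertex exactly once
    undirected = {frozenset(e) for e in edges}
    # Every consecutive pair, including last-to-first, must be an edge.
    return all(
        frozenset((cycle[i], cycle[(i + 1) % n])) in undirected
        for i in range(n)
    )


vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(is_hamiltonian_cycle(vertices, edges, [1, 2, 3, 4]))  # True
print(is_hamiltonian_cycle(vertices, edges, [1, 3, 2, 4]))  # False
```

The check does a constant amount of work per vertex and edge, so it runs in time linear in the size of the graph.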

Definition of Certificate: - A piece of information that allows a claimed solution to be
verified efficiently (here, the sequence of vertices on the cycle) is known as a certificate

Relation of P and NP classes

1. P is contained in NP
2. Is P = NP?

1. Observe that P is contained in NP. In other words, if we can solve a problem in polynomial
time, we can certainly verify a solution in polynomial time. More formally, we do not
need to see a certificate (there is no need to specify the vertices of the
specific path) to solve the problem; we can solve it in polynomial time anyway.
2. However, it is not known whether P = NP. Being able to verify a solution in polynomial
time does not obviously mean that a solution can also be produced in polynomial time.
Most researchers believe that P ≠ NP, but this has not been proved either way.

Reductions:

The class NP-complete (NPC) problems consist of a set of decision problems (a subset of class
NP) that no one knows how to solve efficiently. But if there were a polynomial solution for
even a single NP-complete problem, then every problem in NPC will be solvable in polynomial
time. For this, we need the concept of reductions.

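Suppose there are two problems, A and B. You know that it is impossible to solve problem A
in polynomial time. You want to prove that B cannot be solved in polynomial time either. We
want to show that (A ∉ P) => (B ∉ P). To do so, we reduce A to B in polynomial time: if B
could be solved in polynomial time, then so could A, which is a contradiction.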

Consider an example to illustrate reduction: The following problem is well-known to be NPC:

3-color: Given a graph G, can each of its vertices be labeled with one of 3 different colors such
that two adjacent vertices do not have the same label (color).

Coloring arises in various partitioning problems where there is a constraint that two objects
cannot be assigned to the same set of a partition. The term "coloring" comes from the original
application, which was in map drawing. Two countries that share a common border should
be colored with different colors.

It is well known that planar graphs (maps) can be colored with four colors. There exists a
polynomial time algorithm for this. But deciding whether this can be done with 3 colors is hard,
and there is no known polynomial time algorithm for it.
Fig: Example of 3-colorable and non-3-colorable graphs.
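
The gap between checking and finding a 3-coloring can be illustrated with the sketch below. The function names and the two sample graphs (a triangle and the complete graph on four vertices) are assumptions, not the graphs from the figure, and the search simply tries all 3^n labelings.

```python
from itertools import product


def is_valid_coloring(edges, coloring):
    """Polynomial-time check: no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def three_color(vertices, edges):
    """Exhaustive search over all 3^n labelings; returns a coloring or None."""
    for colors in product(range(3), repeat=len(vertices)):
        coloring = dict(zip(vertices, colors))
        if is_valid_coloring(edges, coloring):
            return coloring
    return None


triangle = [("x", "y"), ("y", "z"), ("x", "z")]
print(three_color(["x", "y", "z"], triangle) is not None)   # True

k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]    # complete graph K4
print(three_color([0, 1, 2, 3], k4))                        # None
```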

Polynomial Time Reduction:

We say that decision problem L1 is polynomial-time reducible to decision problem
L2 (L1 ≤p L2) if there is a polynomial-time computable function f such that for all x,
x ϵ L1 if and only if f(x) ϵ L2.

NP-Completeness

A decision problem L is NP-Hard if

L' ≤p L for all L' ϵ NP.

Definition: L is NP-complete if

1. L ϵ NP and
2. L' ≤p L for some known NP-complete problem L'.

Given this formal definition, the complexity classes are:

P: is the set of decision problems that are solvable in polynomial time.

NP: is the set of decision problems that can be verified in polynomial time.

NP-Hard: L is NP-hard if for all L' ϵ NP, L' ≤p L. Thus if we can solve L in polynomial time,
we can solve all NP problems in polynomial time.

NP-Complete L is NP-complete if

1. L ϵ NP and
2. L is NP-hard

If any NP-complete problem is solvable in polynomial time, then every NP-Complete problem
is also solvable in polynomial time. Conversely, if we can prove that any NP-Complete
problem cannot be solved in polynomial time, then no NP-Complete problem can be solved
in polynomial time.

Reductions

Concept: - A reduction converts any instance of one problem into an instance of another
problem in polynomial time. If a polynomial-time solution exists for one NPC problem, then
through reductions every other NPC problem can also be solved in polynomial time (but this
is hard to believe). Conversely, if one NPC problem has no polynomial-time solution, then
neither does any other.
Example: - Suppose there are two problems, A and B. You know that it is impossible to solve
problem A in polynomial time. You want to prove that B cannot be solved in polynomial time.
To do so, you convert problem A into problem B in polynomial time; a polynomial-time
algorithm for B would then give one for A.
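
As a concrete instance of converting one problem into another, the sketch below reduces Hamiltonian cycle to the decision version of TSP. It uses a common textbook construction (cost 0 for edges of the original graph, cost 1 for all other pairs, bound k = 0), which is assumed here rather than taken from these notes. The function names are also assumptions, and the exhaustive TSP solver exists only to check the reduction on tiny graphs with at least three vertices.

```python
from itertools import permutations


def hamiltonian_to_tsp(vertices, edges):
    """Polynomial-time transformation: graph -> (cost table, bound k)."""
    present = {frozenset(e) for e in edges}
    cost = {
        (u, v): 0 if frozenset((u, v)) in present else 1
        for u in vertices for v in vertices if u != v
    }
    return cost, 0      # a tour of cost 0 uses only original edges


def tsp_decision(vertices, cost, k):
    """Exhaustive check (exponential): is there a tour of cost at most k?"""
    first, rest = vertices[0], vertices[1:]
    n = len(vertices)
    for perm in permutations(rest):
        tour = [first, *perm]
        total = sum(cost[tour[i], tour[(i + 1) % n]] for i in range(n))
        if total <= k:
            return True
    return False


vertices = [1, 2, 3, 4]
square = [(1, 2), (2, 3), (3, 4), (4, 1)]      # has a Hamiltonian cycle
path = [(1, 2), (2, 3), (3, 4)]                # has none

print(tsp_decision(vertices, *hamiltonian_to_tsp(vertices, square)))  # True
print(tsp_decision(vertices, *hamiltonian_to_tsp(vertices, path)))    # False
```

The transformation fills in a table with one entry per ordered pair of vertices, so it takes O(|V|^2) time, and the graph has a Hamiltonian cycle exactly when the constructed instance has a tour of cost 0.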

Example of NP-Complete problem

NP problem: - Suppose a decision problem is given, i.e. one in which each input produces a
yes or no output.

Criteria for a problem to be NP-hard or NP-complete:

1. The point to be noted here is that the output is already given, and you can verify the
output/solution in polynomial time, but no way is known to produce an output/solution in
polynomial time.
2. Here we need the concept of reduction: when you cannot produce an output for
the given input directly, you rely on reduction, in which you convert one problem
into another problem.
