0% found this document useful (0 votes)
9 views69 pages

DDA - Module 1 - Introduction

The document provides an introduction to algorithms, detailing their definitions, properties, and the distinction between algorithms and programs. It also covers pseudocode conventions, various algorithm design techniques, and the process of analyzing algorithm performance through time and space complexity. Additionally, it includes examples of algorithms for common tasks such as searching, sorting, and recursion.

Uploaded by

Vinod Banoth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views69 pages

DDA - Module 1 - Introduction

The document provides an introduction to algorithms, detailing their definitions, properties, and the distinction between algorithms and programs. It also covers pseudocode conventions, various algorithm design techniques, and the process of analyzing algorithm performance through time and space complexity. Additionally, it includes examples of algorithms for common tasks such as searching, sorting, and recursion.

Uploaded by

Vinod Banoth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

J.B.

INSTITUTE OF ENGINEERING AND TECHNOLOGY


(UGC AUTONOMOUS)
Bhaskar Nagar, Moinabad Mandal, R.R. District, Hyderabad -500075

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

DESIGN AND ANALYSIS OF ALGORITHMS

Name of Faculty :[Link]

CHAPTER 1: INTRODUCTION TO ALGORITHMS

Introduction: Algorithm, Pseudo code for expressing algorithms, Performance Analysis –


Space Complexity, Time Complexity, Asymptotic Notation - Big-Oh Notation, Omega
notation, Theta Notation and Little-oh notation, Probabilistic analysis, amortized analysis.

Disjoint Sets – Disjoint set operations, Union and find algorithms, spanning trees, connected
components and bi-connected components.

1.1 ALGORITHM

An algorithm is any well defined computational procedure that takes some values as input and
produces some values as output. It is thus a sequence of computational steps that transform
input into output.

An algorithm is composed of finite steps each of which may require one or more operations.
Each operation may be characterized as either simple or complex.
An algorithm is a step by step procedure for performing some tasks in a finite amount of time.

According to [Link] a pioneer in the computer science discipline an algorithm must have
the following 5 basic properties

1
1. Input
2. Output
3. Finiteness
4. Definiteness
5. Effectiveness

Input: An algorithm has 0 or more inputs that are given to it initially before it begins (or)
dynamically as it runs.

Outputs: An algorithm has 1 or more outputs that have a specified relation with the inputs. An
algorithm produces at least one or more outputs.

Finiteness: An algorithm must always terminate after a finite number of steps.

Definiteness: Each operation specified in an algorithm must have definite meaning. Each step
of the algorithm must be precisely defined. Instructions such as “compute x/0 “ or “subtract 7
or 6 to x” are not permitted because it is not clear which of the two possibilities should be done
or what the result is.

Effectiveness: Each operation of an algorithm should be effective. i.e. The operation must be
able to be carried out in a finite amount of time. Tracing of each step should be possible.

Algorithm Vs Program: In computational theory, we distinguish between an algorithm and a


program. Program does not have to satisfy the finiteness condition. For example, we can think
of an operating system that continues in a “wait” loop until more jobs are entered. Such a
program does not terminate unless the system crashes. Since our programs always terminate,
we use “algorithm” and “program” interchangeably.

1.2 PSEUDO CODE

2
Pseudocode is a way of describe an algorithm or program in a natural language like English,
instead of a specific programming language. It is a blue print for writing code, and is used to
design, plan and evaluate algorithms and programs.

1.2. 1 Pseudo Code Conventions

Algorithm specification: Pseudo code conventions:

1) Comments begin with // and continue until the end of line.


2) Blocks are indicated with matching braces: { }. A compound statement can be
represented as a block.
3) All statements are separated or terminated by a semicolon ( ;).
4) An identifier begins with a letter followed by letters or digits or underscore.
5) Assignment of values to the variables is done using the assignment operator :=
6) There are two Boolean values true and false
7) Elements of single dimensional and multidimensional arrays.
8) Loops: while, do-while, repeat until, for

Ex1) while (condition) Ex2) do


{ {
statement1; statement1;
statement2; statement2;
----- -----
----- -----
} } while (condition);

Ex3) Repeat Ex4) for var:=value1 to value2 do


{ {
----- -----
------ -----
}until(condition); }

9) Conditional statements: if then else are written in the following form


Ex1) if (condition) then
statement;
Ex2) if (condition) then
statement1;

3
else
statement2;
10) Case statement
case(expression)
{
condition1: statement1;
condition2: statement2;
------
}
11) Input & Output
12) Structure: algorithm is a procedure consisting of a heading and body.
Algorithm name_of_algorithm(parameters)
{
----
----
}

Ex1) Algorithm to find sum of elements of a single dimensional array


Algorithm addition(a,n)
// a is an array of size n
{
sum:=0;
for i:= 1 to n do
sum:=sum+a[i];
return sum;
}

Ex 2) Algorithm to find largest number from the elements of a single dimensional array
Algorithm find_large(a,n)
// a is an array of size n
{
large:=a[1];
for i:= 2 to n do
if (a[i]>large) then large:=a[i];
return large;
}

4
Ex 3) Algorithm to find smallest number from the elements of a single dimensional array
Algorithm find_small(a,n)
// a is an array of size n
{
small := a[1];
for i:= 2 to n do
if (a[i]< small) then small := a[i];
return small;
}

Ex 4) Algorithm to find Factorial of a given integer number


Algorithm factorial(n)
{
fact:=1;
for i:= 1 to n do
fact:= fact*i;
return fact;
}

Ex 5) Algorithm to find ncr


Algorithm find_ncr(n,r)
{
fn:=1;
for i:= 1 to n do
fn:= fn*i;
fr:=1;
for i:= 1 to r do
fr:= fr*i;
k := n-r;
fk:=1;
for i:= 1 to k do
fk:= fk*i;
ncr:= fn / ( fr * fk);
return ncr;
}

Ex 6) Algorithm to search for a target element using linear search method

5
Algorithm linear_search(a,n,tar)
// a is an array of size n
{
pos:=0;
for i:= 1 to n do
if (a[i]=tar) then pos:=i;
return pos;
}

Ex 7) Algorithm to search for a target element using Binary search method


Algorithm binary_search(a,n,tar)
// a is an array of size n
{
low := 1;
high := n;
while ( low<=high)
{
mid:=(low+high)/2;
if (tar = a[mid]) then pos:=mid;
else
if (tar>a[mid]) then low:= mid+1;
else high := mid – 1;
}
return pos;
}

Ex 8) Algorithm to sort elements of a single dimensional array


Algorithm buble_sort(a,n)
// a is an array of size n
{
for i:=1 to n-1 do
for j:= 1 to n-i do
if ( a[j]>a[j+1]) then
{
temp:=a[j];
a[j]:= a[j+1];
a[j+1]:=temp;

6
}
}

Ex 9) Algorithm to sort elements of a single dimensional array using selection sort technique.
Algorithm selection_sort(a,n)
// a is an array of size n
{
for i := n downto 2 do
{
k:= 1;
for j:= 2 to i do
if (a[j]>a[k]) then k:=j;
temp:=a[i];
a[i]:= a[k];
a[k]:=temp;
}
}
1.3 RECURSION

If function all itself is called recursion


Ex 10) Recursive algorithm to add elements of an array.
Algorithm add(a,n)
// a is an array of size n
{
if (n=0) return 0;
else
return a[n]+add(a,n-1);
}

Ex 11) Recursive algorithm to find factorial of a given number.


Algorithm fact(n)
{
if (n<=1) then return 1;
else
return n*fact(n-1);
}

7
Ex 12) Recursive algorithm to find ncr
Algorithm ncr(n,r)
{
if (r=0) then return 1;
else
if (r=1) then return n;
else
if (r=n) then return 1;
else
return ( ncr(n-1,r)+ncr(n-1,r-1));
}

Ex 13) Algorithm to find nth Fibonacci number.


Algorthm fib(n)
{
if (n=1) then return 0;
else
if (n=2) then return 1;
else
return (fib(n-1)+fib(n-2));
}
Ex 14) Algorithm to solve Towers of Hanoi problem.
There are three towers named A,B and C. N disks were stacked on one tower(A) in decreasing
order of size from bottom to top. The problem is to move disks from tower A to tower b using
tower C as intermediate storage. Only one disk can be moved at a time. At any time bigger disk
cannot be placed on a smaller one.
Algorithm towersofhanoi(n,x,y,z)
// move the top n disks form tower x to tower y
{
if (n>=1) then
{
towersofhanoi(n-1,x,z,y);
write(“move disk “, n , “from tower”, x , “to tower”,y);
towersofhanoi(n-1,z,y,x);
}
}

8
1.4 DESIGN AND ANALYSIS OF ALGORITHMS

An algorithm is a sequence of steps to solve a problem. Design and analysis of algorithms is


the process of creating and evaluating instructions for solving problems using computers. It
involves creating an algorithm and then analyzing its efficiency. The main aim of designing an
algorithm is to provide optimal solution for a problem.

1.4.1 Process For Design and Analysis of Algorithms

1) Understand the Problem


The first thing you need to do before designing an algorithm is to understand problem
completely.

2) Decide on Exact Vs Approximate Solving


Solve the problem exactly if possible, otherwise use approximation methods. Even
though some problems are solvable by exact method, but they are not faster when
compared to approximation method. So in that situation, we will use approximation
method.

3) Algorithm design techniques.


For a given problem, there are many ways to design algorithms for it.
i) Divide & Conquer (D&C) - Ex Merge sort
ii) Greedy Method – Ex) knapsack problem, job scheduling
iii) Dynamic programming – Ex) Fibonacci – reuse of results of sub problem
iv) Backtracking – Ex- DFS
v) Branch and bound - Ex-travelling salesman problem
vi) Brute-force
vii) Decrease and conquer
viii) Transform and conquer
ix) Space and time Trade offs
Depending upon the problem, we will use suitable design method.

4) Prove Correctness
Once an algorithm has been specified, next we have to prove its correctness. Usually
testing is used for proving correctness.

9
5) Analyse Algorithm
Analysing the algorithm means studying the algorithm behavior, i.e. calculating the
time complexity and space complexity. If the time complexity is more, then we will
use one more designing technique such that time complexity should be minimum.

6)Coding an Algorithm
After completion of all phases successfully, then we will code an algorithm. Coding
should not depend on any programming language. We use general notation (pseudo
code) and English language statements.

7)Testing of Algorithms

Testing is a stage of implementation which is aimed at ensuring that the program


works accurately and efficiently before the live operations starts.

Testing a program consists of two phases, debugging and profiling (performance


measurement).

Debugging is the process of eliminating errors.

Profiling or performance measurement is the process of executing a correct program


on data sets and measuring the time and space it takes to compute results.

1.5 PERFORMANCE ANALYSIS

 Space Complexity
 Time Complexity

Algorithm analysis refers to the task of determining the computing time and storage
requirements of an algorithm. It is also known as performance analysis which enables us to
select an efficient algorithm.

We can analyze an algorithm by two ways.

10
1)By checking the correctness of an algorithm.
2)By measuring the time and space complexity of an algorithm.

To Compute the analysis of algorithm, two phases are required. They are
1)Priori analysis
2)Posteriori analysis

1)Priori analysis: This is one of the creative analysis of algorithms. In this we obtain a function
which bounds the algorithms computing time.
Suppose there is a statement in the middle of a program. We wish to determine the total time
that the statement will spend for the execution. This requires

i)The statements frequency count i.e. the number of times the statement will be executed
ii)The time taken for one execution
The product of these two numbers is the total time.

The priori analysis is mainly concerned with the order of determining the magnitude. The
notation used in priori analysis are Big-oh(O), Omega(Ω), theta(Θ) and small-oh(o). Priori
analysis of computing time ignores all of the factors, which are machine or programming
language dependent and only concentrates on determining the order of the magnitude of the
frequency of execution of the statements.

2)Posteriori Analysis: in this we will collect the actual statistics about the algorithm, in
conjunction of the time and space while executing. Once the algorithm is written it has to be
tested. Testing a program consists of two major phases.

i) Debugging: It is the process of executing programs on sample data sets that determine
whether we get proper results. If faulty results occur, it has to be corrected.

ii) Profiling: it is the process of executing a correct program on actual data sets and
measuring the time and space it takes to compute the results during execution. The
actual time taken by the algorithm to process the data is called profiling.

Space Complexity: The space complexity of an algorithm is the amount of memory it needs
to run to completion.

11
Time Complexity: The time complexity of an algorithm is the amount of computer time it
needs to run to completion.

1.5.1 Space Complexity

The space needed by the algorithms is seen to be the sum of the following components:

1) A fixed part that is independent of the characteristics of the inputs and outputs. This
part typically includes the instruction space, space for simple variables, space for
constants etc.

2) A variable part that consists of the space needed by component variables whose size
is dependent on the particular problem instance being solved; the space needed by
reference variables and the recursion stack space.

The space requirement S(P) of any algorithm P may therefore be written as S(P) = c +
Sp(instance characteristics).
----------------------------------------------------------------------------------------------------------------
Algorithm to find sum of two numbers
int findsum(a,b)
int a,b;
{
int sum;
sum=a+b;
return sum;
}

Calculate no of words or spaces needed by formal parameters and local parameters.


a needs 1 word
b needs 1 word
sum needs 1 word
Total 3 words are needed.
Sp (instance characteristics) = 0
Space requirement s[p] = 3+0 = 3
As 3 is constant the space complexity is considered as O(1)
----------------------------------------------------------------------------------------------------------------

12
float evaluate(a,b,c)
float a,b,c;
{
float sum
sum = a+b+b*c + (a+b-c)/(a+b)+4.0;
return (sum);
}
In this algorithm the problem instance is characterized by the specific values of a, b and c.
making the assumption that one word is adequate to store the values of each of a, b and c, and
the result, we see that space needed by evaluate function is independent of the instance
characteristics. Sp (instance characteristics)=0
Space requirement s[p] = 4+0 =4
One space for each a, b, c and sum i.e. total 4 words.
Space complexity is O(1)
--------------------------------------------------------------------------------------------------------------

Algorithm to find sum of n natural numbers using iteration (loop)


int findsum(n)
int n;
{
int i,sum;
sum=0;
for(i=1;i<=n;i++)
sum=sum+i;
return sum;
}

Sp (instance characteristics) =0
Space requirement s[p] = 3+0 =3
One space for each n, i, sum i.e. total 3 words.
Space complexity is O(1)
---------------------------------------------------------------------------------------
Algorithm to find sum of n natural numbers using recursion

int findsum(n)
int n;

13
{
if (n==0)
return ( 0);
else
return(n+findsum(n-1));
}

Fig.1.1 Sum of n natural numbers using recursion

In recursive functions we will have space for instance characteristics.


Every time it calls the function it will have space for n and return address. i.e. 2 words.

The function is called n+1 times.


Sp (instance characteristics) =2 for every function call
Total number of words needed are 2n+2
Space complexity is O(n)

Algorithm to find sum of elements of an array using iteration method


int addition(a,n)
int a[],n;
{
int sum,I; -------1
sum:=0.0; ---------1
for (i= 0;i< n;i++) ---------1
sum:=sum+a[i]; ---------n
return sum;
}

14
Space needed by n is one word; space needed by a is n words; space needed by I and sum one
word each.
S(addition)=3+n
Space complexity O(n)

Algorithm to find sum of elements of an array using Recursive method

int add(a,n)
int a[],n;
{
if (n==0)
return 0;
else
return (a[n-1]+add(a,n-1));
}

For example, assume that the array contains 5 elements. The recursive function will find sum
of elements as follows.

Fig.1.2 Recursive function for sum of elements in array

The instances are characterized by n. The recursion stack space includes space for the formal
parameters, the local variables, and the return address. Assume that the return address requires
only one word of memory. Each call to add function requires at least three words( n, return
address, pointer to a[]). Since the depth of recursion is n+1, the recursion stack space needed
is >=(3(n+1) words.
15
Recursion –stack space = 3(n+1) = 3n + 3 = O(n)

Algorithm to find factorial of a given number using iteration method.


Int factorial(int n)
{
int I,fn;
fn=1;
for(i=1;i<=n;i++)
fn=fn*I;
return(fn);
}

Space complexity: 1 word for n, 1 word for fn , 1 word for I


total 3 words space.
As 3 is constant so S(P)=O(1)

--------------------------------------------------------------------------------------------------------------
Algorithm to find factorial of a given number using recursive method.

Int factorial(int n)
{
if (n==1)
retrun(1);
else
return(n*factorial(n-1));
}

Space complexity of recursive factorial function is O(n)


Function calls same function again and again.
Each time the function is called it uses 2 words.
1 word for n and 1 word to return
The function will be called n times. Total words required are 2*n.
The S(p)=O(n)

Algorithm to find nth Fibonacci number

0 1 1 2 3 5 8 13 21 ------

16
int fibonacci(n)
int n;
{
int first, second, next, i;
first=0;
second=1;
for (i=3; i<=n; i++)
{
next = first + second;
first=second;
second=next;
}
return(next);
}

Calculate space needed for formal parameters and local parameters


n, first, second, next, i – all the variables need 1 word each
total 5 words. As 5 is constant the space complexity is derived as S(P)=O(1)

1.5.2 Time Complexity

The time taken by a program P is the sum of the compile time and the run time. The compile
time does not depend on the instance characteristics. Also we may assume that a compiled
program will run several times without recompilation. Consequently, we concern ourselves
with just the run time of a program. This runtime is denoted by tp(instance characteristics).

A program step is loosely defined as a syntactically or semantically meaningful segment of a


program that has an execution time that is independent of the instance characteristics.

Assignment statement is considered as executable instruction


comments count as 0 steps.
Variable declarations count as 0 steps
Return statement is 1 step
In an iterative statement such as the for, while, do-while and repeat-until statements, we
consider the step counts only for the control part of the statement.

17
-------------------------------------------------------------------------------------------------------
Algorithm to find sum of two numbers

int findsum(a,b)
int a,b;
{
int sum;
sum=a+b; --------------------------- 1
return sum; --------------------------- 1
}
Total 2 executable instructions
Time complexity is O(1)
--------------------------------------------------------------------------------------------------------
For Example: Let us consider the following algorithm.
float evaluate(a,b,c)
float a,b,c;
{
float sum
sum = a+b+b*c + (a+b-c)/(a+b)+4.0; -------------- 1
return (sum); -------------- 1
}
Total 2 executable instructions. As 2 is constant
Time complexity is O(1)
-----------------------------------------------------------------------------------------------------------

Algorithm to find sum of n natural numbers


int findsum(n)
int n;
{
int i,sum;
sum=0; ---------------- 1 time
for(i=1;i<=n;i++) ---------------- n+1 times
sum=sum+i; ---------------- n times
return sum; ----------------- 1 time
}
Total executable instructions = 2n+3
Time complexity is O(n)

18
Algorithm to find sum of n natural numbers using recursion
int findsum(n)
int n;
{
if (n==0) -------------------------- 1
return ( 0); ------------------------- 1
else
return(n+findsum(n-1)); ------------------------- 1 + x
}
This is a recursive function
Every recursive function it checks condition and returns something.
Every recursive call will have 2 executable instructions.
The function will be called n+1 times
So total executable instructions will be 2*(n+1) = 2n+2
Time complexity = O(n)

Fig.1.3 sum of n natural numbers using recursion

Algorithm to find sum of elements of an array using loop (iteration method)

Statement s/e frequency Total Steps


int addition(a,n) 0 - 0
int a[], n 0 - 0
{ 0 - 0
int sum; 0 0 0

19
sum = 0; 1 1 1
for (i=0;i<n;i++) 1 n+1 n+1
sum = sum+a[i]; 1 n n
return sum; 1 1 1
} 0 - 0
Total 2n+3

The number of steps per execution and the frequency of each of the statements in addition
algorithm have been listed. The total number of steps required by the algorithm is determined
to be 2n+3. It is important to note that the frequency of for statement is n+1 and not n. this is
so because i has to be incremented to n+1 before the for loop can terminate.

Algorithm to find sum of elements of an array using recursive method

int add(a,n) 0 - - 0 0
int a[], n; 0 - - 0 0
{ 0 - - 0 0
if (n=0) 1 1 1 1 1
return 0; 1 1 0 1 0
else 0 - - 0 0
return a[n]+add(a,n-1); 1+x 0 1 0 1+x
} 0 - - 0 0
Total 2 2+x

x=tadd(a,n-1)
Notice that under the s/e column, the else clause has been given a count of 1 + t add(a,n-1)
it includes all steps that get executed as a result of the invocation of add() from the else clause.
The frequency and total steps columns have been split into two parts: one for the case n=0 and
the other for the case n>0.

20
Fig.1.4 Sum of elements of array by using recursion

Algorithm to find sum of two matrices

Statement s/e frequency Total Steps


void addmat(a,b,c,m,n) 0 - 0
int a[][], b[][], c[][]; 0 - 0
int m,n; 0 - 0
{ 0 - 0
int i,j; 0 0 0
for (i=0;i<m;i++) 1 m+1 m+1
for (j=0;j<n;j++) 1 m(n+1) mn+m
c[i][j]=a[i][j]+b[i][j] 1 mn mn
} 0 - 0
Total 2mn+2m+1

Total executable instructions are = 2mn+2m+1


Time complexity = O(mn)
If matrices are square matrices Time complexity will be O(n2)

Example: The Fibonacci sequence of numbers starts as 0,1,1,2,3,5,8,13,21,34,55…..


Each new term is obtained by taking the sum of the two previous terms.
Algorithm to find nth Fibonacci number

21
int fibonacci(n) steps
int n;
{
int first, second, next, i;
first=0; --------------------------- 1
second=1; ---------------------------- 1
for (i=3; i<=n; i++) ---------------------------- n-2+1
{
next = first + second; ------------------------------ n-2
first=second; ----------------------------- n-2
second=next; ------------------------------n-2
}
return(next); ---------------------------- 1
}
Total steps = 2+ n-2+1 + n-2+n-2+n-2+1 = 4n-4 = O(n)
---------------------------------------------------------------------------------------------------------------
Algorithm to find factorial of a given number using iteration method.
int factorial(int n)
{
int i,fn;
fn=1; -------------------------------------- 1 statement
for(i=1;i<=n;i++) -------------------------------- n+1 statements
fn=fn*i; ---------------------------------- n statements
return(fn); -------------------------------- 1 statement
} ------------------------------total statements = 1+n+1+n+1 = 2n+3

Time complexity: Total statements are 2n+3


The time complexity will be O(n)
------------------------------------------------------------------------------------------------------------
Algorithm to find factorial of a given number using recursive method.

int factorial(int n)
{
if (n==1)
retrun(1);
else
return(n*factorial(n-1));

22
}

Fig.1.5 Factorial of a given number using recursive method.

Time Complexity:
Every time the function called it executes two statements
if statement and return statement.
The factorial function will be called repeatedly n times. So total statements that are executed
are 2n. The time complexity will be O(n).

1.6. ORDER OF GROWTH

The size of n is either increased or decreased. The time analysis may not be always linearly
proportional, because it depends upon what kind of basic operations we have in the algorithm.
Therefore, while analyzing an algorithm’s time efficiency the order of growth of n is also
important. We expect the algorithms must execute faster, but for as n varies the execution time
varies. To understand the meaning of the order of growth, let us list the most common
computing time functions.

[Link] Function Name


1 1 Constant
2 log n Logarithmic
3 n Linear
4 n log n n log n
5 n2 Quadratic

23
6 n3 Cubic
7 2n Exponential
8 n! Factorial

The functions shown are quite common in the analysis of algorithms. These functions have
strong significance in terms of their relative values. The growth of these common functions
typical values for n can be considered

n log n n log n n2 n3 2n n!
1 0 0 1 1 2 1
2 1 2 4 8 4 2
4 2 8 16 64 16 24
8 3 24 64 512 256 40320
16 4 64 256 4096 65536 20922789888000
32 5 160 1024 32768 4.2x109 2.6x1035
105 17 1.7x106 1010 1015 -
106 20 2.0x107 1012 1018 -

The function log n grows very slowly compared to any other function with respect to n. But
2n and n! grows very fast.

1.7 ASYMPTOTIC NOTATIONS

Asymptotic Notations are mathematical tools used to analyze the performance of


algorithms by understanding how their efficiency changes as the input size grows.

These notations provide a concise way to express the behavior of an algorithm’s time or
space complexity as the input size approaches infinity.

There are mainly four asymptotic notations:


 Big-Oh Notation (O-notation)
 Omega Notation (Ω-notation)
 Theta Notation (Θ-notation)
 Little-Oh Notation (o-notation)

24
1.7.1 Big-Oh Notation (O):

Definition: The function f(n) = O(g(n)) (read as f of n is big-oh of g of n) if and only if


there exists positive constants c and n0 such that f(n) <= c * g(n) for all n, n>= n0

Big-O notation represents the upper bound of the running time of an algorithm. Therefore,
it gives the worst-case complexity of an algorithm

It is the most widely used notation for Asymptotic analysis.

It specifies the upper bound of a function.

The maximum time required by an algorithm or the worst-case time complexity.

It returns the highest possible output value (Big-O) for a given input.

Big-O (Worst Case) It is defined as the condition that allows an algorithm to complete
statement execution in the longest amount of time possible.

If f(n) describes the running time of an algorithm, f(n) is O(g(n)) if there exist a positive
constant C and n0 such that, 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 . It is graphically shown in
fig.1.5.

It returns the highest possible output value (Big-O) for a given input.

The execution time serves as an upper bound on the algorithm’s time complexity.

25
Fig.1.6 Big-Oh Notation Graph

Mathematical Representation of Big-O Notation:

O(g(n)) = { f(n): there exist positive constants c and n0 such that

0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }

For example, Consider the case of Insertion Sort. It takes linear time in the best case
and quadratic time in the worst case. We can safely say that the time complexity of
the Insertion sort is O(n2).

Note: O(n2) also covers linear time.

U{ 100 , log (2000) , 104 } belongs to O(1)


U { (n/4) , (2n+3) , (n/100 + log(n)) } belongs to O(n)
U { (n2 +n) , (2n2) , (n2+log(n))} belongs to O(n2)

Big-oh notation gives an upper bound and a function f(n). The upper bound of f(n) indicates
that the function f(n) will be the worst case that it doesnot consume more than this
computing time.

Problems on O nation

Problem-1: Show that f(n)=O(n) where f(n)=5n+50

Step1: Select the dominate term from 5n+50

Dominate term = 5n, Let g(n)=n

Step 2: Choose c, let c=6 , c*g(n)=6*n

Step 3: Apply O nation to find 𝑛0

26
So 𝑐0 =50

5n+50<=6n for all n>=𝑛0 Hence f(n)=O(g(n)), Where c=6 and 𝑐0 =50

350
300
250
200
150
100
50
0
10 20 30 40 50 51

f(n) c*g(n)

Problem-2: Show that f(n)=O(n) where f(n)=3n+8

Step1: Select the dominate term from 3n+8

n 5n+50 6n Condition
(50n+50<=6n)
10 100 60 False
20 150 120 False
30 200 180 False
40 250 240 False
50 300 300 True
51 301 306 True
Dominate term = 3n , Let g(n)=n

Step 2: Choose c, let c=4 , c*g(n)=4*n

Step 3: Apply O nation to find 𝑛0

27
So 𝑐0 =8

n 𝑛2 +10 2*𝑛2 Condition


(𝑛2 +10<=2*𝑛2 )
1 11 2 False
2 14 8 False
3 19 18 False
4 26 32 True

3n+8<=4n for all n>=𝑛0 where c=4 and 𝑐0 =8


Hence f(n)=O(n)

Problem-3: Show that f(n)=O(𝑛2 ) where f(n)= 𝑛2 +10

Step1: Select the dominate term from 𝑛2 +10

Dominate term = 𝑛2 , Let g(n)= 𝑛2

Step 2: Choose c, let c=2 , c*g(n)=2*𝑛2


n 3n+8 4n Condition
(3n+8<=4*n)
7 29 28 False
8 32 32 True
9 35 36 True

Step 3: Apply O nation to find 𝑛0

So 𝑐0 =4

𝑛2 +10<=2*𝑛2 for all n>=𝑛0 where c=2 and 𝑐0 =4


Hence f(n)=O(g(n))

Problem-4: Show that f(n)=O(𝑛3 ) where f(n)= 2𝑛3 − 2𝑛2

28
Step1: Select the dominate term from 2𝑛3 − 2𝑛2

Dominate term = 𝑛3 , Let g(n)= 𝑛3

Step 2: Choose c, let c=2 , c*g(n)=2*𝑛3

Step 3: Apply O nation to find 𝑛0

So 𝑐0 =1

𝑛3 − 2𝑛2 ≤ 2*𝑛3 for all n>=𝑛0 where c=2 and 𝑐0 =1


Hence f(n)= O(𝑛3 )

Problem-5: Show that f(n)=O(2𝑛 ) where 𝑓(𝑛) = 2𝑛 + 3𝑛4


n 2𝑛 + 3𝑛4 2𝑛+1 2𝑛 + 3𝑛4 ≤ 2𝑛+1

10 4024 2048 False


11 6041 4096 False
13 14783 16384 True

Step1: Select the dominate term from 2𝑛 + 3𝑛4

Dominate term = 2𝑛 , Let g(n)= 2𝑛

Step 2: Choose c, let c=2 , c*g(n)=2*2𝑛 = 2𝑛+1

Step 3: Apply O nation to find 𝑛0

n 2𝑛3 − 2𝑛2 2*𝑛3 Condition


𝑛3 − 2𝑛2 ≤ 2*𝑛3
1 0 2 True
2 8 16 True
3 36 54 True
4 96 120 True

2𝑛 + 3𝑛4 <=2𝑛+1

29
𝑛0 = 13

n 𝑛4 + 100𝑛2 + 35 2*𝑛4 𝑛4 + 100𝑛2 + 35<=2*𝑛4

10 20035 20000 False


11 26776 29285 True
13 14783 16384 True
2𝑛 + 3𝑛4 ≤ 2𝑛+1 for all n>=𝑛0 where c=2 and 𝑐0 =13
Hence f(n)=O(2𝑛 )

Problem-6: Show that f(n)=O(𝑛4 ) where 𝑓(𝑛) = 𝑛4 + 100𝑛2 + 35

Step1: Select the dominate term from 𝑛4 + 100𝑛2 + 35

Dominate term = 𝑛4 , Let g(n)= 𝑛4

Step 2: Choose c, let c=2 , c*g(n)=2*𝑛4

Step 3: Apply O nation to find 𝑛0

.𝑛4 + 100𝑛2 + 35<=2*𝑛4 for all


𝑛0 = 11

𝑛3 − 2𝑛2 ≤ 2*𝑛3 for all n>=𝑛0 where c=2 and 𝑐0 =11


Hence f(n)= O(𝑛4 )

Other examples

Ex 1) Given f(n) = 3n+2, then prove that f(n) = O(n).


f(n) = 3n+2
= 3n+2 <= 5n for all n>=1
The above inequality can be satisfied using the definition of Big-oh-notation by
setting c=5, g(n)=n and n0 = 1.
Therefore f(n)= O(n).

30
Ex 2) Given f(n) = 20n3-3 then prove that f(n) = O(n3)
f(n) = 20n3-3
20n3 – 3 <= 20n3 for all n>=1
The above inequality can be satisfied using the definition of Big-oh notation by setting
c=20 and g(n)=n3, n0=1 Therefore f(n) = O(n3)

Ex 3) Given f(n) = 10n2 + 4n +3 then prove that f(n) = O(n2) .


f(n) = 10n2 + 4n +3
10n2 + 4n +3 <= 10n2 + 5n for all n>= 3
5n<=n2 for all n>= 5
10n2 + 4n +3 <= 10n2 + n2 for all n>= 5
10n2 + 4n +3 <= 11n2 for all n>= 5
The above inequality can be satisfied using the definition of Big-oh-notation by setting
c=11 g(n)=n2 and n0=5 Therefore f(n)=O(n2).

We write O(1) to mean a computing time that is a constant


O(n) is called linear
O(n2) is called quadratic
O(n3) is called cubic
O(2n) is called exponential
O(log n) is called logarithmic
O(n log n) is called n log n
The relation among these computing time is
O(1) < O(log n) < O(n) < O(n log n) < O(n2) < O(n3) < O(2n)

Ex 4) f(n) = n3 log n+ n2 log n + n log n


Time complexity is O ( n3 log n).
Ex 5) f(n) = n2 + n3 + 4n = O(4n)
The time complexity is O(4n)
Ex 6) for =1 to n div 2 do
for j=1 to n*n do
x=x+1
Time complexity = (n/2) n2 = n3/2 = O(n3)

1.7.2 Omega Notation Ω

31
Definition: f(n)=Ω(g(n)) iff there exists two positive constants c and n0 such that
f(n)>=c.g(n) for all n>= n0

Omega notation represents the lower bound of the running time of an algorithm. Thus, it
provides the best case complexity of an algorithm.

The execution time serves as a lower bound on the algorithm’s time complexity.

It is defined as the condition that allows an algorithm to complete statement execution in the
shortest amount of time.

Let g and f be the function from the set of natural numbers to itself. The function f is said to
be Ω(g), if there is a constant c > 0 and a natural number n0 such that c*g(n) ≤ f(n) for all n
≥ n0 . It is graphically shown in fig 1.6

Fig.1.7 Omega Notation graph

Mathematical Representation of Omega notation:


Ω(g(n)) = { f(n): there exist positive constants c and n0 such that
0 ≤ cg(n) ≤ f(n) for all n ≥ n0 }

Let us consider the same Insertion sort example here. The time complexity of Insertion Sort
can be written as Ω(n), but it is not very useful information about insertion sort, as we are
generally interested in worst-case and sometimes in the average case.

{ (n2+n) , (2n2) , (n2+log(n))} belongs to Ω( n2)


U { (n/4) , (2n+3) , (n/100 + log(n)) } belongs to Ω(n)
U { 100 , log (2000) , 104 } belongs to Ω(1)
32
This notation is used to find the lower bound behavior of f(n). The lower bound implies that
below this time the algorithm cannot perform better. It is represented mathematically by
notation Ω.

n 10𝑛2+5 10*𝑛2 10𝑛2+5>=10*𝑛2

1 15 10 False
2 45 40 True
Problems on Omega Notation

Problem -1: Show that f(n)= Ω(n2), if f(n)=10𝑛2+5

Step1: Select the dominate term from 10𝑛2+5

Dominate term = 𝑛2 , Let g(n)= 𝑛2

Step 2: Choose c, let c=10 , c*g(n)=10*𝑛2

Step 3: Apply O nation to find 𝑛0

10𝑛2+5>=10*𝑛2

.
10𝑛2+5>=10*𝑛2 for all n>=𝑛0 and c=10 where 𝑛0 = 2

80
70
60
50
40
30
20
10
0
1 2 3 4 5
f(n) c*g(n)

33
Examples

Ex 1) Given f(n) = 3n + 2 then prove that f(n) = Ω(n).


3n+2 >= 3n for all n>=1
The above inequality is satisfied according to Ω definition by setting
c=3 g(n)=n n0=1
Therefore f(n)=Ω(n)

Ex 2) Given f(n) = 50n2+10n -5 then prove that f(n) = Ω(n2).


50n2+10n -5 >= 50n2 For all n>=1
The above inequality is satisfied according to Ω definition by setting
C=50 g(n)=n2 and n0=1 Therefore f(n) = Ω(n2)

Ex 3: f(n) = 100n+6, then prove that f(n) = Ω(n).

100n+6>=10n, c=10, 𝑛0 = 1

Ex4: 𝑓 (𝑛) = 10𝑛2 + 4𝑛 + 2 then prove that f(n) = Ω(𝑛2 ).

10𝑛2 + 4𝑛 + 2 ≥ 10𝑛2 where c=10 and 𝑛0 = 1

Ex5: 𝑓(𝑛) = 6 ∗ 2𝑛 + 𝑛2 then prove that Ω(2𝑛 ).

6 ∗ 2𝑛 + 𝑛2 ≥ 10 ∗ 2𝑛
c=6, , 𝑛0 = 1

Ex6: 𝑓(𝑛) = 𝑛4 + 3𝑛3 + 2𝑛 + 1, 𝑔(𝑛) = 𝑛3 + 4,

Prove f(n)= Ω(g(n)).

𝑛4 + 3𝑛3 + 2𝑛 + 1 ≥ 1 ∗ (𝑛3 + 4)

c=1 and 𝑛0 = 1

34
Ex 7: Following statements are false
3𝑛 + 3 = Ω(1).
10𝑛2 + 4𝑛 + 2 = Ω(n).

Ex 8: Prove that

𝑓(𝑛) = 7𝑛 + 4 𝑎𝑛𝑑 𝑔(𝑛) = 𝑛, 𝑠ℎ𝑜𝑤 𝑡ℎ𝑎𝑡 𝑓 (𝑛) = Ω(n).

𝑓(𝑛)
lim ≠0
𝑛→∞ 𝑔(𝑛)

4
7𝑛 + 4 7+𝑛
lim = lim =7≠0
𝑛→∞ 𝑛 𝑛→∞ 1

1
𝑁𝑜𝑡𝑒: =0

Hence 𝑓(𝑛) = Ω(n)

Ex 9: 𝑓 (𝑛) = 3𝑛 𝑎𝑛𝑑 𝑔(𝑛) = 𝑛2 , 𝑠ℎ𝑜𝑤 𝑡ℎ𝑎𝑡 𝑓(𝑛) ≠ Ω(𝑛2 ).

We have to show
𝑓(𝑛)
lim =0
𝑛→∞ 𝑔(𝑛)

3𝑛 3
lim = lim =0∗3 =0
𝑛→∞ 𝑛 2 𝑛→∞ 𝑛

Hence 𝑓 (𝑛) ≠ Ω(𝑛2 ).

1.7.3 Theta Notation(θ)

Definition: Let g and f be the function from the set of natural numbers to itself. The function
f is said to be Θ(g), if there are constants c1, c2 > 0 and a natural number n0 such that c1*
g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0
35
Theta notation encloses the function from above and below. Since it represents the upper and
the lower bound of the running time of an algorithm, it is used for analysing the average-
case complexity of an algorithm. It is graphically shown in fig 1.7

Theta (Average Case) You add the running times for each possible input combination and
take the average in the average case.

Fig.1.8 Theta Notation Graph

Mathematical Representation of Theta notation:

Note: Θ(g) is a set


The above expression can be described as if f(n) is theta of g(n), then the value f(n) is always
between c1 * g(n) and c2 * g(n) for large values of n (n ≥ n0). The definition of theta also
requires that f(n) must be non-negative for values of n greater than n0.

The execution time serves as both a lower and upper bound on the algorithm’s time
complexity.

It exists as both, most, and least boundaries for a given input value.

A simple way to get the Theta notation of an expression is to drop low-order terms and
ignore leading constants. For example, Consider the expression 3n3 + 6n2 + 6000 =
Θ(n3), the dropping lower order terms is always fine because there will always be a
number(n) after which Θ(n3) has higher values than Θ(n2) irrespective of the constants
involved.

36
{ 100 , log (2000) , 10^4 } belongs to Θ(1)
{ (n/4) , (2n+3) , (n/100 + log(n)) } belongs to Θ(n)
{ (n^2+n) , (2n^2) , (n^2+log(n))} belongs to Θ( n2)

Examples

1
Example :1 𝑓 (𝑛) = 2 𝑛2 − 3𝑛 𝑡ℎ𝑒𝑛 𝑠ℎ𝑜𝑤 𝑡ℎ𝑎𝑡 𝑓(𝑛) = Θ(𝑛2 )

c1* g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0

Part-I: To show c1* g(n) ≤ f(n)

1
Step 1: Select dominate term from 𝑛2 − 3𝑛
2

Dominate term=𝑛2 , g(n)= 𝑛2

Step 2:

1
𝑐1 ∗ 𝑛2 ≤ 2 𝑛2 − 3𝑛 ≤ 𝑐2 ∗ 𝑛2
1 3
𝑐1 ≤ − ≤ 𝑐2
2 𝑛

Step 3: Select c1

n 1 3 Value

2 𝑛
1 -ve
1 3

2 1
2 1 3 -ve

2 2

3 1 3 -ve

2 3

4 1 3 -ve

2 4

37
5 1 3 -ve

2 5

6 1 3 0

2 6

7 1 3 1

2 7 14

1
c1=14 and n0=7

Part-II

Find c2

1 3
− ≥ 𝐶2
2 𝑛
1
C2=2

1 1 3 1
≤ − ≤
14 2 𝑛 2

1 1
c1=14 , c2=2 and n0=7

Example 2: 𝑓 (𝑛) = 𝑛4 + 3𝑛2 +5n+1 and 𝑔(𝑛) = 𝑛4 + 1

Show that 𝑓 (𝑛) = Θ(𝑛4 )

c1* g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0

PART-1

38
C1=1

1*(𝑛4 + 1)<= 𝑛4 + 3𝑛2+5n+1 and n0=1

1 ∗ (1 + 1) ≤ 1 + 3 + 5 + 1
2<=10

PART-II

C2=10

𝑛4 + 3𝑛2 +5n+1 <=10*(𝑛4 + 1)

10<=10*2
10<=20
n0=1

C1=1 , C2=10 and n0=1

Example 3: show that f(n)= Θ(n) if f(n)=3n+2

3n<=3n+2<=4n

C1=2 , C2=4 and n0=2

3*2<=3+2<=4*2

6<=6<=8

For some functions the lower bound and the upper bound may be same. i.e. big Oh and omega
will have the same function. For example, to find minimum or maximum of an array of
elements the computing time is O(n) and Ω(n).

39
There exists a special notation to denote for functions having the same time complexity for
lower and upper bounds and this notation is called Theta θ Notation.
Def: f(n)=θ (g(n)) iff there exists three positive constants c 1 c2 and n0 with the constraint that
c1 g(n) <= f(n) <= c2 (g(n) for all n >= n0
Ex 1) f(n)=1/2n2-3n θ(n2)
Ex2 f(n)=3n+2 θ(n)
Ex3 f(n)=10n2+4n+2 θ(n2)
Ex 4 f(n)=10xlogn+4 θ(log n)

1.7.4 Little-Oh Notation (o) :

Definition: f(n) = o(g(n), iff f(n) < c.g(n) for any constants c > 0, n0>0 and n>n0.
or
f ( n)
lim 0
n  g ( n)
The asymptotic upper bound provided by O-notation may or may not be asymptotically tight.
The bound 2n2=O(n2) is asymptotically tight, but the bound 2n = O(n2) is not. We use o-
notation to denote an upper bound that is not asymptotically tight.

Ex1) Given f(n)=3n+2 The time complexity = o(n2)


Ex2) Given f(n)=2n, then prove that f(n)=o(n!).
2n
f (n)  lim =0
n  n !

Therefore f(n)=o(n!)

Big O is used as a tight upper bound on the growth of an algorithm’s effort (this effort is
described by the function f(n)), even though, as written, it can also be a loose upper bound.
“Little o” (o) notation is used to describe an upper bound that cannot be tight.

In the domain of algorithm analysis, the little o notation is a valuable tool used to describe
the behavior of functions as they approach certain limits. When we say that a
function f(n) is o(g(n)), we are essentially stating that f(n) grows slower than g(n) as n

40
approaches infinity. In simpler terms, if f(n) = o(g(n)), it means that g(n) grows faster
than f(n).

Fig.1.9 Little-Oh Notation (o)

Thus, Little o means loose upper-bound of f(n). Little o is a rough estimate of the
maximum order of growth whereas Big-O may be the actual order of growth.

1.8 BEST CASE, WORST CASE AND AVERAGE CASE


EFFICIENCIES

 Best Case: It is the minimum number of steps that can be executed for a given parameter.
 Worst case: It is the maximum number of steps that can be executed for a given parameter.
 Average case: It is the average number of steps executed for a given parameter.

Ex 1) Linear search (Sequential Search)


Best Case:
2 4 5 6 8 9 10 11 13
1 2 3 4 5 6 7 8 9
If we want to search an element 2, whether it is present in the array or not, first A[1] is compared
with 2. Match occurs. So the number of comparisons is only one. It is observed that search
takes minimum number of comparisons, so it come under best case.
Time complexity is O(1).

41
Average Case: If we want to search an element 8, whether it is present in the array or not .
first A[1] is compared with 8, no match occurs. Compare A[2],A[3] and A[4] with 8, no match
occurs. Up to now 4 comparisons taken place. Now compare A[5] and 8 so match occurs. The
number of comparisons are 5. It is observed that search takes average number of comparisons.
So it comes under average case. If there are n elements, then we require n/2 comparisons. Time
complexity is O(n/2) which is O(n). we can neglect constant.

Worst Case: If we want to search an element 13, whether it is present in the array or not. First
A[1] is compared with 13. No match occurs. Continue this process until element is found or
the list exhausted. The element is found at 9th comparison. So number of comparisons are 9. It
is observed that search takes maximum number of comparisons. So it comes under worst case.
Time complexity is O(n)

Note: If the element is not found in the list then we have to search entire list, so it comes under
worst case.

Ex 2) Binary Search: The array elements are in ascending/descending order. The target is
compared with middle element. If it is equal, we stop our search otherwise we repeat the same
process either in the first half or second half of the array depending on the value of target.

Best case time complexity: O(1) : If the target found in the first comparison then it is
called best case

Average case time complexity: O(log n) :Average number of comparisons

Worst case time complexity: O(log n) : maximum number of comparisons

Assume that the number of elements is considered as 2 m as every time the list is divided into
two halfs.
n = 2m
log (n) = log(2m )
log (n) = m log(2 )
m = log(n)/log(2)
m = log(2n) i.e. log n base 2
m = log (n)
Maximum number of comparisons will be m in a binary search.

42
1.9 PROBABILISTIC ANALYSIS AND AMORTIZED
ANALYSIS

[Link] Probabilistic Analysis Amortized analysis


1 Probabilistic Analysis is a technique Amortized analysis means finding the average
that uses probability theory to running time per operation over a worst case
estimate the average behavior of an sequence of operations.
algorithm under different scenarios.
2 Probabilistic analysis assumes Amortized analysis assumes worst case input
random choices. and typically does not allow random choices.
3 In average case analysis, we are whereas in amortized analysis we are
averaging over all possible inputs averaging over a sequence of operations.
4 A randomized algorithm or There are several techniques used in amortized
probabilistic algorithm is an analysis.
algorithm which employs a degree of 1)Aggregate analysis.: In this type of analysis
randomness as part of its logic. the upper bound T(n) on the total cost of a
Probabilistic analysis is based on sequence of n operations is decided, then the
assuming something about the set of average cost is calculated as T(n)/n.
all possible inputs. This assumption is
then used to design an efficient 2)Accounting Method: In this method the
algorithm. If the characteristics of the individual cost of each operation is
input to an algorithm is unknown, determined, by combining immediate
Example, we do not know the execution time and its influence on the running
distribution of the inputs, time of future operations.
probabilistic analysis cannot be used.
3)Potential method: it is like the accounting
method, but overcharges operations early to
compensate for undercharges later.
5 Probability is involved. Average Amortised analysis differs from average
performance is taken performance, in that probability is not
involved. Actually amortised analysis is an
analysis technique that provides tight bounds
on the worst case running time of an algorithm

43
1.10 DISJOINT SETS
SET:
Set is a collection of distinct elements.
For example a set s1 can be represented as S1={1,2,5,10}

Set Representation:
The set will be represented as the tree structure, where all children will store the address
of parent/root node. The root node will store NULL at the place of parent address. In the
given set of elements any element can be selected as root node. Generally, we select first
node as root node.

For example S1={1,7,8,9} S2={2,5,10} S3={3,4,6}


These sets can be represented as follows:

Fig.1.10 Possible tree reorientation of sets

DISJOINT SETS:
A pair of sets which does not have any common element are called disjoint sets.

For example set A = {2,3} and set B={4,5}. Set A and B are disjoint sets.

Set C={3,4,5} and set D={3,6,7}. Set C and D are not disjoint as both sets C and D are
having 3 as common elements.

Consider a set S={1,2,3,4,5,6,7,8,9,10}. These elements can be partitioned into three


disjoint sets.
A = {1,2,3,6} B= {4,5,7} C= {8,9,10}

44
Each set can be represented as a tree. Notice that for each set we have linked the nodes
from the children to the parent, rather than our usual method of linking from parent to
children.

A B C

Fig.1.11 Representation of the Sets A,B,C

These sets can be represented by storing every element of the set in the same array. Since
the set elements are numbered 1 through n, we represent the tree nodes using an array
P[1:n], where n is the maximum number of elements. The ith element of this array
represents the tree node that contains element i. This array element gives the parent
pointer of the corresponding tree node. Notice that the root nodes have parent as -1.

i [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
parent -1 1 1 -1 4 1 4 -1 8 8

Fig.1.12 Array representation of sets A,B,C

1.10.1 Operations on Disjoint Sets

The operations on disjoint sets are


1. UNION
2. FIND

UNION Operation:

UNION(i,j) it means the elements of Set i and elements of Set j are combined. If we want to
represent UNION operation in the form of a tree; the UNION(i,j) i is parent ; j is the child.

45
UNION of S1 and S2 𝑆1 U𝑆2

To obtain the union of two sets, all that has to be done is to set the parent field of one of the
roots to the other root. This can be accomplished easily if with each set name, we keep a pointer
to the root of the tree representing the set.

Algorithm UNION(i,j) //replace the disjoint sets with roots i and j.


{
integer i,j
PARENT(j):=i;
}
Example: S1={1,7,8,9} S2={2,5,10}
S1 U S2={1,2,5,7,8,9,10}

Fig.1.13 Trees representation of sets

Fig.1.14 Possible representation of 𝑆1 U𝑆2

FIND Operation:
FIND(i) implies that it finds the root node of ith node, in other words, it returns the name of the
set.

UNION(1,3)

46
FIND(1) ------------1
FIND(3)-------------1 since its parent is 1 (i.e root node is 1)

Algorithm FIND(i) //find the root for the tree containing elements
{
integer i,j
j = i;
while (parent[j]>0) do
j:=parent[j];
return j;
}

For Example: S1={1,7,8,9} S2={2,5,10} S3={3,4,6}

Then Find(4) = S3 Find(5) = S2 Find(7) = S1

To perform find operation along with the tree structure we need to maintain the name of each
set. So, we require one more data structure to store the set names. The data structure contains
two fields. One is the set name and the other is the pointer to root.

Fig.1.15 Data representation of for 𝑆1 ,𝑆2 and 𝑆3

1.10.2 Analysis of simple union and simple find

Although the SimpleUnion(i,j) and SimpleFind(i) algorithms are easy to state, their
performance characteristics are not very good. For example, consider the sets

47
Fig.1.16 Number of sets

Then if we want to perform the following sequence of operations


Union(1,2), Union(2,3), Union(3,4)-----Union(n-1,n) and
sequence of Find(1), Find(2), Find(3)-----Find(n)

The sequence of Union operations results the degenerate tree as below. Since the time taken
for union is constant, the n-1 sequence of union can be processed in time O(n).

Fig.1.17 Degenerate tree

and for the sequence of Find operations it will take O(n2).


We can improve the performance of union and find operations by avoiding the creation of
degenerative tree by applying weighting rule for union.

1.10.3 Weighting rules for union

The weighting rule for union is a technique for joining two trees in a disjoint set data structure.
It involves making the root of smaller tree point to the root of larger tree. This technique is also
known as weighted quick union.

If the number of nodes in a tree i is less than the number in tree j, then make j the parent of i,
otherwise make i the parent of j. Both arguments of UNION must be roots.
To implement the weighting rule, we need to know how many nodes are there in every tree. To
do this easily, we maintain a count field in the root of every tree. If i is a root node, then count[i]
48
equals the number of nodes in that tree. Since all the nodes other than the roots of trees have a
positive number in the P field, we can maintain the count in the P field of roots as a negative
number.
Using this convention, the time required to perform a union has increased somewhat but
still bounded by a constant. i.e O(1).

Assume that there are n sets initially

UNION (1,2) the number of nodes in tree 1 is equal to the number of nodes in tree 2

UNION (1,3) the number of nodes in tree 1 is 2. The number of nodes in tree3 is 1
So 2>1 make parent as 1 and 3 as child

Apply Union for all the n sets he final tree will be as follows

Fig. 1.18 Trees obtained using the weighted rule

49
Algorithm WeightedUNION(i,j)
{
// UNION sets with roots i and j, i ≠ j, using weighting rule
// P[i] = -count[i] and P[j] = -count[j]

temp:=p[i]+p[j];
if (p[i]>p[j] then
{
// i has fewer nodes
p[i]:=j;
p[j]:=temp;
}
else
{
// j has fewer nodes
p[j]:=i;
p[i]:=temp;
}
}

1.10.4 Collapsing Rule

The collapsing rule refers to a technique used during the FIND operation where, when
traversing the path from a node to its root, every node on that path is directly linked to the root,
effectively collapsing the path and improving the efficiency of future find operations on these
nodes. If a node is on the path to the root, set its parent to the root itself.

If j is a node on the path from i to its root then set Parent(j)←root(i). A more sophisticated
algorithm of FIND using collapsing rule is given below.

Algorithm Find(i)
{
j←i;
while (parent[j]>0)
{
j←parent[j];

50
}
k←i;
while(k≠j)
{
t←parent[k];
parent[k]←j;
k←t;
}
return(j);
}

1.10.5 UNION and FIND Operations

Implement the following UNION and FIND Operations

UNION(1,2) UNION(3,4) UNION(5,6) UNION(7,8)


UNION(1,3) UNION(5,7) UNION(1,5)

51
Fig.1.19 Trees achieving worst-case bond

Next let us we apply 8 FIND operations as follows


FIND(8), FIND(8), FIND(8), FIND(8),FIND(8),FIND(8),FIND(8), FIND(8)

If SimpleFind() is used each Find(8) requires going up three parent link fields upward to reach
the root node. Similarly for 8 FIND operations it will require 8*3=24 moves.

If we use FIND algorithm with collapsing rule, the first FIND(8) requires going up 3 link fields
(moves) and then resetting 3 links. Each of the remaining 7 FINDs requires going up only 1
move (link field), the total cost is now only 13 moves.

1.11 SPANNING TREES

A spanning tree for a connected undirected graph G=(V,E) is a sub group of G that is an
undirected tree and contains all the vertices of g. A spanning tree of a graph should include all
the vertices and a subset of edges.

A Spanning tree of a graph is any tree that includes every vertex in the graph.

52
A Spanning tree of a graph G is a sub graph of G that is a tree and contains all the vertices of
G containing no circuit or cycle.
An edge of a spanning tree is called a branch.
An edge in the graph that is not in the spanning tree is called a Chord.

You may notice that all the spanning trees differ from each other significantly, however for
each structure
i) The vertex set is same as that of graph G
ii) The edge set is a subset of G(E) and
iii) There is no cycle.
Such a structure is called spanning tree of graph. Take any vertex V as an initial partial tree
and add edges one by one so that each edge joins a new vertex to the partial tree.

It spans the graph, i.e. it includes every vertex of the graph. It is a minimum cost spanning tree
i.e. the total weights of all the edges is as low as possible.

Example 1) An Undirected graph and three of its spanning trees

Fig.1.20 Graph and its Spanning Trees

If a graph consisting of n vertices then the possible spanning trees are nn-2, for above example
n=3, i.e 33-2=3 spanning trees.

Example 2) Number of vertices =4


Number of spanning trees= 4(4-2) = 42 = 16

53
Fig.1.21 Graph and its Spanning Trees

Example 3: Consider the following graph and some of the tree structures for the graph.

54
A A

B C
D E

G
B C D E F H

G H

Fig.1.22 Graph

A A

B C
B C D E
D

E
F
F

G G H

H
Fig.1.22 Graph and some of the tree structures for the graph.

1.11.1 Minimal Spanning Tree (Minimum Cost Spanning Tree)

Frequently we encounter weighted graphs and we need to build a sub-graph that must include
every vertex in the graph. To construct such a graph with least weight or least cost we must not
have cycles in it.

55
Fig.1.23 Graph
Consider the above graph. The MST for this graph could be building a least cost
communication network.
We begin by first selecting an edge with least cost, it can between any
two vertices of graph G. Subsequently from the set of remaining edges,
we can select another least cost edge and so on. Each time an edge is
picked, we determine whether or not the inclusion of this edge into the spanning tree being
constructed creates a cycle. If it does this edge is discarded. If no cycle is created, this edge is
included in the spanning tree being constructed.
The minimum cost is BA.

Now we have
Vertices that may be EDGE COST
attached
BF 8
F
AF 7
G BG 9
C AC 4
D AD 5

The least cost is [Link], we choose AC. Now we have

Vertices that may EDGE COST


be attached
BF 8
F
AF 7
BG 9
G
CG 10
D AD 5
E CE 8

The least cost is AD. Therefore, we choose AD now we have

56
Vertices that may EDGE COST
be attached
BF 8
F AF 8
DF 10
BG 9
G
CG 10
CE 8
E
DE 9

AF is the minimum cost edge. Therefore, we add it to the partial tree. Now we have

Vertices that may EDGE COST


be attached
BG 9
G
CG 10
CE 8
E DE 9
FE 11
Obvious choice is CE.
The only vertes left is G and we have the minimum cost edge that connects it to the tree is
BG. Therefore we add it and the minimal spanning tree constructed would be of costs
9 + 3 + 7 + 5 + 4 + 8 = 36.

57
Fig.1.24 Minimal Spanning Tree

1.12 CONNECTED AND BICONNECTED COMPONENTS

CONNECTED COMPONENT
A connected component of graph G is said to be a maximally connected induced subgraph.
This means that it is a connected induced subgraph which is not a proper subgraph by itself of
any other connected subgraph of G.

Articulation Points and Biconnected Components:

Let G = (V, E) be a connected undirected graph. Consider the following definitions:

Articulation Point (or Cut Vertex): An articulation point in a connected graph is a vertex
(together with the removal of any incident edges) that, if deleted, would break the graph
into two or more pieces..

Bridge: Is an edge whose removal results in a disconnected graph.

Biconnected: A graph is biconnected if it contains no articulation points. In a biconnected


graph, two distinct paths connect each pair of vertices. A graph that is not biconnected
divides into biconnected components. This is illustrated in the following figure:

58
Fig 1.24 An example graph

Fig.1.25 Biconnected components of graph of Fig.1.24

Biconnected
Components

Articulation Point Bridge

Fig.1.26 Articulation Points and Bridges

59
Biconnected graphs and articulation points are of great interest in the design of network
algorithms, because these are the “critical" points, whose failure will result in the network
becoming disconnected.

Let us consider the typical case of vertex v, where v is not a leaf and v is not the root. Let
w1, w2, . . . . . . . wk be the children of v. For each child there is a subtree of the DFS tree
rooted at this child. If for some child, there is no back edge going to a proper ancestor of
v, then if we remove v, this subtree becomes disconnected from the rest of the graph, and
hence v is an articulation point.

On the other hand, if every one of the subtree rooted at the children of v have back edges
to proper ancestors of v, then if v is removed, the graph remains connected (the back edges
hold everything together). This leads to the following:

Observation 1: An internal vertex v of the DFS tree (other than the root) is an articulation
point if and only if there is a subtree rooted at a child of v such that there is no back edge
from any vertex in this subtree to a proper ancestor of v.

Observation 2: A leaf of the DFS tree is never an articulation point, since a leaf will not
have any subtrees in the DFS tree.

Thus, after deletion of a leaf from a tree, the rest of the tree remains connected, thus even
ignoring the back edges, the graph is connected after the deletion of a leaf from the DFS tree.

Observation 3: The root of the DFS is an articulation point if and only if it has two or more
children. If the root has only a single child, then (as in the case of leaves) its removal does not
disconnect the DFS tree, and hence cannot disconnect the graph in general.

1.12.1 Articulation Points by Depth First Search:

60
Determining the articulation turns out to be a simple extension of depth first search.
Consider a depth first spanning tree for this graph.

Fig.1.27 Graph

Observations 1, 2, and 3 provide us with a structural characterization of which


vertices in the DFS tree are articulation points.

Deleting node E does not disconnect the graph because G and D both have dotted links
(back edges) that point above E, giving alternate paths from them to F. On the other hand,
deleting G does disconnect the graph because there are no such alternate paths from L or
H to E (G’s parent).

A vertex ‘x’ is not an articulation point if every child ‘y’ has some node lower in the tree
connect (via a dotted link) to a node higher in the tree than ‘x’, thus providing an alternate
connection from ‘x’ to ‘y’. This rule will not work for the root node since there are no
nodes higher in the tree. The root is an articulation point if it has two or more children.

Depth First Spanning Tree for the above graph is:

61
Fig.1.28 Graph with DFS

By using the above observations, the articulation points of this graph are:

A : because it connects B to the rest of the graph.

H : because it connects I to the rest of the graph. J


: because it connects K to the rest of the graph.

G : because the graph would fall into three pieces if G is deleted.


Biconnected components are: {A, C, G, D, E, F}, {G, J, L, M}, B, H, I and K

This observation leads to a simple rule to identify articulation points. For each is
define L (u) as follows:

L (u) = min {DFN (u), min {L (w)  w is a child of u}, min {DFN (w)  (u,
w) is a back edge}}.

L (u) is the lowest depth first number that can be reached from ‘u’ using a path of
descendants followed by at most one back edge. It follows that, If ‘u’ is not the root then
‘u’ is an articulation point iff ‘u’ has a child ‘w’ such that:

L (w) ≥ DFN (u)

1.12.2 Algorithm for finding the Articulation points:

Pseudocode to compute DFN and L.

Algorithm Art (u, v)

// u is a start vertex for depth first search. V is its parent if any in the depth first
// spanning tree. It is assumed that the global array dfn is initialized to zero and that // the
global variable num is initialized to 1. n is the number of vertices in G.
{
dfn [u] := num; L [u] := num; num := num +
1; for each vertex w adjacent from u do
{
if (dfn [w] = 0) then
{

62
Art (w, u); // w is
unvisited. L [u] := min (L [u], L [w]);
}
else if (w ≠ v) then L [u] := min (L [u], dfn [w]);
}
}

1.12.3 Algorithm for finding the Biconnected Components:

Algorithm BiComp (u, v)

// u is a start vertex for depth first search. V is its parent if any in the depth first
// spanning tree. It is assumed that the global array dfn is initially zero and that the
// global variable num is initialized to 1. n is the number of vertices in G.
{
dfn [u] := num; L [u] := num; num := num +
1; for each vertex w adjacent from u do
{
if ((v ≠ w) and (dfn [w] < dfn [u])) then
add (u, w) to the top of a stack s;

if (dfn [w] = 0) then


{
if (L [w] > dfn [u]) then
{
write (“New bicomponent”);
repeat
{
Delete an edge from the top of stack s;
Let this edge be (x, y);

Write (x, y);


} until (((x, y) = (u, w)) or ((x, y) = (w, u)));
}

BiComp (w, u); // w is


unvisited. L [u] := min (L [u], L [w]);
}
63
else if (w ≠ v) then L [u] : = min (L [u], dfn [w]);
}
}

1.12.4 Problems

Example 1: For the following graph identify the articulation points and Biconnected
components:

Fig 1.29 Depth first spanning tree of the graph

L (u) = min {DFN (u), min {L (w) / w is a child of u}, min {DFN (w) / w is a vertex to
which there is back edge from u}}

L (1) = min {DFN (1), min {L (2)}} = min {1, L (2)} = min {1, 2} = 1

L (2) = min {DFN (2), min {L (3)}} = min {2, L (3)} = min {2, 3} = 2

L (3) = min {DFN (3), min {L (4), L (5), L (6)}} = min {3, min {6, 4, 5}} = 3

L (4) = min {DFN (4), min {L (7)} = min {6, L (7)} = min {6, 6} = 6

L (5) = min {DFN (5)} = 4

L (6) = min {DFN (6)} = 5

L (7) = min {DFN (7), min {L (8)}} = min {7, 6} = 6

L (8) = min {DFN (8), min {DFN (4)}} = min {8, 6} = 6

64
Therefore, L (1: 8) = {1, 2, 3, 6, 4, 5, 6, 6}

Finding the Articulation Points:

Check for the condition if L (w) > DFN (u) is true, where w is any child of u.

Vertex 1: Vertex 1 is not an articulation point.

It is a root node. Root is an articulation point if it has two or more child nodes.

Vertex 2: is an articulation point as L (3) = 3 and DFN (2) = 2.

So, the condition is true

Vertex 3: is an articulation Point as:

I. L (5) = 4 and DFN (3) = 3


II. L (6) = 5 and DFN (3) = 3 and
III. L (4) = 6 and DFN (3) = 3

So, the condition true in above cases

Vertex 4: is an articulation point as L (7) = 6 and DFN (4) = 6.

So, the condition is true

Vertex 7: is not an articulation point as L (8) = 6 and DFN (7) = 7.

So, the condition is False

Vertex 5, Vertex 6 and Vertex 8 are leaf nodes.


Therefore, the articulation points are {2, 3, 4}

Example 2: For the following graph identify the articulation points and Biconnected
components:

65
1 1

3
3 6
2 2
4

4 5 5 6
7
7

8
8

Fig.1.30 Depth first spanning tree of the graph

L (u) = min {DFN (u), min {L (w) / w is a child of u}, min {DFN (w) / w is a vertex to
which there is back edge from u}}

L (1) = min {DFN (1), min {L (2)}} = min {1, L (2)} = min {1, 2} = 1

L (2) = min {DFN (2), min {L (3)}} = min {2, L (3)} = min {2, 3} = 2

L (3) = min {DFN (3), min {L (4), L (5), L (6)}} = min {3, min {6, 4, 5}} = 3

L (4) = min {DFN (4), min {L (7)} = min {6, L (7)} = min {6, 6} = 6

L (5) = min {DFN (5)} = 4

L (6) = min {DFN (6)} = 5

L (7) = min {DFN (7), min {L (8)}} = min {7, 6} = 6

L (8) = min {DFN (8), min {DFN (4)}} = min {8, 6} = 6

Therefore, L (1: 8) = {1, 2, 3, 6, 4, 5, 6, 6}

Finding the Articulation Points:


Check for the condition if L (w) > DFN (u) is true, where w is any child of u.

66
Vertex 1: Vertex 1 is not an articulation point.

It is a root node. Root is an articulation point if it has two or more child nodes.

Vertex 2: is an articulation point as L (3) = 3 and DFN (2) = 2.

So, the condition is true

Vertex 3: is an articulation Point as:

IV. L (5) = 4 and DFN (3) = 3


V. L (6) = 5 and DFN (3) = 3 and
VI. L (4) = 6 and DFN (3) = 3

So, the condition true in above cases

Vertex 4: is an articulation point as L (7) = 6 and DFN (4) = 6.

So, the condition is true

Vertex 7: is not an articulation point as L (8) = 6 and DFN (7) = 7.

So, the condition is False

Vertex 5, Vertex 6 and Vertex 8 are leaf nodes.


Therefore, the articulation points are {2, 3, 4}.

Example 3: For the following graph identify the articulation points and Biconnected
components:

Fig.1.31 Depth first spanning tree of the graph

DFN (1: 8) = {1, 2, 3, 4, 5, 6, 8, 7}

L (u) = min {DFN (u), min {L (w) w is a child of u}, min {DFN (w) w is a vertex
to which there is back edge from u}}

L (1) = min {DFN (1), min {L (2)}}

67
= min {1, L (2)} = 1

L (2) = min {DFN (2), min {L (3)}} = min {2, L (3)} = min{2, 1}= 11

L (3) = min {DFN (3), min {L (4)}} = min {3, L (4)} = min {3, L (4)}

= min {3, 1} = 1

L (4) = min {DFN (4), min {L (5), L (7)}, min {DFN (1)}}

= min {4, min {L (5), L (7)}, 1} = min {4, min {1, 3}, 1}

= min {4, 1, 1} = 1

L (5) = min {DFN (5), min {L (6)}, min {DFN (1)}} = min {5, L (6), 1}

= min {5, 4, 1} = 1

L (6) = min {DFN (6), min {L (8)}, min {DFN (4)}} = min(6, L (8), 4}

= min(6, 4, 4} = 4

L (7) = min {DFN (7), min {DFN (3)}} = min {8, 3} = 3

L (8) = min {DFN (8), min {DFN (4)}} = min {7, 4} = 4

Therefore, L (1: 8) = {1, 1, 1, 1, 1, 4, 3, 4}

Finding the Articulation Points:


Check for the condition if L (w) > DFN (u) is true, where w is any child of u.
Vertex 1: is not an articulation point.

It is a root node. Root is an articulation point if it has two or more child

nodes.

Vertex 2: is not an articulation point. As L (3) = 1 and DFN (2) = 2.

So, the condition is False.

Vertex 3: is not an articulation Point as L (4) = 1 and DFN (3) = 3.

So, the condition is False.

Vertex 4: is not an articulation Point as:

L (3) = 1 and DFN (2) = 2 and

L (7) = 3 and DFN (4) = 4

So, the condition fails in both cases.


68
Vertex 5: is not an Articulation Point as L (6) = 4 and DFN (5) = 6.

So, the condition is False

Vertex 6: is not an Articulation Point as L (8) = 4 and DFN (6) = 7.

So, the condition is False


Vertex 7: is a leaf node.

Vertex 8: is a leaf node.

So they are no articulation points.

69

You might also like