Data Structure Notes
Data Structure Notes
Abstract Data Types (ADTs) – List ADT – array-based implementation – linked list
implementation –– singly linked lists- circularly linked lists- doubly-linked lists – applications of
lists –Polynomial Manipulation – All operation (Insertion, Deletion, Merge, Traversal)
An abstract data type is defined only by the operations that may be performed on it and by
mathematical pre-conditions and constraints on the effects (and possibly cost) of those operations.
For example, an abstract stack, which is a last-in-first-out structure, could be defined by three
operations: push, that inserts some data item onto the structure, pop, that extracts an item from it, and
peek or top, that allows data on top of the structure to be examined without removal.
An abstract queue data structure, which is a first-in-first-out structure, would also have three
operations, enqueue to join the queue; dequeue, to remove the first element from the queue; and front,
in order to access and serve the first element in the queue.
An ADT may be implemented by specific data types or data structures, in many ways
and in many programming languages; or described in a formal specification language. ADTs are often
implemented as modules: the module's interface declares procedures that correspond to the ADT
operations, sometimes with comments that describe the constraints. This information hiding strategy
allows the implementation of the module to be changed without disturbing the client programs.
An abstract data type is defined as a mathematical model of the data objects that make up a data
type as well as the functions that operate on these objects. There are no standard conventions for
defining them. A broad division may be drawn between "imperative" and "functional" definition styles.
Modules:
The programs can be spit in to smaller sub programs called modules. Each module is a logical
unit and does a specific job or process. The size of the module is kept small by calling other modules.
An ADT is a set of operations and is an extension of modular design. A useful tool for specifying
the logical properties of a data type is the abstract data type. ADT refers to the basic mathematical
concept that defines the data type. Eg. Objects such as list, set and graph along their operations can be
viewed as ADT's. (or) Abstract Data Type is some data, associated with a set of operations and
mathematical abstractions just as integer, Boolean etc.,
Basic idea
As computers become faster and faster the need for programs that can handle large amount of
inputs become more acute which in turn requires choosing suitable algorithm with more efficiency,
therefore the study of data structures became an important task which describes, method of organizing
large amounts of data and manipulating them in a better way.
DATA:
INFORMATION:
The meaning that is currently assigned to data by means of the conventions applied to those
data(i.e. processed data)
RECORD:
DATATYPE:
Set of elements that share common set of properties used to solve a program.
DATASTRUCTURE:
A way of organizing, storing, and retrieving data and their relationship with each other.
ALGORITHM:
operating systems
compiler design
statistical and numerical analysis
. database management systems
expert systems
network analysis
Data Structure is a systematic way of organizing, storing and retrieving data and their
relationship with each other.
Data Structures
Eg. Array, Linked list, Stack, queue etc. Eg. Trees, Graphs etc.
Primitive Data Structure: It is the basic Data structure which can be directly operated by the machine
instruction.
Non-Primitive Data Structure: It is the Data structure which emphasizes on structuring of a group of
homogenous or heterogeneous data items. It is further classified into two types. They are:
Linear Data Structures: It is a data structure which contains a linear arrangement of elements in the
memory.
A. list is a dynamic data structure. The no. of nodes on a list may vary dramatically as elements are
inserted and removed. The dynamic nature of list is implemented using linked list and the static nature
of list is implemented using array.
In general the list is in the form of elements A1, A2, …, AN, where N is the size of the list associated
with a set of operations:
Insert: add an element e.g. Insert(X,5)-Insert the element X after the position 5.
Delete: remove an element e.g. Delete(X)-The element X is deleted.
Find: find the position of an element (search) e.g. Find(X)-Returns the position of X
FindKth: find the kth element e.g. Find(L,6)- Returns the element present in the given position
of list L.
PrintList: Display all the elements from the list.
MakeEmpty: Make the list as empty list.
Implementation of List:
1. Array
2. Linked list
Array: A set of data elements of same data type is called array. Array is a static data structure i.e., the
memory should be allocated in advance and the size is fixed. This will waste the memory space when
used space is less than the allocated space. An array implementation allows the following operations.
a. Creation of a List.
b. Insertion of a data in the List
c. Deletion of a data from the List
d. Searching of a data in the list
Global Declaration:
Create Operation:
Procedure:
void create()
int n,i;
printf("\nEnter the [Link] elements to be added in the list");
.
scanf("%d",&n);
if(n<=maxsize)
for(i=0;i<n;i++)
scanf("%d",&list[i]);
index++;
else
printf("\nThe size is limited. You cannot add data into the list");
Explanation:
.
Insert Operation:
Procedure:
void insert()
int i,data,pos;
scanf("%d",&data);
scanf("%d",&pos);
if(index<maxsize)
for(i=index;i>=pos-1;i--)
list[i+1]=list[i];
index++;
list[pos-1]=data;
else
} Deletion Operation:
Explanation:
Procedure:
void del()
int i,pos;
scanf("%d",&pos);
for(i=pos;i<=index;i++)
list[i-1]=list[i];
index--;
}
Explanation:
Display Operation:
Procedure:
void display()
int i;
for(i=0;i<=index;i++)
printf("\t%d",list[i]);
Explanation:
Formulate a loop, where i value ranges from 0 to index (index denotes the index of the last
element in the array.
Display each element in the array.
An array size is fixed at the start of execution and can store only the limited number of elements.
Insertion and deletion of array are expensive. Since insertion is performed by pushing the entire
array one position down and deletion is performed by shifting the entire array one position up.
Linked data structure is a data structure which consists of a set of data records (nodes) linked
together and organized by references (links or pointers). The link between data can also be called a
connector.
In linked data structures, the links are usually treated as special data types that can only be
dereferenced or compared for equality. Linked data structures are thus contrasted with arrays and other
data structures that require performing arithmetic operations on pointers. This distinction holds even
when the nodes are actually implemented as elements of a single array, and the references are actually
array indices: as long as no arithmetic is done on those indices, the data structure is essentially a linked
one.
Linking can be done in two ways - Using dynamic allocation and using array index linking.
Linked lists
A linked list is a collection of structures ordered not by their physical placement in memory but
by logical links that are stored as part of the data in the structure itself. It is not necessary that it should
be stored in the adjacent memory locations. Every structure has a data field and an address field. The
Address field contains the address of its successor.
A list is a linear structure. Each item except the first (front, head) has a unique predecessor. Each
item except the last (end, tail) has a unique successor. First item has no predecessor, and last item has
no successor. An item within a list is specified by its position in the list.
Linked list can be singly, doubly or multiply linked and can either be linear or circular.
Singly linked list is the most basic linked data structure. In this the elements can be placed
anywhere in the heap memory unlike array which uses contiguous locations. Nodes in a linked list
are linked together using a next field, which stores the address of the next node in the next field of
the previous node i.e. each node of the list refers to its successor and the last node contains the
NULL reference. It has a dynamic size, which can be determined only at run time.
10 40 20 30 Null
. The node of a linked list is a structure with fields data (which stored the value of the node)
and*next (which is a pointer of type node that stores the address of the next node).
Two nodes *start (which always points to the first node of the linked list) and *temp (which is
used to point to the last node of the linked list) are initialized.
Initially temp = start and temp->next = NULL. Here, we take the first node as a dummy node.
The first node does not contain data, but it used because to avoid handling special cases in insert and
delete functions.
Functions –
1. Insert – This function takes the start node and data to be inserted as arguments. New node is
inserted at the end so, iterate through the list till we encounter the last node. Then, allocate
memory for the new node and put data in it. Lastly, store the address in the next field of the new
node as NULL.
2. Delete - This function takes the start node (as pointer) and data to be deleted as arguments.
Firstly, go to the node for which the node next to it has to be deleted, If that node points to
NULL (i.e. pointer->next=NULL) then the element to be deleted is not present in the
[Link], now pointer points to a node and the node next to it has to be removed, declare a
temporary node (temp) which points to the node which has to be removed.
Store the address of the node next to the temporary node in the next field of the node
pointer (pointer->next = temp->next). Thus, by breaking the link we removed the node which is
next to the pointer (which is also temp). Because we deleted the node, we no longer require the
memory used for it, free() will deallocate the memory.
3. Find - This function takes the start node (as pointer) and data value of the node (key) to be
found as arguments. First node is dummy node so, start with the second node. Iterate through
the entire linked list and search for the [Link] next field of the pointer is equal to NULL,
check if pointer->data = key. If it is then the key is found else, move to the next node and
search (pointer = pointer -> next). If key is not found return 0,else return 1.
4. Print - function takes the start node (as pointer) as an argument. If pointer = NULL, then there
is no element in the list. Else, print the data value of the node (pointer->data) and move to the
next node by recursively calling the print function with pointer->next sent as an argument.
Performance:
1. The advantage of a singly linked list is its ability to expand to accept virtually unlimited
number of nodes in a fragmented memory environment.
2. The disadvantage is its speed. Operations in a singly-linked list are slow as it uses
sequential search to locate a node.
Null
Insert(Start,10) - A new node with data 10 is inserted and the next field is updated to NULL. The
next field of previous node is updated to store the address of new node.
10 Null
Insert(Start,20) - A new node with data 20 is inserted and the next field is updated to NULL. The
next field of previous node is updated to store the address of new node.
10 20 Null
L
Insert(Start,30) - A new node with data 30 is inserted and the next field is updated to NULL. The
next field of previous node is updated to store the address of new node.
10 20 30 Null
.
L
P
Insert(Start,40) - A new node with data 40 is inserted and the next field is updated to NULL. The
next field of previous node is updated to store the address of new node.
10 40 20 30 Null
L
P
int IsEmpty(List L)
{
if (L->next==NULL)
return(1);
}
Null
if(p->next==NULL)
return(1);
}
.
10 40 20 30 Null
L
P
if the answer is 1 result is true
position p;
p=L->next;
p=p->next;
return(p);
Find(start,10) - To find an element start from the first node of the list and move to the next with the
help of the address stored in the next field.
10 40 20 30 Null
X
Find Previous
position p;
p=L;
p=p->next;
return P;
void count(list L)
p=L->next;
while(p!=NULL)
count++;
p=p->next;
print count;
10 40 20 30 Null
L
Routine to Delete an Element in the List:
Delete(start,20) - A node with data 20 is found and the next field is updated to store the NULL
value. The next field of previous node is updated to store the address of node next to the deleted node.
it delete the first occurrence of element X from the list L
.
void delete(int x , List L)
{
position p,Temp;
p=find previous(X,L);
if(!IsLast(p,L))
{
temp=p->next;
p->next=temp->next;
free(temp);
}
}
10 40 20 30 Null
P Temp
After Deletion
mpmp mpmp
10 40 30 Null
Deletet(start->next) - To print start from the first node of the list and move to the next
with the help of the address stored in the next field.
void delete list(List L)
.{
position P,temp;
P=L->next;
L->next=NULL;
while(P!=NULL)
temp=P->next;
free(P);
P=temp;
position P;
P=L->next;
P->next;
return P->next;
. Doubly-linked list is a more sophisticated form of linked list data structure. Each node of the
list contain two references (or links) – one to the previous node and other to the next node. The
previous link of the first node and the next link of the last node points to NULL. In comparison to
singly-linked list, doubly-linked list requires handling of more pointers but less information is
required as one can use the previous links to observe the preceding element. It has a dynamic size,
which can be determined only at run time.
10 20 30 40
NULL L NULL
Algorithm:
The node of a linked list is a structure with fields data (which stored the value of the node),
*previous (which is a pointer of type node that stores the address of the previous node) and *next
(Which is a pointer of type node that stores the address of the next node)?
Two nodes *start (which always points to the first node of the linked list) and *temp (which is
used to point to the last node of the linked list) are initialized.
Initially temp = start, temp->prevoius = NULL and temp->next = NULL. Here, we take
the first node as a dummy node. The first node does not contain data, but it used because to avoid
handling special cases in insert and delete functions.
Functions –
1. Insert – This function takes the start node and data to be inserted as arguments. New node is
inserted at the end so, iterate through the list till we encounter the last node. Then, allocate
memory for the new node and put data in it. Lastly, store the address of the previous node in
the previous field of the new node and address in the next field of the new node as NULL.
2. Delete - This function takes the start node (as pointer) and data to be deleted as arguments.
Firstly, go to the node for which the node next to it has to be deleted, If that node points to
NULL (i.e. pointer->next=NULL) then the element to be deleted is not present in the list.
Else, now pointer points to a node and the node next to it has to be removed, declare a
temporary node (temp) which points to the node which has to be removed.
Store the address of the node next to the temporary node in the next field of the node
pointer (pointer->next = temp->next) and also link the pointer and the node next to the node
to be deleted (temp->prev = pointer). Thus, by breaking the link we removed the node which
is next to the pointer (which is also temp). Because we deleted the node, we no longer require
the memory used for it, free() will deallocate the memory.
.
3. Find - This function takes the start node (as pointer) and data value of the node (key) to be
found as arguments. First node is dummy node so, start with the second node. Iterate through
the entire linked list and search for the key.
Until next field of the pointer is equal to NULL, check if pointer->data = key. If it is then the
key is found else, move to the next node and search (pointer = pointer -> next). If key is not
found return 0, else return 1.
4. Print - function takes the start node (as pointer) as an argument. If pointer = NULL, then there
is no element in the list. Else, print the data value of the node (pointer->data) and move to the
next node by recursively calling the print function with pointer->next sent as an argument.
Performance:
1. The advantage of a singly linked list is that we don‟t need to keep track of the previous
node for traversal or no need of traversing the whole list for finding the previous node.
2. The disadvantage is that more pointers needs to be handled and more link need to
updated.
Initially
NULL
L
Routine for insert an element:
void insert (int x,List L,position p)
{
.
position newnode;
newnode = malloc(sizeof(struct node));
if(newnode==NULL)
Fatal error (“out of space”);
else
{
newnode ->data = x;
newnode ->next =p->next;
p->next ->prev = newnode;
p->next= newnode;
newnode ->prev =p;
}
Insert(Start,10) - A new node with data 10 is inserted, the next field is updated to NULL and the
previous field is updated to store the address of the previous node. The next field of previous node
is updated to store the address of new node.
10
NULL L
Insert(Start,20) - A new node with data 20 is inserted, the next field is updated to NULL and the
previous field is updated to store the address of the previous node. The next field of previous node is
updated to store the address of new node.
10 20
NULL L NULL
Insert(Start,30) - A new node with data 30 is inserted, the next field is updated to NULL and the
previous field is updated to store the address of the previous node. The next field of previous node is
updated to store the address of new node.
10 20 30
NULL L NULL
Insert(Start,40) - A new node with data 40 is inserted, the next field is updated to NULL and the
previous field is updated to store the address of the previous node. The next field of previous node is
updated to store the address of new node.
.
10 20 30
NULL NULL P
L NULL
10 40 20 30
void display(List L)
{
p=L->next;
while (p!=NULL)
{
print p->data;
p=p->next;
}
print NULL
}
10 40 20 30
L NULL NULL
Routine for deleting an element:
.
Void delete (int x ,List L)
{
Position p , temp;
p=find(x,L);
If(isLast(p,L))
{
temp =p;
p->prev->next=NULL;
free(temp);
}
else
{
temp = p;
p->prev->next=p-next;
p-next-prev = p->prev;
free(temp);
}
Delete(start,10) - A node with data 10 is found, the next and previous fields are updated to store the
NULL value. The next field of previous node is updated to store the address of node next to the
deleted node. The previous field of the node next to the deleted one is updated to store the address of
the node that is before the deleted node.
10 40 20 30
NULL L NULL
40 20 30
NULL L
NULL
To print start from the first node of the list and move to the next with the help of the address stored
in the next field.
.
40 20 30
NULL L
NULL
Circular linked list is a more complicated linked data structure. In this the elements can be
placed anywhere in the heap memory unlike array which uses contiguous locations. Nodes in a linked
list are linked together using a next field, which stores the address of the next node in the next field of
the previous node i.e. each node of the list refers to its successor and the last node points back to the
first node unlike singly linked list. It has a dynamic size, which can be determined only at run time.
Algorithm:
The node of a linked list is a structure with fields data (which stored the value of the node)
and*next (which is a pointer of type node that stores the address of the next node).
Two nodes *start (which always points to the first node of the linked list) and *temp (which is used
to point to the last node of the linked list) are initialized.
Initially temp = start and temp->next = start. Here, we take the first node as a dummy
node. The first node does not contain data, but it used because to avoid handling special cases in
insert and delete functions.
Functions –
1. Insert – This function takes the start node and data to be inserted as arguments. New node is
inserted at the end so, iterate through the list till we encounter the last node. Then, allocate
memory for the new node and put data in it. Lastly, store the address of the first node (start) in
the next field of the new node.
2. Delete - This function takes the start node (as pointer) and data to be deleted as arguments.
Firstly, go to the node for which the node next to it has to be deleted, If that node points to
NULL (i.e. pointer->next=NULL) then the element to be deleted is not present in the list. Else,
now pointer points to a node and the node next to it has to be removed, declare a temporary
node (temp) which points to the node which has to be removed. Store the address of the node
next to the temporary node in the next field of the node pointer (pointer->next = temp->next).
. Thus, by breaking the link we removed the node which is next to the pointer (which is also
temp). Because we deleted the node, we no longer require the memory used for it, free() will
deallocate the memory.
3. Find - This function takes the start node (as pointer) and data value of the node (key) to be
found as arguments. A pointer start of type node is declared, which points to the head node
of the list (node *start = pointer). First node is dummy node so, start with the second node.
Iterate through the entire linked list and search for the key.
Until next field of the pointer is equal to start, check if pointer->data = key. If it is then the key
is found else, move to the next node and search (pointer = pointer -> next). If key is not found
return 0, else return 1.
4. Print - function takes the start node (as start) and the next node (as pointer) as arguments.
If pointer = start, then there is no element in the list. Else, print the data value of the node
(pointer->data) and move to the next node by recursively calling the print function with
pointer->next sent as an argument.
Performance:
1. The advantage is that we no longer need both a head and tail variable to keep track of
the list. Even if only a single variable is used, both the first and the last list elements can be
found in constant time. Also, for implementing queues we will only need one pointer namely
tail, to locate both head and tail.
2. The disadvantage is that the algorithms have become more complicated.
Example:
Initially
Insert(Start,2) - A new node with data 2 is inserted and the next field is updated to store the
address of start node. The next field of previous node is updated to store the address of new node.
start temp
Insert(Start,6) - A new node with data 6 is inserted and the next field is updated to store the
.address of start node. The next field of previous node is updated to store the address of new
node.
Start temp
Insert(Start,1) – A new node with data 1 is inserted and the next field is updated to store the address
of start node. The next field of previous node is updated to store the address of new node.
Start temp
Insert(Start,10) – A new node with data 6 is inserted and the next field is updated to store the address
of start node. The next field of previous node is updated to store the address of new node.
Start temp
Insert(Start,8) - A new node with data 6 is inserted and the next field is updated to store the address
of start node. The next field of previous node is updated to store the address of new node.
Start temp
Print(start->next) 2 ,6 ,1 ,10 ,8
Start temp
To print start from the first node of the list and move to the next with the help of the address stored
.
in the next field.
Delete(start,1) - A node with data 1 is found and the next field is updated to store the NULL value.
The next field of previous node is updated to store the address of node next to the deleted node.
Start temp
Start temp
To find, start from the first node of the list and move to the next with the help of the address stored
in the next field.
. Applications of List
Polynomial Manipulation
Polynomial manipulations such as addition, subtraction & differentiation etc.. can be performed
using linked list
struct poly
{
int coeff;
int power;
struct poly * Next;
}*list1,*list2,*list3;
poly create(poly*head1,poly*newnode1)
{
poly *ptr;
if(head1==NULL)
{
head1=newnode1;
return (head1);
}
else
{
ptr=head1;
while(ptr->next!=NULL)
ptr=ptr->next;
ptr->next=newnode1;
}
return(head1);
}
void add()
{
poly*ptr1,*ptr2,*newnode;
ptr1=list1;
ptr2=list2;
while(ptr1!=NULL&&ptr2!=NULL)
{
newnode=malloc(sizeof(struct poly));
if(ptr1->power==ptr2->power)
{
. newnode->coeff=ptr1-coeff+ptr2->coeff;
newnode->power=ptr1->power;
newnode->next=NULL;
list3=create(list3,newnode);
ptr1=ptr1->next;
ptr2=ptr2->next;
}
else
{
if(ptr1-power>ptr2-power)
{
newnode->coeff=ptr1->coeff;
newnode->power=ptr1->power;
newnode->next=NULL;
list3=create(list3,newnode);
ptr1=ptr1->next;
}
else
{
newnode->coeff=ptr2->coeff;
newnode->power=ptr2->power;
newnode->next=NULL;
list3=create(list3,newnode);
ptr2=ptr2->next;
}
}
}
void sub()
{
poly*ptr1,*ptr2,*newnode;
ptr1=list1;
ptr2=list2;
while(ptr1!=NULL&&ptr2!=NULL)
{
newnode=malloc(sizeof(struct poly));
if(ptr1->power==ptr2->power)
{
newnode->coeff=(ptr1-coeff)-(ptr2->coeff);
newnode->power=ptr1->power;
newnode->next=NULL;
. list3=create(list3,newnode);
ptr1=ptr1->next;
ptr2=ptr2->next;
}
else
{
if(ptr1-power>ptr2-power)
{
newnode->coeff=ptr1->coeff;
newnode->power=ptr1->power;
newnode->next=NULL;
list3=create(list3,newnode);
ptr1=ptr1->next;
}
else
{
newnode->coeff=-(ptr2->coeff);
newnode->power=ptr2->power;
newnode->next=NULL;
list3=create(list3,newnode);
ptr2=ptr2->next;
}
}
}
UNIT II
LINEAR DATA STRUCTURES – STACKS, QUEUES
Stack ADT – Evaluating arithmetic expressions- other applications- Queue ADT – circular queue
Implementation – Double ended Queues – applications of queues
Stack ADT
Stack
Concept of Stack
A Stack is data structure in which addition of new element or deletion of existing element always
takes place at a same end. This end is known as the top of the stack. That means that it is
possible to remove elements from a stack in reverse order from the insertion of elements into the
stack.
One other way of describing the stack is as a last in, first out (LIFO) abstract data type and linear
data structure.
Operations on Stack
The stack is basically performed two operations PUSH and POP. Push and pop are the operations
that are provided for insertion of an element into the stack and the removal of an element from
the stack, respectively.
PUSH: - PUSH operation performed for the adding item to the stack.
POP: - POP operation performed for removing an item from a stack.
pop routine
void pop(Stack S)
{
if(Top==-1)
Errro(“Empty stack! Deletion not possible”);
else
{
X=s[Top];
Top=Top-1;
}
}
Routine to return top Element of the stack
int TopElement(Stack S)
{
if(Top==-1)
{
Error(“Empty stack!!No elements”);
return 0;
}
else
return S[Top];
}
Linked list implementation of Stack
S
header
30 20 10 N
40
push routine /*Inserts element at front of the list
temp
40 30 20 10 N
40
TOP
HEADER
30 20 10 N
TOP
Output
C Operator Precedence Table
This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order
operators of equal precedence in an expression are applied.
Operator Description Associativity
() Parentheses (function call) (see Note 1) left-to-right
[] Brackets (array subscript)
. Member selection via object name
-> Member selection via pointer
++ -- Postfix increment/decrement (see Note 2)
++ -- Prefix increment/decrement right-to-left
+- Unary plus/minus
!~ Logical negation/bitwise complement
(type) Cast (convert value to temporary value of type)
* Dereference
& Address (of operand)
sizeof Determine size in bytes on this implementation
* / % Multiplication/division/modulus left-to-right
+ - Addition/subtraction left-to-right
<< >> Bitwise shift left, Bitwise shift right left-to-right
< <= Relational less than/less than or equal to left-to-right
> >= Relational greater than/greater than or equal to
== != Relational is equal to/is not equal to left-to-right
& Bitwise AND left-to-right
^ Bitwise exclusive OR left-to-right
| Bitwise inclusive OR left-to-right
&& Logical AND left-to-right
|| Logical OR left-to-right
?: Ternary conditional right-to-left
= Assignment right-to-left
+= -= Addition/subtraction assignment
*= /= Multiplication/division assignment
%= &= Modulus/bitwise AND assignment
^= |= Bitwise exclusive/inclusive OR assignment
<<= >>= Bitwise shift left/right assignment
, Comma (separate expressions) left-to-right
Note 1:
Parentheses are also used to group sub-expressions to force a different precedence; such parenthetical
expressions can be nested and are evaluated from inner to outer.
Note 2:
Postfix increment/decrement have high precedence, but the actual increment or decrement of the operand is
delayed (to be accomplished sometime before the statement completes execution). So in the statement y = x
* z++; the current value of z is used to evaluate the expression (i.e., z++ evaluates to z) and z only
incremented after all else is done. See postinc.c for another example.
Stack Applications
Application of Stack:
1. Parsing
2. Recursive Function
3. Calling Function
4. Expression Evaluation
5. Expression Conversion
1. Infix to Postfix
2. Infix to Prefix
3. Postfix to Infix
4. Prefix to Infix
6. Towers of Hanoi
( (
( ((
A (( A
+ ((+ A
B ((+ AB
) (( AB+
/ (( / AB+
C (( / AB+C
) (( AB+C/
EXPRESSION CONVERSION
Infix to Postfix
Infix :- Operators appears between operands
Eg. A+B
A/B+C
AB+
AB/C+
+AB
+/ABC
A A
* A
B
+ AB
*
+ AB*
( AB*
(
+
C AB*C
(
+
- - AB*C
(
+
D - AB*CD
(
+
/
/
-
(
+
AB*CDE/-+
E
Eg:- A/B-C+D*E-A*C
)
AB/C-DE*+AC*-
# AB*CDE/-+
(
(
a a
(
a
+
+
(
b ab
+
(
) ab+
* ab+
c ab+c
/ ab+c*
*
/
d ab+c*d
ab+c*d/
+
+
ab+c*d/e
/ ab+c*d/e
/
+
f ab+c*d/ef
/
+
# ab+c*d/ef/+
TOWERS OF HANOI
Rule 2 : No larger disk could ever reside on a pillar on top of a smaller disk.
Rule 3 : A 3rd pillar could be used as an intermediate to store one or more disks, while they
A B C
RECURSIVE SOLUTION
Step 2. If N = 2, move the 1st disk from A to B. Then move the 2nd disk from A to C, The move
the 1st
disk from B to C.
Step 3. If N = 3, Repeat the step (2) to more the first 2 disks from A to B using C as
[Link] the
3rd disk is moved from A to C. Then repeat the step (2) to move 2 disks from B to C using
A as
intermediate.
In general, to move N disks. Apply the recursive technique to move N - 1 disks from A to
B using C as an intermediate. Then move the Nth disk from A to C. Then again apply the
recursive technique to move N - 1 disks from B to C using A as an intermediate.
Since disks are moved from each tower in a LIFO manner,each tower may be
considered as a [Link] Number of moves required to solve th eproblem according to
our alforithm is given by,
O(N)=O(N-1)+1+O(N-1) =2N-1
FUNCTION CALLS
When a call is made to a new function all the variables local to the calling routine need to
be saved, otherwise the new function will overwrite the calling routine variables. Similarly the
current location address in the routine must be saved so that the new function knows where to go
after it is completed.
Call
balance()
Call push()
Int fact(int n)
int S;
if(n==1)
return(1);
else
S=n*fact(n-1);
return(S)
}
BALANCING THE SYMBOLS
Compilers check the programs for errors, a lack of one symbol will cause an error.
A Program thet checks whether everything is balanced.
Every right paranthesis should have its left paranthesis.
Check for balancing the paranthesis brackets braces and ignore any other character
((B*B)-{4*A*C}/[2*A]) #
(
( (
(
)
(
{ {
(
}
(
[ [
(
]
(
)
Queues
Queues is a kind of abstract data type where items are inserted one end (rear end) known as
enqueue operation and deleted from the other end(front end) known as dequeue operation.
This makes the queue a First-In-First-Out (FIFO) data structure.
The queue performs the function of a buffer.
Implementation of Queues
Operation on Queues
Operation Description
initialize() Initializes a queue by adding the value of rear and font to -1.
empty() It returns true(1) if the queue is empty and return false(0) if the queue is not empty.
full() It returns true(1) if the queue is full and return false(0) if the queue is not full.
Operation on queue
0 1 2 3 4 5 6 Max = 7
Array
Rear = -1
Front = -1
0 1 2 3 4 5 6
7
0 1 2 3 4 5 6
7 8 9 10 11
Front = 0 Rear = 4
After Insertion
0 1 2 3 4 5 6
10
Front = 3 Rear=3
After Deletion
0 1 2 3 4 5 6
10 11 12
void delete ( )
{
if (Front < 0)
print (" Queue Underflow");
else
{
X = Queue [Front];
if (Front = = Rear)
{
Front = 0;
Rear = -1;
}
else
Front = Front + 1 ;
}}
Application of queues
Enqueue operation is performed at the end of the list and Dequeue operation is performed
at the front of the list.
QUEUE ADT
10 20 NULL
Struct Node;
typedef Struct Node * Queue;
int IsEmpty (Queue Q);
Queue CreateQueue (void);
void MakeEmpty (Queue Q);
void Enqueue (int X, Queue Q);
void Dequeue (Queue Q);
Struct Node
{
int Element;
Struct Node *Next;
}* Front = NULL, *Rear = NULL;
ROUTINE TO CHECK WHETHER THE QUEUE IS EMPTY
int IsEmpty (Queue Q) // returns boolean value /
{ // if Q is empty
return (1);
Struct CreateQueue ( )
{
Queue Q;
Q = Malloc (Sizeof (Struct Node));
if (Q = = NULL)
Error ("Out of Space");
MakeEmpty (Q);
return Q;
}
void MakeEmpty (Queue Q)
{
if (Q = = NULL)
Error ("Create Queue First");
else
while (! IsEmpty (Q)
Dequeue (Q);
i)
10 20 NULL
Dequeue
ii) 20 NULL
iii)
EmptyQueue
Data Next
i) 10 NULL
Newnode
Front Rear
ii)
10 1000 20 NULL
iii)
10 1000 20 3000 30 NULL
void Dequeue ( )
{
Struct node *temp;
if (Front = = NULL)
Error("Queue is underflow");
else
{
temp = Front;
if (Front = = Rear)
{
Front = NULL;
Rear = NULL;
}
else
Front = Front Next;
Print (tempdata);
free (temp);
}
}
Front Rear
Assign the front element as temp and frontnext as front to delete the temp data.
20 3000 30 NULL
Front Rear
Assign the front element as temp and frontnext as front to delete the temp data.
30 NULL
Front
Rear
If front and rear are equal assign to null and delete the corresponding front element
Front=rear=NULL
void insert();
void del();
void display();
void main()
{
int choice;
clrscr();
do
{
clrscr();
printf("\t Menu");
printf("\n 1. Insert");
printf("\n 2. Delete");
printf("\n 3. Display ");
printf("\n 4. Exit");
void insert()
{
int no;
printf("\n Enter No.:");
scanf("%d",&no);
void del()
{
if(front==-1)
{
printf("\nQueue Underflow");
return;
}
else
{
printf("\nDeleted Item:-->%d\n",q[front]);
}
if(front==rear)
{
front=-1;
rear=-1;
}
else
{
front=front+1;
}
}
void display()
{
int i;
if(front==-1)
{
printf("\nQueue is empty....");
return;
}
for(i=front;i<=rear;i++)
printf("\t%d",q[i]);
}
Dynamic Queue
struct link
int info;
}*front,*rear;
#include<stdio.h>
#include<conio.h>
struct link
{
int info;
struct link *next;
}*front,*rear;
int delete_q();
void display();
void main()
{
int ch,no;
rear=NULL;
front=NULL;
clrscr();
while(1)
{
printf("\n Dynemic Queue Menu");
printf("\n 1->Insert ");
printf("\n 2->Delete ");
printf("\n 3->Display");
printf("\n 4->Exit ");
printf("\n enter your choice : ");
scanf("%d",&ch);
switch(ch)
{
case 1:
{
printf("Enter The Element Value\n");
scanf("%d",&no);
insert_q(no);
printf("\n queue after insertion");
display();
break;
}
case 2:
{
no=delete_q();
printf("\n The deleted element = %d",no);
printf("\n queue after deletion");
display();
break;
}
case 3:
{
printf("\n Queue is as follow");
display();
break;
}
case 4:
{
printf("Exit....!!!!");
getch();
exit(0);
break;
}
default :
printf("\n Wrong choice...!!!! " );
int delete_q()
{
struct link *t;
int no;
if(front==NULL||rear==NULL)
{
printf("\n queue is Under Flow");
getch();
return;
else
{
t=front;
no=t->info;
front=front->next;
free(t);
}
return(no);
void display()
{
struct link *t;
t=front;
if(front==NULL||rear==NULL)
{
printf("\nQueue is empty");
getch();
exit(0);
}
while(t!=NULL)
{
printf("\n %d",t->info);
t=t->next;
}
}
Circular Queue
A circular queue is an abstract data type that contains a collection of data which allows addition
of data at the end of the queue and removal of data at the beginning of the queue. Circular queues
have a fixed size.
Circular queue follows FIFO principle. Queue items are added at the rear end and the items are
deleted at front end of the circular queue.
Empty Queue
F = -1
R = -1
F= -1 F,R
R= -1
Empty Queue
if(front==0&&rear==-1)
print("Queue is underflow”);
exit();
if(front>rear)
for(i=0;i<=rear;i++)
return CQ[i];
for(j=front;j<=max-1;j++)
return CQ[j];
return CQ[rear];
return CQ[front];
else
for(i=front;i<=rear;i++)
return CQ[i];
return Q[rear];
return Q[front];
A double-ended queue is an abstract data type similar to an simple queue, it allows you to insert
and delete from both sides means items can be added or deleted from the front or rear end.
In DEQUE, insertion and deletion operations are performed at both the ends.
Deletion 10 20 30 40 Insertion
Insertion Deletion
EXCEPTIONAL CONDITION
Here insertion is allowed at one end and deletion is allowed at both ends.
Insertion
Deletion
Deletion
Front Rear
OUTPUT RESTRICTED DEQUE
Here insertion is allowed at both ends and deletion is allowed at one end.
Insertion
Insertion
Deletion
Front Rear
If(REAR==Arraysize-1)
FRONT=FRONT+1;
REAR=REAR+1;
DQ[REAR] = X;
else
REAR=REAR+1;
DQ[REAR]=X;
}
1 2 3 4 5
F R
F R
1 2 3
F R
If(FRONT==0)
FRONT=FRONT+1;
REAR=REAR+1;
DQ[FRONT] = X;
else
FRONT=FRONT-1;
DQ[FRONT]=X;
1 2 3 4 5
F R
F R
1 2 3
F R
If(FRONT==-1)
Else if(FRONT==REAR)
X=DQ[FRONT];
FRONT=-1;
REAR=-1;
}
Else
X=DQ[FRONT];
FRONT=FRONT+1;
F R
F R
1 2 3 4 5
F R
If(REAR==-1)
else if(FRONT==REAR)
X=DQ[REAR];
FRONT=-1;
REAR=-1;
}
Else
X=DQ[REAR];
REAR=REAR-1;
F R
F R
1 2 3 4 5
F R
APPLICATIONS OF QUEUE
* Priority Queues can be used to sort the elements using Heap Sort.
* Simulation.
* Computer networks where the server takes the jobs of the client as per the queue strategy.
UNIT III
Non Linear DataStructures–Trees,Tree ADT,Ttree traversals,Binary Tree
ADT,Expression Trees Applications of Trees,Binary search tree ADT,Threaded Binary
Trees AVL Trees,B-Tre, B+Tree ,Heap,Applications of heap
Non Linear Data Structures – Trees
Tree ADT
Definition: A tree is a collection of nodes. The collection can be empty; otherwise, a tree
consists of a distinguished node r, called root, and zero or more nonempty (sub)trees T1, T2, …
Tk, each of whose roots are connected by a direct edge from r. A tree structure is shown in the
figure below:
Tree Terminologies
Node
Item or information.
Root
The first and top node in a tree is called the root node. i.e. A node which doesn’t have a parent
is called a root.
Parent
A node that has a child is called the child's parent node (or ancestor node)
Edge
An edge is a connection between two nodes.
Child
A node of a tree referred to by a parent node. Every node, except the root, is the child of some
parent. Here Node B is the child of node A, if A is the parent of B.
Sibling
Children of the same parent are said to be siblings. Here B, C, D, E are siblings of A.
Similarly I, J, K, L are siblings.
Path
A path in a tree is a list of distinct nodes in which successive nodes are connected by edges in
the tree. There is exactly only one path from each node to root.
Leaf
A node which doesn’t have children is called leaf or terminal node. (Or) Nodes at the
bottommost level of the tree are called leaf nodes . Here B,K,L,G,H,M,J are leaf nodes.
Ancestor
Node A is an ancestor of node B, if A is the parent of B.
Length
The length is defined as the number of edges on the path. The length for the path A to L is 3.
Degree
The number of subtrees of a node is called its degree.
Degree of A is 4
Degree of C is 2
Degree of D is 1
Degree of H is 0.
Level of a Node
Level of the root of a tree is 0, and the level of any other node in the tree is one more than the
level of its parent.
Depth of a Tree
The depth of a tree is the maximum level of any leaf in the tree (also called the height of the
tree).
For any node n, the depth of n is the length of the unique path from root to n.
The depth of the root is 0.
Depth of node L is 3.
Descendant
Node B is the descendant of node A, if A is an ancestor of node B
Subtree
Any node of a tree with all of its descendants is called its subtree.
Height
The height of a node is the length of the longest downward path to a leaf from that node. The
height of the root is the height of the tree.
For any node n, the height of the node n is the length of the longest path from n to the leaf.
The height of the leaf is always 0.
Here - height of node F is 1.
- height of L is 0.
-----------------------------------------------------------------------------------------------------------------
Tree Traversals
Traversing is the process of visiting every node in the tree exactly once. Therefore, a complete
traversal of a binary tree implies visiting the nodes of the tree in some linear sequence.
In a linear list nodes are visited from first to last, but a tree being a non linear one we need
definite rules. There are no. of ways to traverse a tree. All of them differ only in the order in
which they visit the nodes.
The three main methods of traversing a tree are
Postorder traversal strategy
Preorder traversal strategy
Inorder traversal strategy
Preorder Traversal (Depth-First Order)
1. Visit the root
2. Traverse the left subtree in preorder.
3. Traverse the right subtree in preorder.
Postorder Traversal
1. Traverse the left subtree in postorder
2. Traverse the right subtree in preorder
3. Visit the root
Inorder Traversal (Symmetric Order)
1. Traverse the left subtree in inorder
2. Visit the root
3. Traverse the right subtree in inorder
Example
Preorder Traversal: ABDECF
First, visit the Root A
Now visit the left subtree in preorder .For left subtree, B is the root. So visit root B, then
left of B and then visit right of B. It gives the traversing order BDE.
Start visiting the right subtree of A in preorder. For right subtree, C is the root. So visit
the root C, then left of C (In the given example there is no left of C), and then visit right of
C. Now the order is CF.
Finally, the preorder traversal of above tree is ABDECF.
Inorder Traversal: DBEACF
First visit the left subtree in inorder .For left subtree, B is the root. So visit left of B, then
root and then visit right of B. It gives the traversing order DBE.
Now visit the Root A.
Start visiting the right subtree of A in inorder. For right subtree, C is the root. So visit left
of C (In the given example there is no left of C), then root C and then visit right of C. Now
the order is CF.
Finally, the inorder traversal of above tree is DBEACF.
Postorder Traversal: DEBFCA
First visit the left subtree in postorder .For left subtree, B is the root. So visit the left of
B,then visit right of B and then visit root B. It gives the traversing order DEB.
Start visiting the right subtree of A in preorder. For right subtree, C is the root. So left of
C (In the given example there is no left of C), then visit right of C and then visit root C.
Now the order is FC.
Now visit the Root A
Finally, the postorder traversal of above tree is DEBFCA.
Example
For the example given below, the traversals are given below:
Preorder
+-/+AB*CD*E-FGH
Inorder :
A+B/C*D-E*F-G+H
Postorder:
AB+CD*/EFG-*-H+
Left Child Right Sibling Data Structure:
Implementation of Tree
Tree can be implemented using linked list concept. A structure can be declared with 3
elements. One element contains the data. The second element points to the first child of the tree.
The third element points to the next sibling of the first child. The syntax is given below:
Node Declaration for Trees
typedef struct TreeNode * PtrToNode;
struct TreeNode
{ ElementType element;
PtrToNode FirstChild;
PtrToNode NextSibling;
};
Example
Explanation
A is the root
B is the first child of A
C, D & E are the next siblings of B.
----------------------------------------------------------------------------------------------------------------
Binary Tree ADT
• A binary tree is a tree in which no node can have more than two children.
• A binary tree is called strictly binary tree if every nonleaf node in the tree has nonempty
left and right subtrees i.e., every nonleaf node has two children.
• A complete binary tree of depth d is a strictly binary tree with all leaf nodes at level d.
A tree is a finite set of nodes having a distinct node called root. Binary Tree is a tree which is
either empty or has at most two subtrees, each of the subtrees also being a binary tree. It means
each node in a binary tree can have 0, 1 or 2 subtrees. A left or right subtree can be empty.
A binary tree is made of nodes, where each node contains a "left" pointer, a "right" pointer,
and a data element. The "root" pointer points to the topmost node in the tree. The left and right
pointers point to smaller "subtrees" on either side. A null pointer represents a binary tree with no
elements -- the empty tree.
It has a distinct node called root i.e. 2. And every node has 0, 1 or 2 children. So it is a binary
tree as every node has a maximum of 2 children.
If A is the root of a binary tree & B the root of its left or right subtree, then A is the parent or
father of B and B is the left or right child of A. Those nodes having no children are leaf nodes.
Any node say, A is the ancestor of node B and B is the descendant of A if A is either the father of
B or the father of some ancestor of B. Two nodes having same father are called brothers or
Siblings.
Going from leaves to root is called climbing the tree & going from root to leaves is called
descending the tree.
A binary tree in which every non leaf node has non empty left & right subtrees is called a
strictly binary tree. The tree shown below is a strictly binary tree.
The no. of children a node has is called its degree. The level of root is 0 & the level of any
node is one more than its father. In the strictly binary tree shown above A is at level 0, B & C at
level 1, D & E at level 2 & F & g at level. The depth of a binary tree is the length of the longest
path from the root to any leaf. In the above tree, depth is 3.
1.1. Linked List Representation of Binary Tree
The structure of each node of a binary tree contains one data field and two pointers, each for
the right & left child. Each child being a node has also the same structure. The structure of a
node is shown below.
Value
Left Right
Binary trees can be represented by links where each node contains the address of the left child
and the right child. If any node has its left or right child empty then it will have in its respective
link field, a null value. A leaf node has null value in both of its links.
Pictorial Represention of a Binary Tree
Address of the Left Child Data Element Address of the Right Child
Representation of a Leaf
Note: Addresses of the Left Child and Right child are NULL.
Left and Right Skewed Trees
The tree in which each node is attached as a left child of pal left skewed tree. The tree in
which each node is attached as a i node then it is called right skewed tree.
Representation of Trees
There are two ways of representing the binary tree.
1. Sequential representation
2. Linked representation.
Let us see these representations one by one.
1. Sequential Representation of Binary Trees or Array Representation
Each node is sequentially arranged from top to bottom and understand this matter by
numbering each node. The numbering of root node and then remaining nodes will give ever
increasing number direction. The nodes on the same level will be numbered from left to right.
The numbering will be as shown below.
Linked List Implementation
Binary Tree: Array Implementation
Binary tree can be implemented using array. The structure is given below.
typedef struct TreeNode *PtrToNode;
typedef struct TreeNode *Tree;
struct TreeNode
{
ElementType Element;
Tree Left;
Tree Right;
};
-----------------------------------------------------------------------------------------------------------
Expression Tree
Expression tree is also a binary tree in which the leaf or terminal nodes are operands and non-
terminal or intermediate nodes are operators used for traversal.
2. Next + is read , so two pointers to trees are popped, a new tree is formed, a new tree is
formed, and a pointer to it is pushed onto the stack.
3. Now, c, d and e are read, and for each a one-node tree is created and a pointer to the
corresponding tree is pushed onto the stack.
4. Now ‘+’ is read, so two trees are merged.
5. Continuing, ‘*’ is read, so we pop two tree pointers and form a new tree with a ‘*’ as root.
6. Finally, the last symbol ‘*’ is read, two trees are merged, and a pointer to the final tree is
left on the stack.
---------------------------------------------------------------------------------------------------------------------
Applications of Trees
There are many applications for trees. One of the popular uses is the directory structure in
many common operating systems, including UNIX and DOS.
The root of this directory is /usr.
/usr has 3 children namely mark, alex and bill, which are directories.
./usr contains 3 directories and no regular files.
The filename /usr/mark/book/ch1.r is obtained by following the leftmost child 3
times.
Each / after the first name indicates an edge; the result is the full pathname.
Advantages of Hierarchical File System
1. It allows users to organize the data logically.
2. Two files in different directories can share the same name, because their path is different
from the root.
Routine to List a Directory in a Hierarchical File System
Static void listdir(directoryorfile D, int depth)
{
if(D is a legitimate entry)
{
printname(D, depth);
if(D is a directory)
for each child, c, of D
listdir(C, depth+1);
}
}
Explanation
This is executed by a recursive function call listdir.
This procedure starts with the directory name and the depth 0. (Note: The depth of the
root is 0).
D can be either a directory or a file.
It prints entry of D.
If D is a Directory, it prints the children C recursively.
This procedure terminates when the parameter for listdir is a not a valid parameter.
Routine which Call the Listdir Routine
void listdirectory(directoryorfile D)
{
listdir(D,0);
}
Explanation
This routine call listdir with two parameters D and 0, where 0 denotes the depth of the
root.
-----------------------------------------------------------------------------------------------------------------
Binary Search Tree ADT
For every node, X, in the tree, the values of all the keys in its left subtree are smaller than the
key value of X, and the values of all the keys in its right subtree are larger than the key value of
X.
BST: Implementation
Struct TreeNode;
typedef struct TreeNode *Position;
typedef struct TreeNode *SearchTree;
SearchTree MakeEmpty( SearchTree T );
Position Find (ElementType X, SearchTree T);
Position FindMin( SearchTree T );
Position FindMax( SearchTree T );
SearchTree Insert (ElementType X, SearchTree T );
SearchTree Delete (ElementType X, SearchTree T );
ElementType Retrieve (Position P);
Binary Search Tree Declaration
struct TreeNode
{
ElementType Element;
SearchTree Left;
SearchTree Right;
};
BST Implementation: MakeEmpty
This operation is mainly for initialization. The implementation follows recursive function call
technique.
Procedure
SearchTree MakeEmpty( SearchTree T )
{ if( T != NULL )
{
MakeEmpty( TLeft );
MakeEmpty( T Right );
free( T );
}
return NULL;
}
This function uses recursive function call.
This function initially makes the left subtrees empty.
Then the function makes the right subtrees empty.
BST Implementation: Find
This operation generally requires returning a pointer to the node in tree T that has key X, or
NULL if there is no such node.
If T is NULL, then we can just return NULL.
Otherwise, if the key stored at T is X, return T.
Otherwise make a recursive call on a subtree of T, either left or right.
If X is less than the key stored at T, then search Left subtree, otherwise search the right
tree recursively.
Procedure
Position Find( ElementType X, SearchTree T )
{ if( T == NULL )
return NULL;
if ( X < T Element )
return Find( X, T Left );
else if ( X > T Element )
return Find( X, T Right );
else
return T;
}
Note: Here, key denotes the data element.
The function compares the searching element with the element in the root.
If the element is less than the root, it searches the left subtree recursively.
Otherwise it searches the right subtree recursively.
This function returns the element if it is found in the tree.
Otherwise it returns NULL
BST Implementation: FindMin
This routine will return the position of the smallest elements in the tree. To perform the
FindMin start at the root and go left as long as there is a left child. The stopping point is the
smallest element, which will be always the left most child of the tree.
Procedure
Position FindMin( SearchTree T )
{ if ( T == NULL )
return NULL;
else if( T Left == NULL )
return T;
else
return FindMin( T Left );
}
In a Binary Search Tree, The minimum element is availabe in the left subtree.
Hence, this function searches the left subtree of the tree.
Searching takes place recursively.
The last node in the left subtree contains the minimum element. If T->left is NULL, then
it returns the element in that node.
If no tree is available, i.e T=NULL, then it returns NULL.
BST Implementation: FindMax
This routine will return the position of the largest elements in the tree. To perform the
FindMax start at the root and go right as long as there is a right child. The stopping point is the
largest element.
Position FindMax( SearchTree T )
{
if ( T != NULL )
while ( T Right != NULL )
T = T Right;
return T;
}
In a Binary Search Tree, The maximum element is availabe in the right subtree.
Hence, this function searches the right subtree of the tree.
If T->right is not equal to NULL, T moves to T->right. This is done till T->right is equal
to NULL.
The last node in the right subtree contains the maximum element. If T->right is NULL,
then it returns the element in that node.
If no tree is available, i.e T=NULL, then it returns NULL.
BST Implementation: Insert
To insert X in to tree T, proceed down the tree with a Find. If X is found do nothing
because BST will not have duplicate values. Otherwise, insert X at the appropriate spot on
the path traversed.
Output:
20
30
40
50
60
70
80
------------------------------------------------------------------------------------------------------------
Threaded Binary Tree
A binary search tree in which each node uses an otherwise-empty left child link to refer to the
node's in-order predecessor and an empty right child link to refer to its in-Order Successor.
In above binary tree, there are 8 null pointers & actual 6 pointers.
In all there are 14 pointers.
We can generalize it that for any binary tree with n nodes there will be (n+1) null pointers
and 2n total pointers.
The objective here to make effective use of these null pointers. J. perils & C. Thornton jointly
proposed idea to make effective use of these null pointers.
According to this idea we are going to replace all the null pointers by the appropriate pointer
values called threads.
And binary tree with such pointers are called threaded tree. In the memory representation of a
threaded binary tree, it is necessary to distinguish between a normal pointer and a thread.
Threaded Binary Tree: One-Way
We will use the right thread only in this case. To implement threads we need to use in-order
successor of the tree.
Inorder Traversal of The tree: D, B, E, A, F, C, G
Two Way Threaded Tree/Double Threads
Again two-way threading has left pointer of the first node and right pointer of the last node
will contain the null value. The header nodes is called two-way threading with header node
threaded binary tree.
AVL Tree
An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance condition. For
every node in the AVL tree, the height of left subtree and the height of the right subtree can differ
by at most one.
Height of left subtree- Height of right subtree < = 1
The height of the left subtree minus the height of the right subtree of a node is called the balance
factor of the node. For an AVL tree, the balances of the nodes are always -1, 0 or 1.
The height of an empty tree is defined to be -1.
Given an AVL tree, if insertions or deletions are performed, the AVL tree may not remain
height balanced.
Balancing Trees
The tree becomes unbalanced whenever a node is inserted or deleted. The unbalanced tree is
balanced by rotating the tree to the left or right.
There are four cases that require rebalancing. All unbalanced trees fall into one of these four
cases:
1. Left of Left - A sub tree of a tree that is left high has become left high after the insertion
of an element.
Diagram
2. Right of right - A sub tree of a tree that is right high has become right high after inserting
an element.
Diagram
3. Right of Left - A sub tree of a tree that is left high has become right high after inserting an
element.
Diagram
4. Left of Right - A sub tree of a tree that is right high has become left high after inserting an
element.
Diagram
Rotation
An A VL tree becomes unbalanced when there is an insertion or deletion of a node from an
existing A VL tree.
To balance the tree, transformation of the tree is performed. This transformation is called
Rotation.
Rotation is a technique in which an unbalanced A VL tree is rotated either left or right side, so
that the tree becomes balanced again. i.e
| HL - HR|< = 1
There are two types of Rotation. They are:
Single Rotation
Double Rotation
Single Rotation: Rotation of the tree is performed only once either left or right side of the
unbalanced tree. Single Rotation is performed for the two cases namely left of left and right of
right
Double Rotation: Rotation of the tree is performed twice, one on the left side and the other
one on the right side and vice versa of the unbalanced tree. Double Rotation is performed for the
two cases namely left of right and right of left.
Single Rotation
Case 1: Left of Left: When the out-of-balance condition has been created by a left high
subtree of a left high tree, the tree is balanced by rotating the out-of-balance node to the right.
Diagram
Simple Right Rotation
Case 2: Right of Right:When the out-of-balance condition has been created by a right high
subtree of a right high tree, the tree is balanced by rotating the out-of-balance node to the left.
DoubleRotation
Case 3: Right of Left:Double rotation is needed for balancing the A VL tree in this case. The
tree is out of balance in which the root is left high and the left subtree is right high.
The left subtree which is right high is rotated first in the left side.
The root of the subtree which is left high is rotated second in the right side.
Basic Operations performed in an A VL Tree:
The basic operations performed in an A VL tree are as follows:
Inserting an element.
Deleting an element
Whenever these two operations are done, the balance condition has to be checked after the
operation. If the balance condition is not satisfied then rotation is performed. Single or double
rotation is performed depending upon the violation in the balance factor.
AVL Trees: Implementation
A VL tree is implemented using linked list concept. The node declaration for A VL trees is
given below:
struct AvlNode;
typedef struct AvlNode *Position;
typedefstruct AvlNode * AvlTree;
AvlTree MakeEmpty( AvlTree T);
Position Find(ElementType X, AvlTree T);
Position FindMin( AvlTree T );
Position FindMax( AvlTree T);
AvlTree Insert( ElementType X, AvlTree T);
AvlTree Delete( ElementType X, AvlTree T);
ElementType Retrieve( Position P );
struct AvlNode
{
ElementType Elemcnt; A vlTree Left; AvlTree Right;
int Height;
};
A structure AvlNode is created which contains four data elements. They are:
The data to be stored in the node.
Address of the left child
Address of the right child
Height of the node
Procedure to Compute the Height of a Node
static int Height( Position P )
{
if( P == NULL)
return -1 ; else
return P->Height;
}
Explanation
This function has a parameter P, which contains the address of the node.
If P is NULL, it implies that there is no node, hence the height is returned as -1.
Otherwise it returns the height of the node stored in the structure.
Procedure to Insert an Element Into the Tree
AvlTree lnsert( ElementType X, AvlTree T)
{
if ( T = NULL)
{
/* Create and return a one-node tree * /
T = malloc( sizeof( struct A vi Node ) ); if( T == NULL)
FatalError( "Out ofspace!!!" );
else
{ T->Element = X; T->Height- 0;
T->Left = T->Right = NULL;
}
}
else
if ( X < T ->Element )
{ T->Left = Insert( X, T->Left)
if ( Height( T ->Left ) - Height( T -> Right) = 2 )
if ( X < T ->Left->Element )
T = SingleRotateWithLeft( T)
else
T = DoubleRotate WithLeft( T );
}
else
if (X> T->Element)
{ T->Right = Insert( X, T->Right);
if( Height( T->Right ) - Height( T->Left ) == 2 )
if( X > T->Right->Element)
T = SingleRotateWithRight( T);
else
T = DoubleRotatcWithRight( T);
}
/* Else X is in the tree already; we'll do nothing * /
T->Height = Max( Height( T->Left), Height{ T>Right» + 1;
return T;
}
Explanation
1. This function has two parameters namely, the clement X to be inserted and T, the address
of the root of the tree.
2. 1fT is NULL, a new node is create p and the address is stored in T, else go to 5.
3. 1f T is NULL, it implies that memory allocation is not done and terminates.
4. Otherwise the new element X is inserted into the node T. The left and right pointer are
made NULL. The height of the node is initialized to O. Go to step 8.
5. The element X is compared with the element present in T. If X < T->element, then the
element is to be inserted in the left. A recursive function call is made with two parameters
X and T->left.
6. Otherwise if X > T->element, then the element is to be inserted in the right. A recursive
function call is made with two parameters X and T ->right.
7. Otherwise, itimplies that X is already present in the A VL. Hence the element cannot be
inserted.
8. After the element is inserted in the appropriate position, the balance factor is checked. If
the Balance factor is equal to 2, then it implies that the tree is not balanced. Hence,
rotation is to be performed.
9. The element is compared with T->left->elcment or T->right->element. If X is less, then
single rotation is performed. Otherwise, double rotation is performed.
10. After the rotation is performed, the height of each node is calculated again and the address
of the new tree is returned back to the main function.
Procedure for SingleRotation with Left
static Position SingleRotateWithLeft( Position K2 )
Position KI;
KI = K2->Left;
K2->Left = KI->Right;
Kl->Right = K2;
K2->Height = Max( Height(K2->LeH), Height(K2->Right) )+1;
Kl->Height = Max( Height(KI->Lell), K2->Height) + 1;
return Kl; /* New root */
}
Explanation
1. This function takes a parameter K2, which is the address of the node where rotation is to
be done.
2. The rotation takes place on the right side.
3. K1 becomes the root node and K2 becomes the right child ofKl.
4. The height of KI and K2 are calculated and the address of Kl is returned to the calling
function.
Procedure for Sine:le Rotation with Rie:ht:
static Position SingleRotateWithRight( Position K2 )
{
Position K 1 ;
KI = K2->right;
K2->right = K I->Ieft;
KI->left = K2:
K2->Height = Max( Height(K2->Left), Hcight(K2->Right) ) + 1;
KI->Height = Max( Height(KI->Left), K2->Hcight).oj. I;
return KI; 1* New root */
Explanation
1. This function takes a parameter K2, which is the address of the node where rotation is to
be done.
2. The rotation takes place on the left side.
3. KI becomes the root node and K2 becomes the left child of KI.
4. The height of K I and K2 are calculated and the address of K 1 is returned to the calling
function.
Proeedure for Double Rotation with Left
static Position DoubleRotateWithLcft( Position K3 )
{
/* Rotate between K I and K2 * /
K3->Left = SingleRotateWithRight( K3->Left );
/* Rotate between K3 and K2 * /
return SingleRotateWithLeft( K3 );
}
Explanation
1. This function takes a parameter K3, which is the address of the node
2. The first rotation takes place on thc len side with KJ->Ieft (K I )as Ihl.: parameter.
3. The rotation takes place between K I and K2.
4. K2 becomes the root node and Kl becomes the left child of K2.
5. The height of Kl and K2 are calculated and the address of K2 is returned to the calling
function.
6. The address of K2 is assigned to K3->left.
7. The second rotation takes between K3 and K2. The rotation takes place on the right side.
8. K2 becomes the root node and K3 becomes the right child of K2.
9. The height of K3 and K2 are calculated and the address of K2 is returned to the calling
function.
Procedure for Double Rotation with Rieht
static Position DoubleRotate WithRight( Position K 3)
{
/* Rotate between K I and K2 * /
K3->Right = SingleRotateWithRight( K3->Right );
/* Rotate between K3 and K2 >I< /
return SingleRotatcWithRight( K3 );
}
Explanation
1. This function takes a parameter K3, which is the address of the node.
2. The first rotation takes place on the right side with K3->right (Kl) as the parameter.
3. The rotation takes place between Kl and K2.
4. K2 becomes the root node and Kl becomes the right child ofK2.
5. The height of KI and K2 are calculated and the address of K2 is returned to the calling
function.
6. The address ofK2 is assigned to K3->right.
7. The second rotation takes between K3 and K2. The rotation takes place on the left side.
8. K2 becomes the root node and K3 becomes the left child of K2.
The height of K3 and K2 at'e calculated and the address of K2 is returned to the calling function.
---------------------------------------------------------------------------------------------------------------------
B-Tree
B-Tree is a self-balancing search tree. B Trees are multi-way trees. That is each node contains
a set of keys and pointers. A B Tree with four keys and five pointers represents the minimum size
of a B Tree node. B Trees are dynamic. That is, the height of the tree grows and contracts as
records are added and deleted.
In most of the other self-balancing search trees (like AVL and Red Black Trees), it is assumed
that everything is in main memory. To understand use of B-Trees, we must think of huge amount
of data that cannot fit in main memory. When the number of keys is high, the data is read from
disk in the form of blocks. Disk access time is very high compared to main memory access time.
The main idea of using B-Trees is to reduce the number of disk accesses. Most of the tree
operations (search, insert, delete, max, min, ..etc ) require O(h) disk accesses where h is height of
the tree. B-tree is a fat tree. Height of B-Trees is kept low by putting maximum possible keys in a
B-Tree node. Generally, a B-Tree node size is kept equal to the disk block size. Since h is low for
B-Tree, total disk accesses for most of the operations are reduced significantly compared to
balanced Binary Search Trees like AVL Tree, Red Black Tree, ..etc.
Properties of B-Tree
All leaves are at same level.
A B-Tree is defined by the term minimum degree ‘t’. The value of t depends upon disk
block size.
Every node except root must contain at least t-1 keys. Root may contain minimum 1 key.
All nodes (including root) may contain at most 2t – 1 keys.
Number of children of a node is equal to the number of keys in it plus 1.
All keys of a node are sorted in increasing order. The child between two keys k1 and k2
contains all keys in range from k1 and k2.
B-Tree grows and shrinks from root which is unlike Binary Search Tree. Binary Search
Trees grow downward and also shrink from downward.
Like other balanced Binary Search Trees, time complexity to search, insert and delete is
O(Logn).
Following is an example B-Tree of minimum degree 3. Note that in practical B-Trees, the
value of minimum degree is much more than 3.
Search
Search is similar to search in Binary Search Tree. Let the key to be searched be k. We start
from root and recursively traverse down. For every visited non-leaf node, if the node has key, we
simply return the node. Otherwise we recur down to the appropriate child (The child which is
just before the first greater key) of the node. If we reach a leaf node and don’t find k in the leaf
node, we return NULL.
Traverse
Traversal is also similar to Inorder traversal of Binary Tree. We start from the leftmost child,
recursively print the leftmost child, then repeat the same process for remaining children and
keys. In the end, recursively print the rightmost child.
1.2. B+ Trees
A B+ Tree combines features of B Trees. It contains index pages and data pages. The data
pages always appear as leaf nodes in the tree. The root node and intermediate nodes are always
index pages. The index pages in a B+ tree are constructed through the process of inserting and
deleting records.
Thus, B+ trees grow and contract like their B Tree counterparts. The contents and the number
of index pages reflects this growth and shrinkage. B+ Trees and B Trees use a "fill factor" to
control the growth and the shrinkage. A 50% fill factor would be the minimum for any B+ or B
tree.
As our example we use the smallest page structure. This means that our B+ tree conforms to
the following guidelines.
As this table indicates each page must have a minimum of two keys. The root page may
violate this rule. The following table shows a B+ tree. As the example illustrates this tree does
not have a full index page. (We have room for one more key and pointer in the root page.) In
addition, one of the data pages contains empty slots.
Adding a record when the leaf page is full but the index page is not.
Next, we're going to insert a record with a key value of 70 into our B+ tree. This record
should go in the leaf page containing 50, 55, 60, and 65. Unfortunately this page is full. This
means that we must split the page as follows:
The middle key of 60 is placed in the index page between 50 and 75. The following table
shows the B+ tree after the addition of 70.
Adding a record when both the leaf page and the index page are full.
As our last example, we're going to add a record containing a key value of 95 to our B+ tree.
This record belongs in the page containing 75, 80, 85, and 90. Since this page is full we split it
into two pages:
The middle key, 85, rises to the index page. Unfortunately, the index page is also full, so we
split the index page:
The following table illustrates the addition of the record containing 95 to the B+ tree.
Rotation
B+ trees can incorporate rotation to reduce the number of page splits. A rotation occurs when
a leaf page is full, but one of its sibling pages is not full. Rather than splitting the leaf page, we
move a record to its sibling, adjusting the indices as necessary. Typically, the left sibling is
checked first (if it exists) and then the right sibling. As an example, consider the B+ tree before
the addition of the record containing a key of 70. As previously stated this record belongs in the
leaf node containing 50 55 60 65. Notice that this node is full, but its left sibling is not.
Using rotation we shift the record with the lowest key to its sibling. Since this key appeared in
the index page we also modify the index page. The new B+ tree appears in the following table.
---------------------------------------------------------------------------------------------------------------
Heap
Heap data structure is a specialized binary tree based data structure. Heap is a binary tree with
special characteristics. In a heap data structure, nodes are arranged based on thier value. A heap
data structure, some time called as Binary Heap.
There are two types of heap data structures and they are as follows.
Max Heap
Min Heap
Every heap data structure has the following properties...
Property #1 (Ordering): Nodes must be arranged in a order according to values based on Max
heap or Min heap.
Property #2 (Structural): All levels in a heap must full, except last level and nodes must be
filled from left to right strictly.
Max heap data structure is a specialized full binary tree data structure except last leaf node
can be alone. In a max heap nodes are arranged based on node value.
Max heap is defined as follows...
Max heap is a specialized full binary tree in which every parent node contains greater or equal
value than its child nodes. And last leaf node can be alone.
Example
Above tree is satisfying both Ordering property and Structural property according to the Max
Heap data structure.
Operations on Max Heap
The following operations are performed on a Max heap data structure...
Finding Maximum
Insertion
Deletion
Finding Maximum Value Operation in Max Heap
Finding the node which has maximum value in a max heap is very simple. In max heap, the
root node has the maximum value than all other nodes in the max heap. So, directly we can
display root node value as maximum value in max heap.
Insertion Operation in Max Heap
Insertion Operation in max heap is performed as follows.
Step 1: Insert the newNode as last leaf from left to right.
Step 2: Compare newNode value with its Parent node.
Step 3: If newNode value is greater than its parent, then swap both of them.
Step 4: Repeat step 2 and step 3 until newNode value is less than its parent nede (or)
newNode reached to root.
Example
Consider the above max heap. Insert a new node with value 85.
Step 1: Insert the newNode with value 85 as last leaf from left to right. That means newNode
is added as a right child of node with value 75. After adding max heap is as follows:
Step 2: Compare newNode value (85) with its Parent node value (75). That means 85 >
75.
Step 3: Here new Node value (85) is greater than its parent value (75), then swap both of
them. After swapping, max heap is as follows.
Step 4: Now, again compare newNode value (85) with its parent nede value (89).
Here, newNode value (85) is smaller than its parent node value (89). So, we stop insertion
process. Finally, max heap after insetion of a new node with value 85 is as follows
Step 2: Delete last node. Here node with value 90. After deleting node with value 90 from
heap, max heap is as follows.
Step 3: Compare root node (75) with its left child (89).
Here, root value (75) is smaller than its left child value (89). So, compare left child (89) with
its right sibling (70).
Step 4: Here, left child value (89) is larger than its right sibling (70), So, swap root (75) with
left child (89).
Here, node with value 75 is larger than its left child. So, we compare node with value 75 is
compared with its right child 85.
Step 6: Here, node with value 75 is smaller than its right child (85). So, we swap both of
them. After swapping max heap is as follows.
Step 7: Now, compare node with value 75 with its left child (15).
Here, node with value 75 is larger than its left child (15) and it does not have right child. So
we stop the process.
Finally, max heap after deleting root node (90) is as follows...
UNIT IV
Non Linear Data Structures – Graphs
Definition – Representation of Graph – Types of graph - Breadth-first traversal - Depth-first
traversal – Topological Sort – Bi-connectivity – Cut vertex – Euler circuits – Applications of
graphs.
Definition
Graph is a nonlinear data structure.
It contains a set of points known as nodes (or vertices) and set of links known as edges (or
Arcs) which connects the vertices.
Generally, a graph G is represented as G = ( V , E ), where V is set of vertices and E is set
of edges.
Example
The following is a graph with 5 vertices and 6 edges.
This graph G can be defined as G = ( V , E )
Where V = {A,B,C,D,E} and E = {(A,B),(A,C)(A,D),(B,D),(C,D),(B,E),(E,D)}.
Graph Terminology
We use the following terms in graph data structure...
Vertex
A individual data element of a graph is called as Vertex. Vertex is also known as node. In
above example graph, A, B, C, D & E are known as vertices.
Edge
An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is
represented as (startingVertex, endingVertex). For example, in above graph, the link between
vertices A and B is represented as (A,B). In above example graph, there are 7 edges (i.e., (A,B),
(A,C), (A,D), (B,D), (B,E), (C,D), (D,E)).
Edges are three types:
Undirected Edge - An undirected egde is a bidirectional edge. If there is a undirected edge
between vertices A and B then edge (A , B) is equal to edge (B , A).
Directed Edge - A directed egde is a unidirectional edge. If there is a directed edge between
vertices A and B then edge (A , B) is not equal to edge (B , A).
Weighted Edge - A weighted egde is an edge with cost on it.
Undirected Graph
A graph with only undirected edges is said to be undirected graph.
Directed Graph
A graph with only directed edges is said to be directed graph.
Mixed Graph
A graph with undirected and directed edges is said to be mixed graph.
End vertices or Endpoints
The two vertices joined by an edge are called the end vertices (or endpoints) of the edge.
Origin
If an edge is directed, its first endpoint is said to be origin of it.
Destination
If an edge is directed, its first endpoint is said to be origin of it and the other endpoint is said
to be the destination of the edge.
Adjacent
If there is an edge between vertices A and B then both A and B are said to be adjacent. In
other words, Two vertices A and B are said to be adjacent if there is an edge whose end vertices
are A and B.
Incident
An edge is said to be incident on a vertex if the vertex is one of the endpoints of that edge.
Outgoing Edge
A directed edge is said to be outgoing edge on its orign vertex.
Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
Parallel Edges or Multiple Edges
If there are two undirected edges to have the same end vertices, and for two directed edges to
have the same origin and the same destination. Such edges are called parallel edges or multiple
edges.
Self-Loop
An edge (undirected or directed) is a self-loop if its two endpoints coincide.
Simple Graph
A graph is said to be simple if there are no parallel and self-loop edges.
Path
A path is a sequence of alternating vertices and edges that starts at a vertex and ends at a
vertex such that each edge is incident to its predecessor and successor vertex.
-----------------------------------------------------------------------------------------------------------------
--------------------
Graph Representations
Graph data structure is represented using following representations...
Adjacency Matrix
Adjacency List
Adjacency Matrix
Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph.
Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to
vertex j. Adjacency matrix for undirected graph is always symmetric. Adjacency Matrix is also
used to represent weighted graphs. If adj[i][j] = w, then there is an edge from vertex i to vertex j
with weight w.
For example, consider the following undirected graph representation...
---------------------------------------------------------------------------------------------------------------------
---------------------
Types of Graphs
There are various types of graphs depending upon the number of vertices, number of edges,
interconnectivity, and their overall structure. We will discuss only a certain few important types
of graphs in this chapter.
Null Graph
A graph having no edges is called a Null Graph.
Example
In the above graph, there are three vertices named ‘a’, ‘b’, and ‘c’, but there are no edges
among them. Hence it is a Null Graph.
Trivial Graph
A graph with only one vertex is called a Trivial Graph.
Example
In the above shown graph, there is only one vertex ‘a’ with no other edges. Hence it is a
Trivial graph.
Non-Directed Graph
A non-directed graph contains edges but the edges are not directed ones.
Example
In this graph, ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’ are the vertices, and ‘ab’, ‘bc’, ‘cd’, ‘da’, ‘ag’, ‘gf’,
‘ef’ are the edges of the graph. Since it is a non-directed graph, the edges ‘ab’ and ‘ba’ are same.
Similarly other edges also considered in the same way.
Directed Graph
In a directed graph, each edge has a direction.
Example
In the above graph, we have seven vertices ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, and ‘g’, and eight edges
‘ab’, ‘cb’, ‘dc’, ‘ad’, ‘ec’, ‘fe’, ‘gf’, and ‘ga’. As it is a directed graph, each edge bears an arrow
mark that shows its direction. Note that in a directed graph, ‘ab’ is different from ‘ba’.
Simple Graph
A graph with no loops and no parallel edges is called a simple graph.
The maximum number of edges possible in a single graph with ‘n’ vertices
is nC2 where nC2 = n(n – 1)/2.
The number of simple graphs possible with ‘n’ vertices = 2nc2 = 2n(n-1)/2.
Example
In the following graph, there are 3 vertices with 3 edges which is maximum excluding the
parallel edges and loops. This can be proved by using the above formulae.
Connected Graph
A graph G is said to be connected if there exists a path between every pair of vertices.
There should be at least one edge for every vertex in the graph. So that we can say that it is
connected to some other vertex at the other side of the edge.
Example
In the following graph, each vertex has its own edge connected to other edge. Hence it is a
connected graph.
Disconnected Graph
A graph G is disconnected, if it does not contain at least two connected vertices.
Example 1
The following graph is an example of a Disconnected Graph, where there are two
components, one with ‘a’, ‘b’, ‘c’, ‘d’ vertices and another with ‘e’, ’f’, ‘g’, ‘h’ vertices.
The two components are independent and not connected to each other. Hence it is called
disconnected graph.
Example 2
In this example, there are two independent components, a-b-f-e and c-d, which are not
connected to each other. Hence this is a disconnected graph.
Regular Graph
A graph G is said to be regular, if all its vertices have the same degree. In a graph, if the
degree of each vertex is ‘k’, then the graph is called a ‘k-regular graph’.
Example
In the following graphs, all the vertices have the same degree. So these graphs are called
regular graphs.
In both the graphs, all the vertices have degree 2. They are called 2-Regular Graphs.
Complete Graph
A simple graph with ‘n’ mutual vertices is called a complete graph and it is denoted by ‘Kn’.
In the graph, a vertex should have edges with all other vertices, then it called a complete
graph.
In other words, if a vertex is connected to all other vertices in a graph, then it is called a
complete graph.
Example
In the following graphs, each vertex in the graph is connected with all the remaining vertices
in the graph except by itself.
In graph I,
a b c
a Not Connected Connected Connected
b Connected Not Connected Connected
c Connected Connected Not Connected
In graph II,
p q r s
p Not Connected Connected Connected Connected
q Connected Not Connected Connected Connected
r Connected Connected Not Connected Connected
s Connected Connected Connected Not Connected
Cycle Graph
A simple graph with ‘n’ vertices (n >= 3) and ‘n’ edges is called a cycle graph if all its edges
form a cycle of length ‘n’.
If the degree of each vertex in the graph is two, then it is called a Cycle Graph.
Notation − Cn
Example
Take a look at the following graphs:
Graph I has 3 vertices with 3 edges which is forming a cycle ‘ab-bc-ca’.
Graph II has 4 vertices with 4 edges which is forming a cycle ‘pq-qs-sr-rp’.
Graph III has 5 vertices with 5 edges which is forming a cycle ‘ik-km-ml-lj-ji’.
In the above example graph, we have two cycles a-b-c-d-a and c-f-g-e-c. Hence it is called a
cyclic graph.
Acyclic Graph
A graph with no cycles is called an acyclic graph.
Example
In the above example graph, we do not have any cycles. Hence it is a non-cyclic graph.
Bipartite Graph
A simple graph G = (V, E) with vertex partition V = {V 1, V2} is called a bipartite graph if
every edge of E joins a vertex in V1 to a vertex in V2.
In general, a Bipertite graph has two sets of vertices, let us say, V 1 and V2, and if an edge is
drawn, it should connect any vertex in set V1 to any vertex in set V2.
Example
In this graph, you can observe two sets of vertices − V 1 and V2. Here, two edges named ‘ae’
and ‘bd’ are connecting the vertices of two sets V1 and V2.
Complete Bipartite Graph
A bipartite graph ‘G’, G = (V, E) with partition V = {V 1, V2} is said to be a complete bipartite
graph if every vertex in V1 is connected to every vertex of V2.
In general, a complete bipartite graph connects each vertex from set V1 to each vertex from set
V2.
Example
The following graph is a complete bipartite graph because it has edges connecting each vertex
from set V1 to each vertex from set V2.
If |V1| = m and |V2| = n, then the complete bipartite graph is denoted by Km, n.
Km,n has (m+n) vertices and (mn) edges.
Km,n is a regular graph if m=n.
In general, a complete bipartite graph is not a complete graph.
Km,n is a complete graph if m=n=1.
The maximum number of edges in a bipartite graph with n vertices is
[
n2 4
]
If n=10, k5, 5= ⌊n2 4⌋ = ⌊102 4⌋ = 25
Similarly K6, 4=24
K7, 3=21
K8, 2=16
K9, 1=9
If n=9, k5, 4 = ⌊n2 4⌋ = ⌊92 4⌋ = 20
Similarly K6, 3=18
K7, 2=14
K8, 1=8
‘G’ is a bipartite graph if ‘G’ has no cycles of odd length. A special case of bipartite graph is
a star graph.
Star Graph
A complete bipartite graph of the form K 1, n-1 is a star graph with n-vertices. A star graph is a
complete bipartite graph if a single vertex belongs to one set and all the remaining vertices
belong to the other set.
Example
In the above graphs, out of ‘n’ vertices, all the ‘n–1’ vertices are connected to a single vertex.
Hence it is in the form of K1, n-1 which are star graphs.
Complement of a Graph
Let 'G−' be a simple graph with some vertices as that of ‘G’ and an edge {U, V} is present
in 'G−', if the edge is not present in G. It means, two vertices are adjacent in 'G−' if the two
vertices are not adjacent in G.
If the edges that exist in graph I are absent in another graph II, and if both graph I and graph II
are combined together to form a complete graph, then graph I and graph II are called
complements of each other.
Example
In the following example, graph-I has two edges ‘cd’ and ‘bd’. Its complement graph-II has
four edges.
Note that the edges in graph-I are not present in graph-II and vice versa. Hence, the
combination of both the graphs gives a complete graph of ‘n’ vertices.
Note − A combination of two complementary graphs gives a complete graph.
If ‘G’ is any simple graph, then
|E(G)| + |E('G-')| = |E(Kn)|, where n = number of vertices in the graph.
Example
Let ‘G’ be a simple graph with nine vertices and twelve edges, find the number of edges
in 'G-'.
You have, |E(G)| + |E('G-')| = |E(Kn)|
12 + |E('G-')| =
9(9-1) / 2 = 9C2
12 + |E('G-')| = 36
|E('G-')| = 24
‘G’ is a simple graph with 40 edges and its complement 'G−' has 38 edges. Find the number
of vertices in the graph G or 'G−'.
Let the number of vertices in the graph be ‘n’.
We have, |E(G)| + |E('G-')| = |E(Kn)|
40 + 38 = n(n-1)2
156 = n(n-1)
13(12) = n(n-1)
n = 13
----------------------------------------------------------------------------------------------------------------
-------------------
Graph Traversals
Graph traversal is technique used for searching a vertex in a graph. The graph traversal is also
used to decide the order of vertices to be visit in the search process.
A graph traversal finds the egdes to be used in the search process without creating loops that
means using graph traversal we visit all verticces of graph without getting into looping path.
There are two graph traversal techniques and they are as follows...
1. DFS (Depth First Search)
2. BFS (Breadth First Search)
Depth First Search
The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves
exhaustive searches of all the nodes by going ahead, if possible, else by backtracking.
Here, the word backtrack means that when you are moving forward and there are no more
nodes along the current path, you move backwards on the same path to find nodes to traverse. All
the nodes will be visited on the current path till all the unvisited nodes have been traversed after
which the next path will be selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as follows:
Pick a starting node and push all its adjacent nodes into a stack.
Pop a node from stack to select the next node to visit and push all its adjacent nodes into a
stack. Repeat this process until the stack is empty.
However, ensure that the nodes that are visited are marked. This will prevent you from
visiting the same node more than once.
If you do not mark the nodes that are visited and you visit the same node more than once, you
may end up in an infinite loop.
Pseudocode
Example
2
Mark S as visited and put it onto the stack. Explore any unvisited
adjacent node from S. We have three nodes and we can pick any of
them. For this example, we shall take the node in an alphabetical
order.
3
Mark A as visited and put it onto the stack. Explore any unvisited
adjacent node from A. Both Sand D are adjacent to A but we are
concerned for unvisited nodes only.
4 Visit D and mark it as visited and put onto the stack. Here, we
have B and C nodes, which are adjacent to D and both are
unvisited. However, we shall again choose in an alphabetical
order.
5
We choose B, mark it as visited and put onto the stack.
Here Bdoes not have any unvisited adjacent node. So, we
pop Bfrom the stack.
6 We check the stack top for return to the previous node and check if
it has any unvisited nodes. Here, we find D to be on the top of the
stack.
7
Only unvisited adjacent node is from D is C now. So we visit C,
mark it as visited and put it onto the stack.
As C does not have any unvisited adjacent node so we keep popping the stack until we find a
node that has an unvisited adjacent node. In this case, there's none and we keep popping until the
stack is empty.
Implementation in C
#include <stdio.h>
#include <stdlib.h>
/* ADJACENCY MATRIX */
int source,V,E,time,visited[20],G[20][20];
void DFS(int i)
{
int j;
visited[i]=1;
printf(" %d->",i+1);
for(j=0;j<V;j++)
{
if(G[i][j]==1&&visited[j]==0)
DFS(j);
}
}
int main()
{
int i,j,v1,v2;
printf("\t\t\tGraphs\n");
printf("Enter the no of edges:");
scanf("%d",&E);
printf("Enter the no of vertices:");
scanf("%d",&V);
for(i=0;i<V;i++)
{
for(j=0;j<V;j++)
G[i][j]=0;
}
/* creating edges :P */
for(i=0;i<E;i++)
{
printf("Enter the edges (format: V1 V2) : ");
scanf("%d%d",&v1,&v2);
G[v1-1][v2-1]=1;
}
for(i=0;i<V;i++)
{
for(j=0;j<V;j++)
printf(" %d ",G[i][j]);
printf("\n");
}
printf("Enter the source: ");
scanf("%d",&source);
DFS(source-1);
return 0;
}
mark s as visited.
while ( Q is not empty)
//Removing that vertex from queue,whose neighbour will be visited
now
v = [Link]( )
Example
3
We then see an unvisited adjacent node from S. In this example,
we have three nodes but alphabetically we choose A, mark it as
visited and enqueue it.
4
Next, the unvisited adjacent node from S is B. We mark it as
visited and enqueue it.
5
Next, the unvisited adjacent node from S is C. We mark it as
visited and enqueue it.
6
Now, S is left with no unvisited adjacent nodes. So, we dequeue
and find A.
7
From A we have D as unvisited adjacent node. We mark it as
visited and enqueue it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we
keep on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program
is over.
Implementation in C
#include<stdio.h>
int G[20][20],q[20],visited[20],n,front = 1, rear = 0 ;
void bfs(int v)
{
int i;
visited[v] = 1;
for(i=1;i<=n;i++)
if(G[v][i] && !visited[i])
q[++rear]=i;
if(front <= rear)
bfs(q[front++]);
}
int main()
{
int v,i,j;
printf("\n Enter the number of vertices:");
scanf("%d",&n);
for(i=1;i<=n;i++)
{
q[i]=0;
visited[i]=0;
}
printf("\n Enter graph data in matrix form:\n");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
scanf("%d",&G[i][j]);
printf("\n Enter the starting vertex:");
scanf("%d",&v);
bfs(v);
printf("\n The nodes which are reachable are:\n");
for(i=1;i<=n;i++)
if(visited[i])
printf("%d\t",i);
else
printf("\n %d is not reachable",i);
return 0;
}
-------------------------------------------------------------------------------------------------------------------
Topological Sorting
Definition
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such
that for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a
graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be more
than one topological sorting for a graph. For example, another topological sorting of the
following graph is “4 5 2 3 1 0”. The first vertex in topological sorting is always a vertex with in-
degree as 0 (a vertex with no in-coming edges).
Topological Sort Example
Consider the following graph “D”. Find the topological ordering of this graph “G”.
int main(){
int i,j,k,n,a[10][10],indeg[10],flag[10],count=0;
for(i=0;i<n;i++){
indeg[i]=0;
flag[i]=0;
}
for(i=0;i<n;i++)
for(j=0;j<n;j++)
indeg[i]=indeg[i]+a[j][i];
while(count<n){
for(k=0;k<n;k++){
if((indeg[k]==0) && (flag[k]==0)){
printf("%d ",(k+1));
flag [k]=1;
}
for(i=0;i<n;i++){
if(a[i][k]==1)
indeg[k]--;
}
}
count++;
}
return 0;
}
Output
Enter the no of vertices:
4
Enter the adjacency matrix:
Enter row 1
0110
Enter row 2
0001
Enter row 3
0001
Enter row 4
0000
The topological order is:1 2 3 4
---------------------------------------------------------------------------------------------------------------------
------------------
Bi-Connectivity
Connectivity
Connectivity in an undirected graph means that every vertex can reach every other vertex via
any path. If the graph is not connected the graph can be broken down into Connected
Components.
Strong Connectivity
Strong Connectivity applies only to directed graphs. A directed graph is strongly connected if
there is a directed path from any vertex to every other vertex. This is same as connectivity in an
undirected graph, the only difference being strong connectivity applies to directed graphs and
there should be directed paths instead of just paths. Similar to connected components, a directed
graph can be broken down into Strongly Connected Components.
If in the above graph, vertex 1 and all the edges associated with it, i.e. the edges 1-0, 1-2 and
1-3 are removed, there will be no path to reach any of the vertices 2, 3 or 4 from the vertices 0
and 5, that means the graph will split into two separate components. One consisting of the
vertices 0 and 5 and another one consisting of the vertices 2, 3 and 4 as shown in the following
figure.
Likewise removing the vertex 0 will disconnect the vertex 5 from all other vertices. Hence the
given graph has two articulation points: 0 and 1.
---------------------------------------------------------------------------------------------------------------------
-----------------------
Euler Circuits
Euler Path
An Euler path is a path that uses every edge of a graph exactly once.
Euler Circuit
An Euler circuit is a circuit that uses every edge of a graph exactly once.
An Euler path starts and ends at different vertices. An Euler circuit starts and ends at the same
vertex.
-------------------------------------------------------------------------------------------------------------------------
----------------------
Applications of Graphs
Graphs are nothing but connected nodes(vertex). So any network related, routing, finding
relation, path etc related real life applications use graphs.
Since they are powerful abstractions, graphs can be very important in modeling data. In fact,
many problems can be reduced to known graph problems. Here we outline just some of the many
applications of graphs.
1. Social network graphs: to tweet or not to tweet. Graphs that represent who knows whom,
who communicates with whom, who influences whom or other relationships in social
structures. An example is the twitter graph of who follows whom. These can be used to
determine how information flows, how topics become hot, how communities develop, or
even who might be a good match for who, or is that whom.
2. Transportation networks. In road networks vertices are intersections and edges are the
road segments between them, and for public transportation networks vertices are stops and
edges are the links between them. Such networks are used by many map programs such as
Google maps, Bing maps and now Apple IOS 6 maps (well perhaps without the public
transport) to find the best routes between locations. They are also used for studying traffic
patterns, traffic light timings, and many aspects of transportation.
3. Utility graphs. The power grid, the Internet, and the water network are all examples of
graphs where vertices represent connection points, and edges the wires or pipes between
them. Analyzing properties of these graphs is very important in understanding the
reliability of such utilities under failure or attack, or in minimizing the costs to build
infrastructure that matches required demands.
4. Document link graphs. The best known example is the link graph of the web, where each
web page is a vertex, and each hyperlink a directed edge. Link graphs are used, for
example, to analyze relevance of web pages, the best sources of information, and good
link sites.
5. Protein-protein interactions graphs. Vertices represent proteins and edges represent
interactions between them that carry out some biological function in the cell. These graphs
can be used, for example, to study molecular pathways—chains of molecular interactions
in a cellular process. Humans have over 120K proteins with millions of interactions
among them.
6. Network packet traffic graphs. Vertices are IP (Internet protocol) addresses and edges are
the packets that flow between them. Such graphs are used for analyzing network security,
studying the spread of worms, and tracking criminal or non-criminal activity.
7. Scene graphs. In graphics and computer games scene graphs represent the logical or
spacial relationships between objects in a scene. Such graphs are very important in the
computer games industry.
8. Finite element meshes. In engineering many simulations of physical systems, such as the
flow of air over a car or airplane wing, the spread of earthquakes through the ground, or
the structural vibrations of a building, involve partitioning space into discrete elements.
The elements along with the connections between adjacent elements forms a graph that is
called a finite element mesh.
9. Robot planning. Vertices represent states the robot can be in and the edges the possible
transitions between the states. This requires approximating continuous motion as a
sequence of discrete steps. Such graph plans are used, for example, in planning paths for
autonomous vehicles.
10. Neural networks. Vertices represent neurons and edges the synapses between them. Neural
networks are used to understand how our brain works and how connections change when
we learn. The human brain has about 1011 neurons and close to 1015 synapses.
11. Graphs in quantum field theory. Vertices represent states of a quantum system and the
edges the transitions between them. The graphs can be used to analyze path integrals and
summing these up generates a quantum amplitude (yes, I have no idea what that means).
12. Semantic networks. Vertices represent words or concepts and edges represent the
relationships among the words or concepts. These have been used in various models of
how humans organize their knowledge, and how machines might simulate such an
organization.
13. Graphs in epidemiology. Vertices represent individuals and directed edges the transfer of
an infectious disease from one individual to another. Analyzing such graphs has become
an important component in understanding and controlling the spread of diseases.
14. Graphs in compilers. Graphs are used extensively in compilers. They can be used for type
inference, for so called data flow analysis, register allocation and many other purposes.
They are also used in specialized compilers, such as query optimization in database
languages.
15. Constraint graphs. Graphs are often used to represent constraints among items. For
example the GSM network for cell phones consists of a collection of overlapping cells.
Any pair of cells that overlap must operate at different frequencies. These constraints can
be modeled as a graph where the cells are vertices and edges are placed between cells that
overlap.
16. Dependence graphs. Graphs can be used to represent dependences or precedences among
items. Such graphs are often used in large projects in laying out what components rely on
other components and used to minimize the total time or cost to completion while abiding
by the dependences.
UNIT IV
Non Linear Data Structures – Graphs
Definition – Representation of Graph – Types of graph - Breadth-first traversal - Depth-first
traversal – Topological Sort – Bi-connectivity – Cut vertex – Euler circuits – Applications of
graphs.
Definition
Graph is a nonlinear data structure.
It contains a set of points known as nodes (or vertices) and set of links known as edges (or
Arcs) which connects the vertices.
Generally, a graph G is represented as G = ( V , E ), where V is set of vertices and E is set
of edges.
Example
The following is a graph with 5 vertices and 6 edges.
This graph G can be defined as G = ( V , E )
Where V = {A,B,C,D,E} and E = {(A,B),(A,C)(A,D),(B,D),(C,D),(B,E),(E,D)}.
Graph Terminology
We use the following terms in graph data structure...
Vertex
A individual data element of a graph is called as Vertex. Vertex is also known as node. In
above example graph, A, B, C, D & E are known as vertices.
Edge
An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is
represented as (startingVertex, endingVertex). For example, in above graph, the link between
vertices A and B is represented as (A,B). In above example graph, there are 7 edges (i.e., (A,B),
(A,C), (A,D), (B,D), (B,E), (C,D), (D,E)).
Edges are three types:
Undirected Edge - An undirected egde is a bidirectional edge. If there is a undirected edge
between vertices A and B then edge (A , B) is equal to edge (B , A).
Directed Edge - A directed egde is a unidirectional edge. If there is a directed edge between
vertices A and B then edge (A , B) is not equal to edge (B , A).
Weighted Edge - A weighted egde is an edge with cost on it.
Undirected Graph
A graph with only undirected edges is said to be undirected graph.
Directed Graph
A graph with only directed edges is said to be directed graph.
Mixed Graph
A graph with undirected and directed edges is said to be mixed graph.
End vertices or Endpoints
The two vertices joined by an edge are called the end vertices (or endpoints) of the edge.
Origin
If an edge is directed, its first endpoint is said to be origin of it.
Destination
If an edge is directed, its first endpoint is said to be origin of it and the other endpoint is said
to be the destination of the edge.
Adjacent
If there is an edge between vertices A and B then both A and B are said to be adjacent. In
other words, Two vertices A and B are said to be adjacent if there is an edge whose end vertices
are A and B.
Incident
An edge is said to be incident on a vertex if the vertex is one of the endpoints of that edge.
Outgoing Edge
A directed edge is said to be outgoing edge on its orign vertex.
Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
Parallel Edges or Multiple Edges
If there are two undirected edges to have the same end vertices, and for two directed edges to
have the same origin and the same destination. Such edges are called parallel edges or multiple
edges.
Self-Loop
An edge (undirected or directed) is a self-loop if its two endpoints coincide.
Simple Graph
A graph is said to be simple if there are no parallel and self-loop edges.
Path
A path is a sequence of alternating vertices and edges that starts at a vertex and ends at a
vertex such that each edge is incident to its predecessor and successor vertex.
-----------------------------------------------------------------------------------------------------------------
--------------------
Graph Representations
Graph data structure is represented using following representations...
Adjacency Matrix
Adjacency List
Adjacency Matrix
Adjacency Matrix is a 2D array of size V x V where V is the number of vertices in a graph.
Let the 2D array be adj[][], a slot adj[i][j] = 1 indicates that there is an edge from vertex i to
vertex j. Adjacency matrix for undirected graph is always symmetric. Adjacency Matrix is also
used to represent weighted graphs. If adj[i][j] = w, then there is an edge from vertex i to vertex j
with weight w.
For example, consider the following undirected graph representation...
---------------------------------------------------------------------------------------------------------------------
---------------------
Types of Graphs
There are various types of graphs depending upon the number of vertices, number of edges,
interconnectivity, and their overall structure. We will discuss only a certain few important types
of graphs in this chapter.
Null Graph
A graph having no edges is called a Null Graph.
Example
In the above graph, there are three vertices named ‘a’, ‘b’, and ‘c’, but there are no edges
among them. Hence it is a Null Graph.
Trivial Graph
A graph with only one vertex is called a Trivial Graph.
Example
In the above shown graph, there is only one vertex ‘a’ with no other edges. Hence it is a
Trivial graph.
Non-Directed Graph
A non-directed graph contains edges but the edges are not directed ones.
Example
In this graph, ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’ are the vertices, and ‘ab’, ‘bc’, ‘cd’, ‘da’, ‘ag’, ‘gf’,
‘ef’ are the edges of the graph. Since it is a non-directed graph, the edges ‘ab’ and ‘ba’ are same.
Similarly other edges also considered in the same way.
Directed Graph
In a directed graph, each edge has a direction.
Example
In the above graph, we have seven vertices ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, and ‘g’, and eight edges
‘ab’, ‘cb’, ‘dc’, ‘ad’, ‘ec’, ‘fe’, ‘gf’, and ‘ga’. As it is a directed graph, each edge bears an arrow
mark that shows its direction. Note that in a directed graph, ‘ab’ is different from ‘ba’.
Simple Graph
A graph with no loops and no parallel edges is called a simple graph.
The maximum number of edges possible in a single graph with ‘n’ vertices
is nC2 where nC2 = n(n – 1)/2.
The number of simple graphs possible with ‘n’ vertices = 2nc2 = 2n(n-1)/2.
Example
In the following graph, there are 3 vertices with 3 edges which is maximum excluding the
parallel edges and loops. This can be proved by using the above formulae.
Connected Graph
A graph G is said to be connected if there exists a path between every pair of vertices.
There should be at least one edge for every vertex in the graph. So that we can say that it is
connected to some other vertex at the other side of the edge.
Example
In the following graph, each vertex has its own edge connected to other edge. Hence it is a
connected graph.
Disconnected Graph
A graph G is disconnected, if it does not contain at least two connected vertices.
Example 1
The following graph is an example of a Disconnected Graph, where there are two
components, one with ‘a’, ‘b’, ‘c’, ‘d’ vertices and another with ‘e’, ’f’, ‘g’, ‘h’ vertices.
The two components are independent and not connected to each other. Hence it is called
disconnected graph.
Example 2
In this example, there are two independent components, a-b-f-e and c-d, which are not
connected to each other. Hence this is a disconnected graph.
Regular Graph
A graph G is said to be regular, if all its vertices have the same degree. In a graph, if the
degree of each vertex is ‘k’, then the graph is called a ‘k-regular graph’.
Example
In the following graphs, all the vertices have the same degree. So these graphs are called
regular graphs.
In both the graphs, all the vertices have degree 2. They are called 2-Regular Graphs.
Complete Graph
A simple graph with ‘n’ mutual vertices is called a complete graph and it is denoted by ‘Kn’.
In the graph, a vertex should have edges with all other vertices, then it called a complete
graph.
In other words, if a vertex is connected to all other vertices in a graph, then it is called a
complete graph.
Example
In the following graphs, each vertex in the graph is connected with all the remaining vertices
in the graph except by itself.
In graph I,
a b c
a Not Connected Connected Connected
b Connected Not Connected Connected
c Connected Connected Not Connected
In graph II,
p q r s
p Not Connected Connected Connected Connected
q Connected Not Connected Connected Connected
r Connected Connected Not Connected Connected
s Connected Connected Connected Not Connected
Cycle Graph
A simple graph with ‘n’ vertices (n >= 3) and ‘n’ edges is called a cycle graph if all its edges
form a cycle of length ‘n’.
If the degree of each vertex in the graph is two, then it is called a Cycle Graph.
Notation − Cn
Example
Take a look at the following graphs:
Graph I has 3 vertices with 3 edges which is forming a cycle ‘ab-bc-ca’.
Graph II has 4 vertices with 4 edges which is forming a cycle ‘pq-qs-sr-rp’.
Graph III has 5 vertices with 5 edges which is forming a cycle ‘ik-km-ml-lj-ji’.
In the above example graph, we have two cycles a-b-c-d-a and c-f-g-e-c. Hence it is called a
cyclic graph.
Acyclic Graph
A graph with no cycles is called an acyclic graph.
Example
In the above example graph, we do not have any cycles. Hence it is a non-cyclic graph.
Bipartite Graph
A simple graph G = (V, E) with vertex partition V = {V 1, V2} is called a bipartite graph if
every edge of E joins a vertex in V1 to a vertex in V2.
In general, a Bipertite graph has two sets of vertices, let us say, V 1 and V2, and if an edge is
drawn, it should connect any vertex in set V1 to any vertex in set V2.
Example
In this graph, you can observe two sets of vertices − V 1 and V2. Here, two edges named ‘ae’
and ‘bd’ are connecting the vertices of two sets V1 and V2.
Complete Bipartite Graph
A bipartite graph ‘G’, G = (V, E) with partition V = {V 1, V2} is said to be a complete bipartite
graph if every vertex in V1 is connected to every vertex of V2.
In general, a complete bipartite graph connects each vertex from set V1 to each vertex from set
V2.
Example
The following graph is a complete bipartite graph because it has edges connecting each vertex
from set V1 to each vertex from set V2.
If |V1| = m and |V2| = n, then the complete bipartite graph is denoted by Km, n.
Km,n has (m+n) vertices and (mn) edges.
Km,n is a regular graph if m=n.
In general, a complete bipartite graph is not a complete graph.
Km,n is a complete graph if m=n=1.
The maximum number of edges in a bipartite graph with n vertices is
[
n2 4
]
If n=10, k5, 5= ⌊n2 4⌋ = ⌊102 4⌋ = 25
Similarly K6, 4=24
K7, 3=21
K8, 2=16
K9, 1=9
If n=9, k5, 4 = ⌊n2 4⌋ = ⌊92 4⌋ = 20
Similarly K6, 3=18
K7, 2=14
K8, 1=8
‘G’ is a bipartite graph if ‘G’ has no cycles of odd length. A special case of bipartite graph is
a star graph.
Star Graph
A complete bipartite graph of the form K 1, n-1 is a star graph with n-vertices. A star graph is a
complete bipartite graph if a single vertex belongs to one set and all the remaining vertices
belong to the other set.
Example
In the above graphs, out of ‘n’ vertices, all the ‘n–1’ vertices are connected to a single vertex.
Hence it is in the form of K1, n-1 which are star graphs.
Complement of a Graph
Let 'G−' be a simple graph with some vertices as that of ‘G’ and an edge {U, V} is present
in 'G−', if the edge is not present in G. It means, two vertices are adjacent in 'G−' if the two
vertices are not adjacent in G.
If the edges that exist in graph I are absent in another graph II, and if both graph I and graph II
are combined together to form a complete graph, then graph I and graph II are called
complements of each other.
Example
In the following example, graph-I has two edges ‘cd’ and ‘bd’. Its complement graph-II has
four edges.
Note that the edges in graph-I are not present in graph-II and vice versa. Hence, the
combination of both the graphs gives a complete graph of ‘n’ vertices.
Note − A combination of two complementary graphs gives a complete graph.
If ‘G’ is any simple graph, then
|E(G)| + |E('G-')| = |E(Kn)|, where n = number of vertices in the graph.
Example
Let ‘G’ be a simple graph with nine vertices and twelve edges, find the number of edges
in 'G-'.
You have, |E(G)| + |E('G-')| = |E(Kn)|
12 + |E('G-')| =
9(9-1) / 2 = 9C2
12 + |E('G-')| = 36
|E('G-')| = 24
‘G’ is a simple graph with 40 edges and its complement 'G−' has 38 edges. Find the number
of vertices in the graph G or 'G−'.
Let the number of vertices in the graph be ‘n’.
We have, |E(G)| + |E('G-')| = |E(Kn)|
40 + 38 = n(n-1)2
156 = n(n-1)
13(12) = n(n-1)
n = 13
----------------------------------------------------------------------------------------------------------------
-------------------
Graph Traversals
Graph traversal is technique used for searching a vertex in a graph. The graph traversal is also
used to decide the order of vertices to be visit in the search process.
A graph traversal finds the egdes to be used in the search process without creating loops that
means using graph traversal we visit all verticces of graph without getting into looping path.
There are two graph traversal techniques and they are as follows...
1. DFS (Depth First Search)
2. BFS (Breadth First Search)
Depth First Search
The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves
exhaustive searches of all the nodes by going ahead, if possible, else by backtracking.
Here, the word backtrack means that when you are moving forward and there are no more
nodes along the current path, you move backwards on the same path to find nodes to traverse. All
the nodes will be visited on the current path till all the unvisited nodes have been traversed after
which the next path will be selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as follows:
Pick a starting node and push all its adjacent nodes into a stack.
Pop a node from stack to select the next node to visit and push all its adjacent nodes into a
stack. Repeat this process until the stack is empty.
However, ensure that the nodes that are visited are marked. This will prevent you from
visiting the same node more than once.
If you do not mark the nodes that are visited and you visit the same node more than once, you
may end up in an infinite loop.
Pseudocode
Example
2
Mark S as visited and put it onto the stack. Explore any unvisited
adjacent node from S. We have three nodes and we can pick any of
them. For this example, we shall take the node in an alphabetical
order.
3
Mark A as visited and put it onto the stack. Explore any unvisited
adjacent node from A. Both Sand D are adjacent to A but we are
concerned for unvisited nodes only.
4 Visit D and mark it as visited and put onto the stack. Here, we
have B and C nodes, which are adjacent to D and both are
unvisited. However, we shall again choose in an alphabetical
order.
5
We choose B, mark it as visited and put onto the stack.
Here Bdoes not have any unvisited adjacent node. So, we
pop Bfrom the stack.
6 We check the stack top for return to the previous node and check if
it has any unvisited nodes. Here, we find D to be on the top of the
stack.
7
Only unvisited adjacent node is from D is C now. So we visit C,
mark it as visited and put it onto the stack.
As C does not have any unvisited adjacent node so we keep popping the stack until we find a
node that has an unvisited adjacent node. In this case, there's none and we keep popping until the
stack is empty.
Implementation in C
#include <stdio.h>
#include <stdlib.h>
/* ADJACENCY MATRIX */
int source,V,E,time,visited[20],G[20][20];
void DFS(int i)
{
int j;
visited[i]=1;
printf(" %d->",i+1);
for(j=0;j<V;j++)
{
if(G[i][j]==1&&visited[j]==0)
DFS(j);
}
}
int main()
{
int i,j,v1,v2;
printf("\t\t\tGraphs\n");
printf("Enter the no of edges:");
scanf("%d",&E);
printf("Enter the no of vertices:");
scanf("%d",&V);
for(i=0;i<V;i++)
{
for(j=0;j<V;j++)
G[i][j]=0;
}
/* creating edges :P */
for(i=0;i<E;i++)
{
printf("Enter the edges (format: V1 V2) : ");
scanf("%d%d",&v1,&v2);
G[v1-1][v2-1]=1;
}
for(i=0;i<V;i++)
{
for(j=0;j<V;j++)
printf(" %d ",G[i][j]);
printf("\n");
}
printf("Enter the source: ");
scanf("%d",&source);
DFS(source-1);
return 0;
}
mark s as visited.
while ( Q is not empty)
//Removing that vertex from queue,whose neighbour will be visited
now
v = [Link]( )
Example
3
We then see an unvisited adjacent node from S. In this example,
we have three nodes but alphabetically we choose A, mark it as
visited and enqueue it.
4
Next, the unvisited adjacent node from S is B. We mark it as
visited and enqueue it.
5
Next, the unvisited adjacent node from S is C. We mark it as
visited and enqueue it.
6
Now, S is left with no unvisited adjacent nodes. So, we dequeue
and find A.
7
From A we have D as unvisited adjacent node. We mark it as
visited and enqueue it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we
keep on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program
is over.
Implementation in C
#include<stdio.h>
int G[20][20],q[20],visited[20],n,front = 1, rear = 0 ;
void bfs(int v)
{
int i;
visited[v] = 1;
for(i=1;i<=n;i++)
if(G[v][i] && !visited[i])
q[++rear]=i;
if(front <= rear)
bfs(q[front++]);
}
int main()
{
int v,i,j;
printf("\n Enter the number of vertices:");
scanf("%d",&n);
for(i=1;i<=n;i++)
{
q[i]=0;
visited[i]=0;
}
printf("\n Enter graph data in matrix form:\n");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
scanf("%d",&G[i][j]);
printf("\n Enter the starting vertex:");
scanf("%d",&v);
bfs(v);
printf("\n The nodes which are reachable are:\n");
for(i=1;i<=n;i++)
if(visited[i])
printf("%d\t",i);
else
printf("\n %d is not reachable",i);
return 0;
}
-------------------------------------------------------------------------------------------------------------------
Topological Sorting
Definition
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such
that for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a
graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be more
than one topological sorting for a graph. For example, another topological sorting of the
following graph is “4 5 2 3 1 0”. The first vertex in topological sorting is always a vertex with in-
degree as 0 (a vertex with no in-coming edges).
Topological Sort Example
Consider the following graph “D”. Find the topological ordering of this graph “G”.
int main(){
int i,j,k,n,a[10][10],indeg[10],flag[10],count=0;
for(i=0;i<n;i++){
indeg[i]=0;
flag[i]=0;
}
for(i=0;i<n;i++)
for(j=0;j<n;j++)
indeg[i]=indeg[i]+a[j][i];
while(count<n){
for(k=0;k<n;k++){
if((indeg[k]==0) && (flag[k]==0)){
printf("%d ",(k+1));
flag [k]=1;
}
for(i=0;i<n;i++){
if(a[i][k]==1)
indeg[k]--;
}
}
count++;
}
return 0;
}
Output
Enter the no of vertices:
4
Enter the adjacency matrix:
Enter row 1
0110
Enter row 2
0001
Enter row 3
0001
Enter row 4
0000
The topological order is:1 2 3 4
---------------------------------------------------------------------------------------------------------------------
------------------
Bi-Connectivity
Connectivity
Connectivity in an undirected graph means that every vertex can reach every other vertex via
any path. If the graph is not connected the graph can be broken down into Connected
Components.
Strong Connectivity
Strong Connectivity applies only to directed graphs. A directed graph is strongly connected if
there is a directed path from any vertex to every other vertex. This is same as connectivity in an
undirected graph, the only difference being strong connectivity applies to directed graphs and
there should be directed paths instead of just paths. Similar to connected components, a directed
graph can be broken down into Strongly Connected Components.
If in the above graph, vertex 1 and all the edges associated with it, i.e. the edges 1-0, 1-2 and
1-3 are removed, there will be no path to reach any of the vertices 2, 3 or 4 from the vertices 0
and 5, that means the graph will split into two separate components. One consisting of the
vertices 0 and 5 and another one consisting of the vertices 2, 3 and 4 as shown in the following
figure.
Likewise removing the vertex 0 will disconnect the vertex 5 from all other vertices. Hence the
given graph has two articulation points: 0 and 1.
---------------------------------------------------------------------------------------------------------------------
-----------------------
Euler Circuits
Euler Path
An Euler path is a path that uses every edge of a graph exactly once.
Euler Circuit
An Euler circuit is a circuit that uses every edge of a graph exactly once.
An Euler path starts and ends at different vertices. An Euler circuit starts and ends at the same
vertex.
-------------------------------------------------------------------------------------------------------------------------
----------------------
Applications of Graphs
Graphs are nothing but connected nodes(vertex). So any network related, routing, finding
relation, path etc related real life applications use graphs.
Since they are powerful abstractions, graphs can be very important in modeling data. In fact,
many problems can be reduced to known graph problems. Here we outline just some of the many
applications of graphs.
1. Social network graphs: to tweet or not to tweet. Graphs that represent who knows whom,
who communicates with whom, who influences whom or other relationships in social
structures. An example is the twitter graph of who follows whom. These can be used to
determine how information flows, how topics become hot, how communities develop, or
even who might be a good match for who, or is that whom.
2. Transportation networks. In road networks vertices are intersections and edges are the
road segments between them, and for public transportation networks vertices are stops and
edges are the links between them. Such networks are used by many map programs such as
Google maps, Bing maps and now Apple IOS 6 maps (well perhaps without the public
transport) to find the best routes between locations. They are also used for studying traffic
patterns, traffic light timings, and many aspects of transportation.
3. Utility graphs. The power grid, the Internet, and the water network are all examples of
graphs where vertices represent connection points, and edges the wires or pipes between
them. Analyzing properties of these graphs is very important in understanding the
reliability of such utilities under failure or attack, or in minimizing the costs to build
infrastructure that matches required demands.
4. Document link graphs. The best known example is the link graph of the web, where each
web page is a vertex, and each hyperlink a directed edge. Link graphs are used, for
example, to analyze relevance of web pages, the best sources of information, and good
link sites.
5. Protein-protein interactions graphs. Vertices represent proteins and edges represent
interactions between them that carry out some biological function in the cell. These graphs
can be used, for example, to study molecular pathways—chains of molecular interactions
in a cellular process. Humans have over 120K proteins with millions of interactions
among them.
6. Network packet traffic graphs. Vertices are IP (Internet protocol) addresses and edges are
the packets that flow between them. Such graphs are used for analyzing network security,
studying the spread of worms, and tracking criminal or non-criminal activity.
7. Scene graphs. In graphics and computer games scene graphs represent the logical or
spacial relationships between objects in a scene. Such graphs are very important in the
computer games industry.
8. Finite element meshes. In engineering many simulations of physical systems, such as the
flow of air over a car or airplane wing, the spread of earthquakes through the ground, or
the structural vibrations of a building, involve partitioning space into discrete elements.
The elements along with the connections between adjacent elements forms a graph that is
called a finite element mesh.
9. Robot planning. Vertices represent states the robot can be in and the edges the possible
transitions between the states. This requires approximating continuous motion as a
sequence of discrete steps. Such graph plans are used, for example, in planning paths for
autonomous vehicles.
10. Neural networks. Vertices represent neurons and edges the synapses between them. Neural
networks are used to understand how our brain works and how connections change when
we learn. The human brain has about 1011 neurons and close to 1015 synapses.
11. Graphs in quantum field theory. Vertices represent states of a quantum system and the
edges the transitions between them. The graphs can be used to analyze path integrals and
summing these up generates a quantum amplitude (yes, I have no idea what that means).
12. Semantic networks. Vertices represent words or concepts and edges represent the
relationships among the words or concepts. These have been used in various models of
how humans organize their knowledge, and how machines might simulate such an
organization.
13. Graphs in epidemiology. Vertices represent individuals and directed edges the transfer of
an infectious disease from one individual to another. Analyzing such graphs has become
an important component in understanding and controlling the spread of diseases.
14. Graphs in compilers. Graphs are used extensively in compilers. They can be used for type
inference, for so called data flow analysis, register allocation and many other purposes.
They are also used in specialized compilers, such as query optimization in database
languages.
15. Constraint graphs. Graphs are often used to represent constraints among items. For
example the GSM network for cell phones consists of a collection of overlapping cells.
Any pair of cells that overlap must operate at different frequencies. These constraints can
be modeled as a graph where the cells are vertices and edges are placed between cells that
overlap.
16. Dependence graphs. Graphs can be used to represent dependences or precedences among
items. Such graphs are often used in large projects in laying out what components rely on
other components and used to minimize the total time or cost to completion while abiding
by the dependences.
UNIT V
Sorting algorithms: Insertion sort - Selection sort - Shell sort -Bubble sort - Quick sort - Merge
sort - Radix sort –Searching: Linear search–Binary Search. Hashing: Hash Functions–Separate
Sorting:
Types of Sorting:
Internal Sorting
External Sorting
Internal Sorting:
Internal Sorting takes place in the main memory of the computer. It is applicable when
the number of elements in the list is small.
E.g. Bubble Sort, Inserting Sort, Shell Sort, Quick Sort.
External Sorting:
External Sorting takes place in the Secondary memory of the computer. It is applicable
when the number of elements in the list is large.
E.g. Merge Sort, Multiway Merge Sort.
Insertion sort
Selection sort
Shell sort
Bubble sort
Quick sort
Merge sort
Radix sort
Procedure:
Step 1: The second element of an array is compared with the elements that appear before
it (only first element in this case). If the second element is smaller than first element,
second element is inserted in the position of first element. After first step, first two
elements of an array will be sorted.
Step 2: The third element of an array is compared with the elements that appears before it
(first and second element). If third element is smaller than first element, it is inserted in
the position of first element. If third element is larger than first element but, smaller than
second element, it is inserted in the position of second element. If third element is larger
than both the elements, it is kept in the position as it is. After second step, first three
elements of an array will be sorted.
Step 3: Similarly, the fourth element of an array is compared with the elements that
appear before it (first, second and third element) and the same procedure is applied and
that element is inserted in the proper position. After third step, first four elements of an
array will be sorted. If there are n elements to be sorted. Then, this procedure is repeated
n-1 times to get sorted list of array.
Selection sort algorithm starts by comparing first two elements of an array and swapping
if necessary, i.e., if you want to sort the elements of array in ascending order and if the first
element is greater than second then, you need to swap the elements but, if the first element is
smaller than second, leave the elements as it is. Then, again first element and third element are
compared and swapped if necessary. This process goes on until first and last element of an array
is compared. This completes the first step of selection sort.
If there are n elements to be sorted then, the process mentioned above should be repeated
n-1 times to get required result. But, for better performance, in second step, comparison starts
from second element because after first step, the required number is automatically placed at the
first (i.e., In case of sorting in ascending order, smallest element will be at first and in case of
sorting in descending order, largest element will be at first.). Similarly, in third step, comparison
starts from third element and so on.
A figure is worth 1000 words. This figure below clearly explains the working of selection sort
algorithm.
Shell Sort
Shell sort works by comparing elements that are distant rather than adjacent elements in
an array or list where adjacent elements are compared. Shell sort uses an increment sequence.
The increment size is reduced after each pass until the increment size is 1. With an increment
size of 1, the sort is a basic insertion sort, but by this time the data is guaranteed to be almost
sorted, which is insertion sort's "best case". The distance between comparisons decreases as the
sorting algorithm runs until the last phase in which adjacent elements are compared hence, it is
also known as diminishing increment sort.
Consider a list has nine items. If we use an increment of three, there are three sublist,
each of which can be sorted by an insertion sort. After completing these sorts, we get the list
shown below. Although this list is not completely sorted, something very interesting has
happened. By sorting the sublist, we have moved the items closer to where they actually belong.
A Shell Sort with Increments of Three
Shows a final insertion sort using an increment of one; in other words, a standard
insertion sort. Note that by performing the earlier sublist sorts, we have now reduced the total
number of shifting operations necessary to put the list in its final order. For this case, we need
only four more shifts to complete the process.
ShellSort: A Final Insertion Sort with Increment of 1
We said earlier that the way in which the increments are chosen is the unique feature of the shell
sort. The function shown in ActiveCode 1 uses a different set of increments. In this case, we
begin with n2 sublist. On the next pass, n4 sublist is sorted. Eventually, a single list is sorted with
the basic insertion sort. Above figure shows the first sublist for our example using this
increment.
Bubble sort algorithm starts by comparing the first two elements of an array and
swapping if necessary, i.e., if you want to sort the elements of array in ascending order and if the
first element is greater than second then, you need to swap the elements but, if the first element is
smaller than second, you mustn't swap the element. Then, again second and third elements are
compared and swapped if it is necessary and this process go on until last and second last element
is compared and swapped. This completes the first step of bubble sort.
If there are n elements to be sorted then, the process mentioned above should be repeated
n-1 times to get required result. But, for better performance, in second step, last and second last
elements are not compared because; the proper element is automatically placed at last after first
step. Similarly, in third step, last and second last and second last and third last elements are not
compared and so on.
A figure is worth a thousand words so; acknowledge this figure for better understanding of
bubble sort.
Here, there are 5 elements to be sorted. So, there are 4 steps.
1. Find a ―pivot‖ item in the array. This item is the basis for comparison for a single round.
2. Start a pointer (the left pointer) at the first item in the array.
3. Start a pointer (the right pointer) at the last item in the array.
4. While the value at the left pointer in the array is less than the pivot value, move the left
pointer to the right (add 1). Continue until the value at the left pointer is greater than or
equal to the pivot value.
5. While the value at the right pointer in the array is greater than the pivot value, move the
right pointer to the left (subtract 1). Continue until the value at the right pointer is less
than or equal to the pivot value.
6. If the left pointer is less than or equal to the right pointer, then swap the values at these
locations in the array.
7. Move the left pointer to the right by one and the right pointer to the left by one.
8. If the left pointer and right pointer don’t meet, go to step 1.
The quick sort uses divide and conquer to gain the same advantages as the merge
sort, while not using additional storage.
A quick sort first selects a value, which is called the pivot value. Although there are
many different ways to choose the pivot value, we will simply use the first item in the
list.
The role of the pivot value is to assist with splitting the list. The actual position where
the pivot value belongs in the final sorted list, commonly called the split point, will
be used to divide the list for subsequent calls to the quick sort.
Below figure shows that 54 will serve as our first pivot value. Since we have looked at
this example a few times already, we know that 54 will eventually end up in the position
currently holding 31. The partition process will happen next. It will find the split point and at
the same time move other items to the appropriate side of the list, either less than or greater than
the pivot value.
Partitioning begins by locating two position markers—let’s call them leftmark and
rightmark—at the beginning and end of the remaining items in the list (positions 1 and 8 in). The
goal of the partition process is to move items that are on the wrong side with respect to the pivot
value while also converging on the split point. Below figure shows this process as we locate the
position of 54.
We begin by incrementing leftmark until we locate a value that is greater than the pivot
value. We then decrement rightmark until we find a value that is less than the pivot value. At this
point we have discovered two items that are out of place with respect to the eventual split point.
For our example, this occurs at 93 and 20. Now we can exchange these two items and then repeat
the process again.
Finding the Split Point for 54
At the point where rightmark becomes less than leftmark, we stop. The position of
rightmark is now the split point. The pivot value can be exchanged with the contents of the split
point and the pivot value is now in place. In addition, all the items to the left of the split point are
less than the pivot value, and all the items to the right of the split point are greater than the pivot
value. The list can now be divided at the split point and the quick sort can be invoked recursively
on the two halves.
Completing the Partition Process to Find the Split Point for 54
Merge sort is a recursive algorithm that continually splits a list in half. If the list has more
than one item, we split the list and recursively invoke a merge sort on both halves. Once the two
halves are sorted, the fundamental operation, called a merge, is performed. Merging is the
process of taking two smaller sorted lists and combining them together into a single, sorted, new
list. Below figure shows our familiar example list as it is being split by mergeSort. Below figure
shows the simple lists, now sorted, as they are merged back together.
int center;
}
void merge( int a[ ], int temp[ ], int n, int left, int center, int right )
{
int i = 0, j, left_end = center, center = center+1;
Step2: Consider the LSB (Least Significant Bit) of each number (numbers in the one’s
Step3: place the elements in their respective buckets according to the LSB of each number
Step5: repeat the same process with the digits in the 10’s place (e.g. In 43 MSB =4)
Step6: repeat the same step till all the digits of the given number are consider.
Algorithm for Radix Sort:
Types of searching:-
Linear search
Binary Search
Linear Search
Linear search or sequential search is a method for finding a particular value in a list, that
consists of checking every one of its elements, one at a time and in sequence, until the desired
one is found.
Linear search is the simplest search algorithm. For a list with n items, the best case is
when the value is equal to the first element of the list, in which case only one comparison is
needed. The worst case is when the value is not in the list (or occurs only once at the end of the
list), in which case n comparisons are needed.
The worst case performance scenario for a linear search is that it has to loop through the
entire collection, either because the item is the last one, or because the item is not found.
Working principle:
Now we should define when iterations should stop. First case is when searched element is found.
Second one is when subarray has no elements. In this case, we can conclude that search value is
not present in the array.
Example 1.
Technique that is used to store and retrieve data from data structure.
It reduces the number of comparison.
It involves two concepts-
Hash Function
Hash Table
Hash table
0
1
2
3
.
.
8
9
A hash table is a data structure that is used to store and retrieve data very
quickly.
Insertion of the data in the hash table is based on the key value obtained from the
hash function.
Using same hash key, the data can be retrieved from the hash table by few or
more Hash key comparison.
E.g. For storing student details, the student Id will work as a key.
Hash function:
It is a function, which is used to put the data in the hash table.
Using the same hash function we can retrieve data from the hash table.
Hash function is used to implement hash table.
The integer value returned by the hash function is called hash key.
Bucket:
The hash function (H (key) = key % table size) is used to map several data’s in the Hash
table.
Each position of the Hash Table is called Bucket.
1. Division Method:
It depends on remainder of division.
Divisor is Table Size.
Formula is ( H ( key ) = key % table size )
E.g. consider the following data or record or key (36, 18, 72, 43, 6) table size = 8
2. Mid Square Method:
We first square the item, and then extract some portion of the resulting digits. For
example, if the item were 44, we would first compute 442=1,936. Extract the middle two digit 93
from the answer. Store the key 44 in the index 93.
93 44
99
3306 107
4999
4, Digit Folding Method:
The folding method for constructing hash functions begins by dividing the item into
equal-size pieces (the last piece may not be of equal size). These pieces are then added together
to give the resulting hash key value. For example, if our item was the phone number 436-555-
4601, we would take the digits and divide them into groups of 2 (43, 65, 55, 46, 01). After the
addition, 43+65+55+46+01, we get 210. If we assume our hash table has 11 slots, then we need
to perform the extra step of dividing by 11 and keeping the remainder. In this case 210 % 11 is 1,
so the phone number 436-555-4601 hashes to slot 1.
1 436-555-4601
10
Collision:
If two keys hashes to the same index, the corresponding records cannot be stored in the same
location. So, if it's already occupied, we must find another location to store the new record.
Simple to compute.
Number of Collision should be less while placing record in Hash Table.
Hash function with no collision Perfect hash function.
Hash Function should produce keys which are distributed uniformly in hash table.
Initial configuration of the hash table with separate chaining. Here we use SLL(Singly Linked
List) concept to chain the elements.
0
NULL
1
NULL
2
3 NULL
4 NULL
5 NULL
6 NULL
7 NULL
8 NULL
9 NULL
NULL
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining.
1. H(22) = 22 % 10
=2
0 NULL
1 NULL
2 22 NULL
3 NULL
4 NULL
5 NULL
6 NULL
7 NULL
8 NULL
9
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining
84 % 10 = 4
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining
35 % 10 = 5
Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining
62 % 10 = 2
Advantage
1, more number of elements can be inserted using array of Link List
Disadvantage
1, it requires more pointers, which occupies more memory space.
2, Search takes time. Since it takes time to evaluate Hash Function and also to traverse
the List
Open Addressing
Closed Hashing
Collision resolution technique
When collision occurs, alternative cells are tried until empty cells are found.
Types:-
Linear Probing
Quadratic Probing
Double Hashing
Hash function
H(key) = key % table size.
Insert Operation
To insert a key; Use the hash function to identify the list to which the
element should be inserted.
Then traverse the list to check whether the element is already present.
If exists, increment the count.
Else the new element is placed at the front of the list.
Linear Probing:
Easiest method to handle collision.
Apply the hash function H (key) = key % table size
How to Probing:
first probe – given a key k, hash to H(key)
second probe – if H(key)+f(1) is occupied, try H(key)+f(2)
And so forth.
Probing Properties:
we force f(0)=0
The ith probe is to (H (key) +f (i)) %table size.
If I reach size-1, the probe has failed.
Depending on f (), the probe may fail sooner.
Long sequences of probe are costly.
Probe Sequence is:
H (key) % table size
H (key)+1 % Table size
H (Key)+2 % Table size
……
1. H(Key)=Key mod Tablesize
This is the common formula that you should apply for any hashing
If collocation occurs use Formula 2
2. H(Key)=(H(key)+i) Tablesize
Where i=1, 2, 3, …… etc
Example: - 89 18 49 58 69; Tablesize=10
1. H(89) =89%10
=9
2. H(18) =18%10
=8
3. H(49) =49%10
=9 ((coloids with [Link] try for next free cell using formula 2))
i=1 h1(49) = (H(49)+1)%10
= (9+1)%10
=10%10
=0
4. H(58) =58%10
=8 ((colloids with 18))
i=1 h1(58) = (H(58) +1)%10
= (8+1) %10
=9%10
=9 =>Again collision
i=2 h2(58) =(H(58)+2)%10
=(8+2)%10
=10%10
=0 =>Again collision
EMPTY 89 18 49 58 69
0 49 49 49
1 58 58
2 69
3
8 18 18 18
9 89 89 89 89
Hashing with Quadratic Probe
To resolve the primary clustering problem, quadratic probing can be used. With quadratic
probing, rather than always moving one spot, move i2 spots from the point of collision,
where i is the number of attempts to resolve the collision.
Another collision resolution method which distributes items more evenly.
From the original index H, if the slot is filled, try cells H+12, H+22, H+32,.., H + i2 with
wrap-around.
Limitation: at most half of the table can be used as alternative locations to resolve collisions.
This means that once the table is more than half full, it's difficult to find an empty spot. This
new problem is known as secondary clustering because elements that hash to the same hash
key will always probe the same alternative cells.
Hashing with Double Hashing
Double hashing uses the idea of applying a second hash function to the key when a
collision occurs. The result of the second hash function will be the number of positions forms
the point of collision to insert.
Advantage:
A programmer doesn’t worry about table system.
Simple to implement
Can be used in other data structure as well
Extendible Hashing
Extendible Hashing is a mechanism for altering the size of the hash table to accommodate
new entries when buckets overflow.
Common strategy in internal hashing is to double the hash table and rehash each entry.
However, this technique is slow, because writing all pages to disk is too expensive.
Therefore, instead of doubling the whole hash table, we use a directory of pointers to
buckets, and double the number of buckets by doubling the directory, splitting just the
bucket that overflows.
Since the directory is much smaller than the file, doubling it is much cheaper. Only one
page of keys and pointers is split.
000 100 0 1
010 100
100 000
111 000 000 100 100 000
001 000 010 100 111 000
011 000 001 000 101 000
101 000 011 000 111 001
111 001
001 010
101 100
101 110
00 01 10 11
001 011