Data Structures
Prepared by : Ms Soniya Joseph
Assistant Professor, Sfc
SEARCHING
• Searching is the process of finding some particular element in the list.
• If the element is present in the list, then the process is called
successful, and the process returns the location of that element;
otherwise, the search is called unsuccessful.
• Two popular search methods are Linear Search and Binary Search.
Linear Search
• Linear search is also called as sequential search algorithm.
• It is the simplest searching algorithm.
• In Linear search, we simply traverse the list completely and match
each element of the list with the item whose location is to be found.
• If the match is found, then the location of the item is returned;
otherwise, the search is unsuccessful
Working of Linear Search
Let the element to be searched is K = 41
Now, start from the first element and compare K with each element of the array.
The value of K, i.e., 41, is not matched with the first element of the array. So, move to the next
element. And follow the same process until the respective element is found.
Linear Search complexity
Space Complexity
• The space complexity of linear search is O(1).
// Linear search
#include <stdio.h>
void main()
{
int a[20],item,i,n,pos=-1;
clrscr();
//n=sizeof(a);
printf("Enter the size of an array: ");
scanf("%d",&n);
printf("Enter the array elements:\n");
for(i=0;i<n;i++)
{
scanf("%d",&a[i]);
}
printf("Enter the item to be searched: ");
scanf("%d",&item);
for(i=0;i<n;i++)
{
if(a[i]==item)
{
pos=i+1;
}
}
if(pos==-1)
printf("Unsuccessful search....Item %d is not found in array",item);
else
printf("Item %d is found at location %d",item,pos);
getch();
}
Binary Search
Let the element to search is, K = 56
mid = (beg + end)/2
beg = 0
end = 8
mid = (0 + 8)/2 = 4. So, 4 is the mid of the array
• If a[mid]=item
pos=mid
• Elseif item < a[mid]
k > A[mid] (56> 39) end=mid-1
beg=5 Else
End=8 beg=mid+1
Mid= (5+8)/2 (6)
C p ro g ra m fo rb in a ry s e a rc h
#include <stdio.h>
void main()
{
int a[]={1,2,3,4,7,8,9},n,item,pos;
n=7;
item=7;
clrscr();
pos=bsearch(a,n,item);
printf("BINARY SEARCH\n");
if(pos==-1)
printf("Not found\n");
else
printf("Found at %d",pos);
getch();
int bsearch(int a[],int n,int item)
{
int b,e,m,pos=-1; if(a[m]==item)
{
b=0; return m+1;
e=n-1;
}
while(b<e) else if(item<a[m])
{ e=m-1;
else
b=m+1;
m=(b+e)/2; }
//if (a[l]==item)
//return l;
//else
return -1;
}
The first occurrence of an element in the list – binary
search
#include <stdio.h>
void main()
{
int a[]={1,2,3,4,7,7,7,9},n,item,pos;
n=8;
item=7;
clrscr();
pos=bsearch(a,n,item);
printf("BINARY SEARCH\n");
if(pos==-1)
printf("Not found\n");
else
printf("First occurance of item 7 is found at location %d",pos+1);
getch();
}
int bsearch(int a[],int n,int item)
{
int b,e,m,pos=-1;
b=0;
e=n-1;
while(b<=e)
{
m=(b+e)/2;
if(a[m]==item)
{
pos=m;
e=m-1;
}
else if(item<a[m])
e=m-1;
else
b=m+1;
}
return pos;
Given {4,7,3,2,1,7,9,0} find the location of 7 using Linear and Binary search
and also display its first occurrence.
#include <stdio.h>
void main()
{
int a[]={4,7,3,2,1,7,9,0},pos,n,item,c;
n=8;
item=7;
clrscr();
while(1)
{
printf("\n\n1. Linear Search: \n");
printf("2. Binary Searcg: \n");
printf("3. Exit\n");
printf("Enter your choice: \n");
scanf("%d",&c);
switch(c)
{
case 1:
printf("\n--------Linear Search---------\n");
pos=lsearch(a,n,item);
if (pos==-1)
printf("\nUnsuccessful search.....Item 7 is not found
in the given array\n");
else
printf("\nItem 7 is found at %d position\n",pos+1);
getch();
break;
case 2:
printf("\n----------Binary Search---------\n");
pos=bsearch(a,n,item);
if (pos==-1)
printf("Unsuccessful search.....Item 7 is not found in the
given array\n");
else
printf("\n\nItem 7 is found at %d position\n",pos+1);
getch();
break;
case 3:
exit(0);
}
}
}
int lsearch(int a[],int n, int item)
{
int i;
for(i=0;i<n;i++)
{
if (a[i]==item)
{
return i;
}
}
return -1;
}
int bsearch(int a[],int n,int item)
{
int pass,i,t,beg=0,end=n-1,mid, pos-1;
for(pass=1;pass<n;pass++)
{
for(i=0;i<n-pass;i++)
{
if(a[i]>a[i+1])
{
t=a[i];
a[i]=a[i+1];
a[i+1]=t;
}
}}
printf("The sorted array is: \n");
for(i=0;i<n;i++)
{
printf("%d ",a[i]);
}
while(beg<=end)
{
mid=(big+end)/2;
if(a[mid]==item)
{
pos=mid;
end=mid-1;
}else if(item<a[mid])
end=mid-1;
else
beg=mid+1;
}return pos;}
Bubble Sort
• Bubble Sort is the simplest sorting algorithm that works by repeatedly
swapping the adjacent elements if they are in the wrong order.
• This algorithm is not suitable for large data sets as its average and
worst-case time complexity is quite high.
Bubble Sort Algorithm
In this algorithm,
• traverse from left and compare adjacent elements and the higher one
is placed at right side.
• In this way, the largest element is moved to the rightmost end at first.
• This process is then continued to find the second largest and place it
and so on until the data is sorted.
How does Bubble Sort Work?
• Let us understand the
working of bubble sort
with the help of the
following illustration:
Input: arr[] = {6, 3, 0, 5}
Complexity Analysis of Bubble
Sort:
• Time Complexity: O(N2)
• Auxiliary Space: O(1)
Advantages of Bubble Sort: Disadvantages of Bubble Sort:
• Bubble sort is easy to • Bubble sort has a time
understand and implement. complexity of O(N2) which
• It does not require any makes it very slow for large data
additional memory space. sets.
• It is a stable sorting algorithm, • Bubble sort is a comparison-
meaning that elements with the based sorting algorithm, which
same key value maintain their means that it requires a
relative order in the sorted comparison operator to
output. determine the relative order of
elements in the input data set.
• It can limit the efficiency of the
algorithm in certain cases.
Selection Sort
• Selection sort is a simple and efficient sorting algorithm that works by
repeatedly selecting the smallest (or largest) element from the
unsorted portion of the list and moving it to the sorted portion of the
list.
• The algorithm repeatedly selects the smallest (or largest) element
from the unsorted portion of the list and swaps it with the first
element of the unsorted part.
• This process is repeated for the remaining unsorted portion until the
entire list is sorted.
How does Selection Sort Algorithm
work?
• Lets consider the following array as an example: arr[] = {64, 25, 12, 22, 11}
First Pass
Algorithm of the Selection Sort
Algorithm
Step 1: Set Min to location 0 in Step 1.
Step 2: Look for the smallest element on the list.
Step 3: Replace the value at location Min with a different value.
Step 4: Increase Min to point to the next element
Step 5: Continue until the list is sorted.
Algorithm
• Algorithm Selection Sort (A[0...... n-1])
• A is an array of n elements to be sorted.
Step 1- set LOC=0;
Step 2-Repeat step 3 and 4 for K= 0 to n-1
Step 3-LOC= CALL MIN (A, K, n)
Step 4-[Interchange A[K] and A [LOC]]
temp = A[K]
A[K] = A[LOC]
A[LOC]= temp
Step 5-Exit
Algorithm
• MIN (A[0...... n-1], K, n)
• A is an array of n elements and K is the number of passes.
Step 1- set min =A[k],loc=k
Step 2-Repeat step 3 for j=k+1 to n-1
Step 3- if min>A[j]
min=A[j]
LOC=j
endif
Step 4- return LOC;
Complexity Analysis of Selection
Sort
• Time Complexity: The time complexity of Selection Sort is O(N2) as
there are two nested loops:
• One loop to select an element of Array one by one = O(N)
• Another loop to compare that element with every other Array
element = O(N)
• Therefore overall complexity = O(N) * O(N) = O(N*N) = O(N2)
• Auxiliary Space: O(1) as the only extra memory used is for temporary
variables while swapping two values in Array. The selection sort never
makes more than O(N) swaps and can be useful when memory
writing is costly.
Advantages of Selection Sort Algorithm
• Simple and easy to understand.
• Works well with small datasets.
Disadvantages of the Selection Sort Algorithm
• Selection sort has a time complexity of O(n^2) in the worst and
average case.
• Does not work well on large datasets.
• Does not preserve the relative order of items with equal keys which
means it is not stable.
Insertion Sort
• Insertion sort is a simple sorting algorithm that works similar to the
way you sort playing cards in your hands.
• The array is virtually split into a sorted and an unsorted part.
• Values from the unsorted part are picked and placed at the correct
position in the sorted part.
Insertion Sort Algorithm
• To sort an array of size N in ascending order iterate over the array and
compare the current element (key) to its predecessor, if the key
element is smaller than its predecessor, compare it to the elements
before.
• Move the greater elements one position up to make space for the
swapped element.
Working of Insertion Sort algorithm
• Consider an example: arr[]: {12, 11, 13, 5, 6}
First Pass:
Algorithm
• The simple steps of achieving the insertion sort are listed as follows -
• Step 1 - If the element is the first element, assume that it is already
sorted. Return 1.
• Step2 - Pick the next element, and store it separately in a key.
• Step3 - Now, compare the key with all elements in the sorted array.
• Step 4 - If the element in the sorted array is smaller than the current
element, then move to the next element. Else, shift greater elements
in the array towards the right.
• Step 5 - Insert the value.
• Step 6 - Repeat until the array is sorted
Algorithm
• Insertion sort(A[0.....n-1],n)
• A is an array of n elements.
Step 1:Repeat step 2 to 5 for pass=1 to n-1
Step 2: Set K=A[pass]
Step 3:Repeat step 4 for j=pass-1 to 0
Step 4: if (k<A[j])
A[j+1]=A[j]
Step 5 : A[j+1]=k
Step 6: Exit
Complexity Analysis of Insertion
Sort:
• Time Complexity: O(N^2)
• Auxiliary Space: O(1)
Time Complexity of Insertion Sort
• The worst-case time complexity of the Insertion sort is O(N^2)
• The average case time complexity of the Insertion sort is O(N^2)
• The time complexity of the best case is O(N).
Space Complexity of Insertion Sort
• The auxiliary space complexity of Insertion Sort is O(1)
Characteristics of Insertion Sort
• This algorithm is one of the simplest algorithms with a simple
implementation
• Basically, Insertion sort is efficient for small data values
• Insertion sort is adaptive in nature, i.e. it is appropriate for data sets
that are already partially sorted.
Merge Sort
• Merge sort is defined as a sorting algorithm that works by dividing an
array into smaller subarrays, sorting each subarray, and then merging
the sorted subarrays back together to form the final sorted array.
• In simple terms, we can say that the process of merge sort is to divide
the array into two halves, sort each half, and then merge the sorted
halves back together.
• This process is repeated until the entire array is sorted.
How does Merge Sort work?
• Merge sort is a recursive algorithm that continuously splits the array
in half until it cannot be further divided i.e., the array has only one
element left (an array with one element is always sorted).
• Then the sorted subarrays are merged into one sorted array.
Merge-Sort(A,low,high)
• Given an array A of (high-low+1) elements. This algorithm sorts the
elements in ascending order. The variables low and high are used to
identify the positions of first and last elements in each partition.
Step 1: If (low<high)
Step 2: mid=(low+high)/2
Step 3: call Merge-sort (A,low,mid0
Step 4:call Merge-sort(A,mid+1,high)
Step 5:call Merge(A,low,mid,high)
Step 6: End if
Step 7: Exit
Complexity Analysis of Merge Sort
• Time Complexity: O(N log(N)), Merge Sort is a recursive algorithm and time complexity can
be expressed as following recurrence relation.
• T(n) = 2T(n/2) + θ(n)
• The above recurrence can be solved either using the Recurrence Tree method or the
Master method. It falls in case II of the Master Method and the solution of the recurrence
is θ(Nlog(N)). The time complexity of Merge Sort isθ(Nlog(N)) in all 3 cases (worst, average,
and best) as merge sort always divides the array into two halves and takes linear time to
merge two halves.
• Auxiliary Space: O(N), In merge sort all elements are copied into an auxiliary array. So N
auxiliary space is required for merge sort.
Applications of Merge Sort:
• Sorting large datasets: Merge sort is particularly well-suited for
sorting large datasets due to its guaranteed worst-case time
complexity of O(n log n).
• External sorting: Merge sort is commonly used in external sorting,
where the data to be sorted is too large to fit into memory.
• Custom sorting: Merge sort can be adapted to handle different input
distributions, such as partially sorted, nearly sorted, or completely
unsorted data.
• Inversion Count Problem
Advantages of Merge Sort
• Stability: Merge sort is a stable sorting algorithm, which means it
maintains the relative order of equal elements in the input array.
• Guaranteed worst-case performance: Merge sort has a worst-case
time complexity of O(N logN), which means it performs well even on
large datasets.
• Parallelizable: Merge sort is a naturally parallelizable algorithm, which
means it can be easily parallelized to take advantage of multiple
processors or threads.
Drawbacks of Merge Sort
• Space complexity: Merge sort requires additional memory to store
the merged sub-arrays during the sorting process.
• Not in-place: Merge sort is not an in-place sorting algorithm, which
means it requires additional memory to store the sorted data. This
can be a disadvantage in applications where memory usage is a
concern.
• Not always optimal for small datasets: For small datasets, Merge sort
has a higher time complexity than some other sorting algorithms,
such as insertion sort. This can result in slower performance for very
small datasets.
Merge(A,low,mid,high)
• Given an array A of (high-low+1) elements. THe variables low,mid and
high are used to identify the positions of elements in each partitions.
• Step 1: Set i=low,j=mid+1,k=low
• Step 2: while((i<=mid)and(j<=high)) do step 3
• Step 3: if(A[i]<A[j])
c[k]=A[i]
k=k+1
i=i+1
else
c[k]=A[j]
k=k+1
j=j+1
End if
End while(step 2)
Step 4: while(i<=mid)
c[k]=A[i]
k=k+1
i=i+1
End while(step 4)
• Step 5: while(j<=high)
c[k]=A[j]
k=k+1
j-j+1
End while(step 5)
Step 6: for i=low to k-1
A[i]=c[i]
End for (step 6)
Return
Merge Sort for Linked Lists
• Merge sort is often preferred
for sorting a linked list.
• The slow random-access
performance of a linked list
makes some other algorithms
(such as quicksort) perform
poorly, and others (such as
heapsort) completely
impossible.
• Let the head be the first node of the linked list to be sorted and
headRef be the pointer to head.
• Note that we need a reference to head in MergeSort() as the below
implementation changes next links to sort the linked lists (not data at
the nodes), so the head node has to be changed if the data at the
original head is not the smallest value in the linked list.
Complexity Analysis
• Time Complexity: O(n*log n)
• Auxiliary Space: O(log n)
Approach 2: This approach is
simpler and uses log n space.
• mergeSort()
• If the size of the linked list is 1 then return the head
• Find mid using The Tortoise and The Hare Approach
• Store the next of mid in head2 i.e. the right sub-linked list.
• Now Make the next midpoint null.
• Recursively call mergeSort() on both left and right sub-linked list and
store the new head of the left and right linked list.
• Call merge() given the arguments new heads of left and right sub-
linked lists and store the final head returned after merging.
• Return the final head of the merged linkedlist.
• merge(head1, head2)
• Take a pointer say merged to store the merged list in it and store a
dummy node in it.
• Take a pointer temp and assign merge to it.
• If the data of head1 is less than the data of head2, then, store head1
in next of temp & move head1 to the next of head1.
• Else store head2 in next of temp & move head2 to the next of head2.
• Move temp to the next of temp.
• Repeat steps 3, 4 & 5 until head1 is not equal to null and head2 is not
equal to null.
• Now add any remaining nodes of the first or the second linked list to
the merged linked list.
• Return the next of merged(that will ignore the dummy and return the
head of the final merged linked list)
Quick sort
• QuickSort is a sorting algorithm based on the Divide and Conquer
algorithm that picks an element as a pivot and partitions the given
array around the picked pivot by placing the pivot in its correct
position in the sorted array.
How does QuickSort work?
• The key process in quickSort is a partition().
• The target of partitions is to place the pivot (any element can be
chosen to be a pivot) at its correct position in the sorted array and put
all smaller elements to the left of the pivot, and all greater elements
to the right of the pivot.
• Partition is done recursively on each side of the pivot after the pivot is
placed in its correct position and this finally sorts the array.
Choice of Pivot:
• There are Always pick the first element as a pivot.
• Always pick the last element as a pivot (implemented below)
• Pick a random element as a pivot.
• Pick the middle as the [Link] different choices for picking pivots.
Partition Algorithm:
• The logic is simple, we start from the leftmost element and keep track
of the index of smaller (or equal) elements as i. While traversing, if we
find a smaller element, we swap the current element with arr[i].
Otherwise, we ignore the current element.
• Let us understand the working of partition and the Quick Sort
algorithm with the help of the following example:
• Consider: arr[] = {10, 80, 30, 90, 40}.
• Compare 10 with the pivot and as it is less than pivot arrange it
accrodingly.
Illustration of Quicksort
• As the partition process is done recursively, it keeps on putting the
pivot in its actual position in the sorted array.
• Repeatedly putting pivots in their actual position makes the array
sorted.
• Follow the below images to understand how the recursive
implementation of the partition algorithm helps to sort the array.
Complexity Analysis of Quick Sort:
Time Complexity:
• Best Case: Ω (N log (N))
• The best-case scenario for quicksort occur when the pivot chosen at the each step divides
the array into roughly equal halves.
• In this case, the algorithm will make balanced partitions, leading to efficient Sorting.
• Average Case: θ ( N log (N))
• Quicksort’s average-case performance is usually very good in practice, making it one of the
fastest sorting Algorithm.
• Worst Case: O(N2)
• The worst-case Scenario for Quicksort occur when the pivot at each step consistently
results in highly unbalanced partitions. When the array is already sorted and the pivot is
always chosen as the smallest or largest element. To mitigate the worst-case Scenario,
various techniques are used such as choosing a good pivot (e.g., median of three) and using
Randomized algorithm (Randomized Quicksort ) to shuffle the element before sorting.
• Auxiliary Space: O(1), if we don’t consider the recursive stack space. If we consider the
recursive stack space then, in the worst case quicksort could make O(N).
Advantages of Quick Sort:
• It is a divide-and-conquer algorithm that makes it easier to solve
problems.
• It is efficient on large data sets.
• It has a low overhead, as it only requires a small amount of memory to
function.
Disadvantages of Quick Sort:
• It has a worst-case time complexity of O(N2), which occurs when the
pivot is chosen poorly.
• It is not a good choice for small data sets.
• It is not a stable sort, meaning that if two elements have the same key,
their relative order will not be preserved in the sorted output in case of
quick sort, because here we are swapping elements according to the
pivot’s position (without considering their original positions).
Shell Sort
• It is also known as diminshing increment sort.
• developed by Donald L shell
• It improves the efficiency of simple insertion sort by quickly shifting
the elements to their proper position.
• It is an improved version of insertion sort.
• In shell sort, we compare the items at particular
distance(increments)and then we so them.
• Elementsa are sorted in multiple passes and in each passes data are
taken with smaller and smaller gap sizes.
• It provide good performance.
• In this method,the given array is divided into subarrays based on the
increment value.
• Suppose the increment value is 5,then the elements in the array at a
distance(increment) 5 are sorted first.
• Let us take an array from A[0......n-1] and increment 5 for grouping
together the items as shown below.
subarray 1:A[0], A[5],A[10] and so on.
subarray 2:A[1], A[6],A[11] and so on.
subaaray 3:A[2], A[7], A[12] and so on.
Time and space complexity of shell
sort
• Time Complexity- O(n2)
• Space Complexity-O(1)
Introduction to Hashing in Data
Structure:
• Hashing is a popular technique in computer science that involves
mapping large data sets to fixed-length values.
• It is a process of converting a data set of variable size into a data set
of a fixed size.
• The ability to perform efficient lookup operations makes hashing an
essential concept in data structures.
What is Hashing?
• A hashing algorithm is used to convert an input (such as a string or
integer) into a fixed-size output (referred to as a hash code or hash
value).
• The data is then stored and retrieved using this hash value as an index
in an array or hash table.
• The hash function must be deterministic, which guarantees that it will
always yield the same result for a given input.
• Hashing is commonly used to create a unique identifier for a piece of
data, which can be used to quickly look up that data in a large dataset.
• For example, a web browser may use hashing to store website
passwords securely. When a user enters their password, the browser
converts it into a hash value and compares it to the stored hash value to
authenticate the user.
What is a hash Key?
• In the context of hashing, a hash key (also known as a hash value or
hash code) is a fixed-size numerical or alphanumeric representation
generated by a hashing algorithm.
• It is derived from the input data, such as a text string or a file,
through a process known as hashing.
How Hashing Works?
• The process of hashing can be broken down into three steps:
• Input: The data to be hashed is input into the hashing algorithm.
• Hash Function: The hashing algorithm takes the input data and
applies a mathematical function to generate a fixed-size hash value.
The hash function should be designed so that different input values
produce different hash values, and small changes in the input produce
large changes in the output.
• Output: The hash value is returned, which is used as an index to store
or retrieve data in a data structure.
Hash table
• A hash table, also known as a hash map, is a data structure that
implements an associative array, also called a dictionary, which is an
abstract data type that maps keys to values.
• A hash table uses a hash function to compute an index, also called a
hash code, into an array of buckets or slots, from which the desired
value can be found.
• During lookup, the key is hashed and the resulting hash indicates
where the corresponding value is stored.
• There are several operations that can be performed on a hash table,
including:
• Insertion: Inserting a new key-value pair into the hash table.
• Deletion: Removing a key-value pair from the hash table.
• Search: Searching for a key-value pair in the hash table.
Hash Function
• A Hash Function is a function that converts a given numeric or
alphanumeric key to a small practical integer value. The mapped
integer value is used as an index in the hash table.
• In simple terms, a hash function maps a significant number or string
to a small integer that can be used as the index in the hash table.
• The pair is of the form (key, value), where for a given key, one can find
a value using some kind of a “function” that maps keys to values.
• The key for a given object can be calculated using a function called a
hash function.
• For example, given an array A, if i is the key, then we can find the
value by simply looking up A[i].
Hash function can be denoted by ‘H’
Let k be the key and h(k) is a hash function then, h(k) will give us a new
index to store the elements linked with k.
• Hashing is a technique to
convert a range of key values
into a range of indexes of an
array. We're going to use modulo
operator to get a range of key
values.
• Consider an example of hash
table of size 20, and the
following items are to be stored.
Item are in the (key,value)
format.
Characteristics of a good hash
function
• Easy to compute
• Uniform distribution
• Less Collision
• High load factor
load factor=no of elements in hash table/hash table size
Hash Collision
• A hash collision or hash clash is
when two pieces of data in a
hash table share the same hash
value.
• The hash value in this case is
derived from a hash function
which takes a data input and
returns a fixed length of bits.
Different techniques for choosing
hash function
• Division method
• Midsquare method
• Folding method
• Mixed method
Mixed Method
• We can mix up other methods with division methods in order to
ensure that addresses are in the range of hash table.
Eg: key =27862123
folding method-H(27862123)=27+8621+23=8671
table size=97
division method-H(8671)=8671%97=38
Collision Resolution Techniques
There are two types of collision resolution techniques.
• Separate chaining (open hashing)
• Open addressing (closed hashing)
Separate chaining:
• This method involves making a linked list out of the slot where the
collision happened, then adding the new key to the list.
• Separate chaining is the term used to describe how this connected list
of slots resembles a chain.
• It is more frequently utilized when we are unsure of the number of
keys to add or remove.
Advantages and Disadvantages
Advantages of separate chaining
• It is easy to implement.
• The hash table never fills full, so we can add more elements to the chain.
• It is less sensitive to the function of the hashing.
Disadvantages of separate chaining
• In this, the cache performance of chaining is not good.
• Memory wastage is too much in this method.
• It requires more space for element links.
Time complexity
• Its worst-case complexity for searching is o(n).
• Its worst-case complexity for deletion is o(n).
• Let us see the following example
to get better idea. If we have
some elements like {15, 47, 23,
34, 85, 97, 65, 89, 70}. And our
hash function is h(x) = x mod 7.
• The hash values will be
• The linked list data structure is used to implement this technique. So
what happens is, when multiple elements are hashed into the same
slot index, then these elements are inserted into a singly-linked list
which is known as a chain.
Operations on a Chained hash table
Insertion in a chained hash table
• It is quite simple process,first we get the hash value from the hash
function which will map to the hash table.
• After mapping the element is inserted in the linked list.
• running time complexity for insertion is O(1)
Deletion from a chained hash function
• It is same as we did in the singly linked list.
• First we will perform a search operation and then the delete
operation as in the case of the singly linked list is performed
• running time complexity for deletion is O(m) where m is the number
of elements in the linked list at that location.
Searching in a chained hash table
• It is very simple
• First we will get the hash value of the key by the hash function in the
hash table.
• Then will search for the element in the linked list
• running time complexity is O(m) where m is the number of element
present in the linked list at that location.
Open addressing:
• To prevent collisions in the hashing table open addressing is employed as a
collision-resolution technique.
• No key is kept anywhere else besides the hash table.
• As a result, the hash table’s size is never equal to or less than the number of
keys.
• Additionally known as closed hashing.
The following techniques are used in open addressing:
• Linear probing
• Quadratic probing
• Double hashing
• Rehashing
Linear probing:
• This involves doing a linear probe for the following slot when a collision
occurs and continuing to do so until an empty slot is discovered.
• The worst time to search for an element in linear probing is O.
• The cache performs best with linear probing, but clustering is a
concern.
• This method’s key benefit is that it is simple to calculate.
Disadvantages of linear probing:
• The main problem is clustering.
• It takes too much time to find an empty slot.
Linear Probing Example
• Insert the following sequence of keys in the hash table
{9, 7, 11, 13, 12, 8}
• Use linear probing technique for collision resolution
h(k, i) = [h(k) + i] mod m
(k) = 2k + 5
m=10
Step 3
Quadratic probing
• When a collision happens in this, we probe for the i2-nd slot in the ith
iteration, continuing to do so until an empty slot is discovered.
• In comparison to linear probing, quadratic probing has a worse cache
performance. Additionally, clustering is less of a concern with
quadratic probing.
How Quadratic Probing is done?
• Let hash(x) be the slot index computed using the hash function.
• If the slot hash(x) % S is full, then we try (hash(x) + 1*1) % S.
• If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S.
• If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S.
• This process is repeated for all the values of i until an empty slot is
found
• example: Let us consider a
simple hash function as “key
mod 7” and sequence of keys as
50, 700, 76, 85, 92, 73, 101
Double hashing
• In this, you employ a different hashing algorithm, and in the ith
iteration, you look for (i * hash 2(x)).
• The determination of two hash functions requires more time.
• Although there is no clustering issue, the performance of the cache is
relatively poor when using double probing.
• First hash function is typically hash1(key) = key % TABLE_SIZE
• A popular second hash function is hash2(key) = PRIME – (key %
PRIME) where PRIME is a prime smaller than the TABLE_SIZE.
A good second Hash function is:
• It must never evaluate to zero
• Just make sure that all cells can be probed
Rehashing
• Rehashing is a collision resolution technique which tries to find
another empty spot for the value that causes collision.
• It is a process through which of the table is doubled and the value
that causes collision can find the spot in that new table.
• The difference between double hashing and rehashing are:
• In double hashing, two different hash functions are applied at the
same time and in rehashing same function is applied again and again
to generate a unique mapping value.
• create new table which is double size of previous hash table.
• we will scan elements of the previous table one by one and calculate
the hash key with the new hash function and insert them into the
new hash [Link] technique is called rehashing.
• It ensures insertion of elements in the hash table.
Example- Refer text page no-11.11 and 11.12
THANK YOU