0% found this document useful (0 votes)

46 views75 pages

Understanding Hashing in Data Structures

detailed notes on hash tables

Uploaded by

lkithlik

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

46 views75 pages

Understanding Hashing in Data Structures

detailed notes on hash tables

Uploaded by

lkithlik

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Hashing:

Hashing in Data Structure-

In data structures,
• Hashing is a well-known technique to search any
particular element among several elements.
• It minimizes the number of comparisons while
performing the search.

Advantage-
• Unlike other searching techniques,
• Hashing is extremely efficient.
• The time taken by it to perform the search does not
depend upon the total number of elements.
• It completes the search with constant time complexity
O(1).
2
Hashing Mechanism-
In hashing,
• An array data structure called as Hash table is used to
store the data items.
• Based on the hash key value, data items are inserted into
the hash table.

Hash Key Value-

• Hash key value is a special value that serves as an index
for a data item.
• It indicates where the data item should be be stored in the
hash table.
• Hash key value is generated using a hash function.

3
4
Hash Function-
• Hash function is a function that maps any big number or
string to a small integer value.
• Hash function takes the data item as an input and returns
a small integer value as an output.
• The small integer value is called as a hash value.
• Hash value of the data item is then used as an index for
storing it into the hash table.

Types of Hash Functions-

• There are various types of hash functions available such
as-
• Mid Square Hash Function
• Division Hash Function
• Folding Hash Function etc

• It depends on the user which hash function he wants to 5

use.
The properties of a good hash function are-
• It is efficiently computable.
• It minimizes the number of collisions.
• It distributes the keys uniformly over the table.

6
Hashing: Collision Resolution Schemes
• Collision Resolution Techniques

• Separate Chaining
– Separate Chaining with String Keys
– The class hierarchy of Hash Tables
– Implementation of Separate Chaining

• Introduction to Collision Resolution using Open

Addressing
– Linear Probing
– Quadratic Probing
– Double Hashing
– Rehashing

• Algorithms for insertion, searching, and deletion in Open

Addressing
• Separate Chaining versus Open-addressing
7
Collision Resolution Techniques

• There are two broad ways of collision resolution:

1. Separate Chaining:: An array of linked list implementation.

2. Open Addressing: Array-based implementation.

(i) Linear probing (linear search)

(ii) Quadratic probing (nonlinear search)
(iii) Double hashing (uses two hash functions)

8
Collision Resolution Techniques

9
Collision Resolution Techniques
Which is the Preferred Technique?

The performance of both the techniques depend on the kind

of operations that are required to be performed on the keys
stored in the hash table-

10
Collision Resolution Techniques

11
12
13
14
15
16
17
18
Separate Chaining
• The hash table is implemented as an array of linked lists.

• Inserting an item, r, that hashes at index i is simply insertion into the linked list at
position i.

• Synonyms are chained in the same linked list.

19
Separate Chaining (cont’d)
• Retrieval of an item, r, with hash address, i, is simply retrieval from the linked list
at position i.
• Deletion of an item, r, with hash address, i, is simply deleting r from the linked list
at position i.
• Example: Load the keys 23, 13, 21, 14, 7, 8, and 15 , in this order, in a hash table
of size 7 using separate chaining with the hash function: h(key) = key % 7
h(23) = 23 % 7 = 2
h(13) = 13 % 7 = 6
h(21) = 21 % 7 = 0
h(14) = 14 % 7 = 0 collision
h(7) = 7 % 7 = 0 collision
h(8) = 8 % 7 = 1
h(15) = 15 % 7 = 1 collision

20
Separate Chaining with String Keys
• Recall that search keys can be numbers, strings or some other object.
• A hash function for a string s = c0c1c2…cn-1 can be defined as:
hash = (c0 + c1 + c2 + … + cn-1) % tableSize
this can be implemented as:
public static int hash(String key, int tableSize){
int hashValue = 0;
for (int i = 0; i < [Link](); i++){
hashValue += [Link](i);
}
return hashValue % tableSize;
}

• Example: The following class describes commodity items:

class CommodityItem {
String name; // commodity name
int quantity; // commodity quantity needed
double price; // commodity price
}
21
Separate Chaining with String Keys (cont’d)
• Use the hash function hash to load the following commodity items into a
hash table of size 13 using separate chaining:
onion 1 10.0
tomato 1 8.50
cabbage 3 3.50
carrot 1 5.50
okra 1 6.50
mellon 2 10.0
potato 2 7.50
Banana 3 4.00
olive 2 15.0
salt 2 2.50
cucumber 3 4.50
mushroom 3 5.50
orange 2 3.00
• Solution:

hash(onion) = (111 + 110 + 105 + 111 + 110) % 13 = 547 % 13 = 1

hash(salt) = (115 + 97 + 108 + 116) % 13 = 436 % 13 = 7
hash(orange) = (111 + 114 + 97 + 110 + 103 + 101)%13 = 636 %13 = 12 22
Separate Chaining with String Keys (cont’d)
0 okra potato

1 onion carrot
2 Item Qty Price h(key)
onion 1 10.0 1
3
tomato 1 8.50 10
4 cabbage 3 3.50 4
cabbage carrot 1 5.50 1
5 okra 1 6.50 0
mellon 2 10.0 10
6 mushroom potato 2 7.50 0
Banana 3 4.0 11
7 olive 2 15.0 10
salt salt 2 2.50 7
8 cucumber 3 4.50 9
mushroom 3 5.50 6
9 cucumber orange 2 3.00 12

10 tomato mellon olive

11 banana
12 orange 23
Separate Chaining with String Keys (cont’d)
• Alternative hash functions for a string
s = c0c1c2…cn-1
exist, some are:
• hash = (c0 + 27 * c1 + 729 * c2) % tableSize
• hash = (c0 + cn-1 + [Link]()) % tableSize
s .length ()−1
• hash = [  26 * k + [Link](k)−' ']%tableSize
k =0

24
Implementing Hash Tables: The Hierarchy Tree

Container AbstractContainer

SearchableContainer

HashTable AbstractHashTable

ChainedHashTable

OpenScatterTable
25
Implementation of Separate Chaining
public class ChainedHashTable extends AbstractHashTable {
protected MyLinkedList [ ] array;
public ChainedHashTable(int size) {
array = new MyLinkedList[size];
for(int j = 0; j < size; j++)
array[j] = new MyLinkedList( );
}
public void insert(Object key) {
array[h(key)].append(key); count++;
}
public void withdraw(Object key) {
array[h(key)].extract(key); count--;
}
public Object find(Object key){
int index = h(key);
[Link] e = array[index].getHead( );
while(e != null){
if([Link]([Link]()) return [Link]();
e = [Link]();
}
return null;
}
26
}
Introduction to Open Addressing
• All items are stored in the hash table itself.
• In addition to the cell data (if any), each cell keeps one of the three states: EMPTY,
OCCUPIED, DELETED.
• While inserting, if a collision occurs, alternative cells are tried until an empty cell is found.

• Deletion: (lazy deletion): When a key is deleted the slot is marked as DELETED rather than
EMPTY otherwise subsequent searches that hash at the deleted cell will fail.

• Probe sequence: A probe sequence is the sequence of array indexes that is followed in
searching for an empty cell during an insertion, or in searching for a key during find or delete
operations.

• The most common probe sequences are of the form:

hi(key) = [h(key) + c(i)] % n, for i = 0, 1, …, n-1.
where h is a hash function and n is the size of the hash table
• The function c(i) is required to have the following two properties:
Property 1: c(0) = 0
Property 2: The set of values {c(0) % n, c(1) % n, c(2) % n, . . . , c(n-1) % n} must be a
permutation of {0, 1, 2,. . ., n – 1}, that is, it must contain every integer between 0 and n - 1
inclusive.

27
Introduction to Open Addressing

28
Introduction to Open Addressing

29
Introduction to Open Addressing

30
Introduction to Open Addressing

31
Introduction to Open Addressing (cont’d)
• The function c(i) is used to resolve collisions.

• To insert item r, we examine array location h0(r) = h(r). If there is a collision, array locations
h1(r), h2(r), ..., hn-1(r) are examined until an empty slot is found.

• Similarly, to find item r, we examine the same sequence of locations in the same order.

• Note: For a given hash function h(key), the only difference in the open addressing collision
resolution techniques (linear probing, quadratic probing and double hashing) is in the
definition of the function c(i).
• Common definitions of c(i) are:

Collision resolution technique c(i)

Linear probing i
Quadratic probing ±i2
Double hashing i*hp(key)

where hp(key) is another hash function. 32

Introduction to Open Addressing (cont'd)
• Advantages of Open addressing:
– All items are stored in the hash table itself. There is no need for
another data structure.
– Open addressing is more efficient storage-wise.

• Disadvantages of Open Addressing:

– The keys of the objects to be hashed must be distinct.
– Dependent on choosing a proper table size.
– Requires the use of a three-state (Occupied, Empty, or Deleted)
flag in each cell.

33
Open Addressing Facts
• In general, primes give the best table sizes.

• With any open addressing method of collision resolution,

as the table fills, there can be a severe degradation in the table performance.

• Load factors between 0.6 and 0.7 are common.

• Load factors > 0.7 are undesirable.

• The search time depends only on the load factor, not on the table size.

• We can use the desired load factor to determine appropriate table size:

34
Open Addressing Facts

35
Open Addressing Facts

36
Open Addressing: Linear Probing
• c(i) is a linear function in i of the form c(i) = a*i.
• Usually c(i) is chosen as:
c(i) = i for i = 0, 1, . . . , tableSize – 1

• The probe sequences are then given by:

hi(key) = [h(key) + i] % tableSize for i = 0, 1, . . . , tableSize – 1

• For c(i) = a*i to satisfy Property 2, a and n must be relatively

prime.

37
Open Addressing: Linear Probing

38
Open Addressing: Linear Probing

39
Open Addressing: Linear Probing

40
Open Addressing: Linear Probing

41
Open Addressing: Linear Probing

42
Linear Probing (cont’d)
Example: Perform the operations given below, in the given order, on
an initially empty hash table of size 13 using linear probing with
c(i) = i and the hash function: h(key) = key % 13:

insert(18), insert(26), insert(35), insert(9), find(15), find(48),

delete(35), delete(40), find(9), insert(64), insert(47), find(35)

• The required probe sequences are given by:

hi(key) = (h(key) + i) % 13 i = 0, 1, 2, . . ., 12

43
Linear Probing (cont’d)
a

Index Status Value

0 O 26
1 E
2 E
3 E
4 E
5 O 18
6 E
7 E
8 O 47
9 D 35
10 O 9
11 E
12 O 64
44
Disadvantage of Linear Probing: Primary Clustering
• Linear probing is subject to a primary clustering phenomenon.
• Elements tend to cluster around table locations that they originally hash to.
• Primary clusters can combine to form larger clusters. This leads to long probe
sequences and hence deterioration in hash table efficiency.

Example of a primary cluster: Insert keys: 18, 41, 22, 44, 59, 32, 31, 73, in this order, in an
originally empty hash table of size 13, using the hash function h(key) = key % 13 and c(i) = i:
h(18) = 5
h(41) = 2
h(22) = 9
h(44) = 5+1
h(59) = 7
h(32) = 6+1+1
h(31) = 5+1+1+1+1+1
h(73) = 8+1+1+1
45
Open Addressing: Quadratic Probing
• Quadratic probing eliminates primary clusters.
• c(i) is a quadratic function in i of the form c(i) = a*i2 + b*i. Usually c(i) is chosen
as:
c(i) = i2 for i = 0, 1, . . . , tableSize – 1
or
c(i) = i2 for i = 0, 1, . . . , (tableSize – 1) / 2

• The probe sequences are then given by:

hi(key) = [h(key) + i2] % tableSize for i = 0, 1, . . . , tableSize – 1
or
hi(key) = [h(key)  i2] % tableSize for i = 0, 1, . . . , (tableSize – 1) / 2

• Note for Quadratic Probing:

➢ Hashtable size should not be an even number; otherwise Property 2 will not be
satisfied.
➢ Ideally, table size should be a prime of the form 4j+3, where j is an integer. This
choice of table size guarantees Property 2. 46
Quadratic Probing (cont’d)
• Example: Load the keys 23, 13, 21, 14, 7, 8, and 15, in this order,
in a hash table of size 7 using quadratic probing with c(i) = i2 and
the hash function: h(key) = key % 7
• The required probe sequences are given by:
hi(key) = (h(key)  i2) % 7 i = 0, 1, 2, 3

47
Quadratic Probing (cont’d)
h0(23) = (23 % 7) % 7 = 2 hi(key) = (h(key)  i2) % 7 i = 0, 1, 2, 3
h0(13) = (13 % 7) % 7 = 6
h0(21) = (21 % 7) % 7 = 0
h0(14) = (14 % 7) % 7 = 0 collision
0 O 21
h1(14) = (0 + 12) % 7 = 1
h0(7) = (7 % 7) % 7 = 0 collision
h1(7) = (0 + 12) % 7 = 1 collision 1 O 14
h-1(7) = (0 - 12) % 7 = -1
NORMALIZE: (-1 + 7) % 7 = 6 collision 2 O 23
h2(7) = (0 + 22) % 7 = 4
h0(8) = (8 % 7)%7 = 1 collision 3 O 15
h1(8) = (1 + 12) % 7 = 2 collision
h-1(8) = (1 - 12) % 7 = 0 collision 4 O 7
h2(8) = (1 + 22) % 7 = 5
h0(15) = (15 % 7)%7 = 1 collision 5 O 8
2
h1(15) = (1 + 1 ) % 7 = 2 collision
h-1(15) = (1 - 12) % 7 = 0 collision 6 O 13
2
h2(15) = (1 + 2 ) % 7 = 5 collision
h-2(15) = (1 - 22) % 7 = -3
NORMALIZE: (-3 + 7) % 7 = 4 collision 48
h3(15) = (1 + 32)%7 = 3
Secondary Clusters
• Quadratic probing is better than linear probing because it eliminates primary
clustering.
• However, it may result in secondary clustering: if h(k1) = h(k2) the probing
sequences for k1 and k2 are exactly the same. This sequence of locations is called
a secondary cluster.
• Secondary clustering is less harmful than primary clustering because secondary
clusters do not combine to form large clusters.
• Example of Secondary Clustering: Suppose keys k0, k1, k2, k3, and k4 are
inserted in the given order in an originally empty hash table using quadratic
probing with c(i) = i2. Assuming that each of the keys hashes to the same array
index x. A secondary cluster will develop and grow in size:

49
Double Hashing
• To eliminate secondary clustering, synonyms must have different probe sequences.

• Double hashing achieves this by having two hash functions that both depend on the
hash key.

• c(i) = i * hp(key) for i = 0, 1, . . . , tableSize – 1

where hp (or h2) is another hash function.

• The probing sequence is:

hi(key) = [h(key) + i*hp(key)]% tableSize for i = 0, 1, . . . , tableSize – 1

• The function c(i) = i*hp(r) satisfies Property 2 provided hp(r) and tableSize are
relatively prime.

• To guarantee Property 2, tableSize must be a prime number.

• Common definitions for hp are :

➢ hp(key) = 1 + key % (tableSize - 1)
➢ hp(key) = q - (key % q) where q is a prime less than tableSize
➢ hp(key) = q*(key % q) where q is a prime less than tableSize
50
Double Hashing (cont'd)
Performance of Double hashing:
– Much better than linear or quadratic probing because it eliminates both primary
and secondary clustering.
– BUT requires a computation of a second hash function hp.

Example: Load the keys 18, 26, 35, 9, 64, 47, 96, 36, and 70 in this order, in an
empty hash table of size 13
(a) using double hashing with the first hash function: h(key) = key % 13 and the
second hash function: hp(key) = 1 + key % 12
(b) using double hashing with the first hash function: h(key) = key % 13 and
the second hash function: hp(key) = 7 - key % 7
Show all computations.

51
Double Hashing (cont’d)

h0(18) = (18%13)%13 = 5 hi(key) = [h(key) + i*hp(key)]% 13

h0(26) = (26%13)%13 = 0 h(key) = key % 13
h0(35) = (35%13)%13 = 9
h0(9) = (9%13)%13 = 9 collision hp(key) = 1 + key % 12
hp(9) = 1 + 9%12 = 10
h1(9) = (9 + 1*10)%13 = 6
h0(64) = (64%13)%13 = 12
h0(47) = (47%13)%13 = 8
h0(96) = (96%13)%13 = 5 collision
hp(96) = 1 + 96%12 = 1
h1(96) = (5 + 1*1)%13 = 6 collision
h2(96) = (5 + 2*1)%13 = 7
h0(36) = (36%13)%13 = 10
h0(70) = (70%13)%13 = 5 collision
hp(70) = 1 + 70%12 = 11
h1(70) = (5 + 1*11)%13 = 3
52
Double Hashing (cont'd)

h0(18) = (18%13)%13 = 5 hi(key) = [h(key) + i*hp(key)]% 13

h0(26) = (26%13)%13 = 0 h(key) = key % 13
h0(35) = (35%13)%13 = 9
h0(9) = (9%13)%13 = 9 collision hp(key) = 7 - key % 7
hp(9) = 7 - 9%7 = 5
h1(9) = (9 + 1*5)%13 = 1
h0(64) = (64%13)%13 = 12
h0(47) = (47%13)%13 = 8
h0(96) = (96%13)%13 = 5 collision
hp(96) = 7 - 96%7 = 2
h1(96) = (5 + 1*2)%13 = 7
h0(36) = (36%13)%13 = 10
h0(70) = (70%13)%13 = 5 collision
hp(70) = 7 - 70%7 = 7
h1(70) = (5 + 1*7)%13 = 12 collision
h2(70) = (5 + 2*7)%13 = 6
53
Implementation of Open Addressing
public class OpenScatterTable extends AbstractHashTable {
protected Entry array[];
protected static final int EMPTY = 0;
protected static final int OCCUPIED = 1;
protected static final int DELETED = 2;

protected static final class Entry {

public int state = EMPTY;
public Comparable object;
// …
}

public OpenScatterTable(int size) {

array = new Entry[size];
for(int i = 0; i < size; i++)
array[i] = new Entry();
}
// …
}

54
Implementation of Open Addressing (Con’t.)
/* finds the index of the first unoccupied slot
in the probe sequence of obj */
protected int findIndexUnoccupied(Comparable obj){
int hashValue = h(obj);
int tableSize = getLength();
int indexDeleted = -1;
for(int i = 0; i < tableSize; i++){
int index = (hashValue + c(i)) % tableSize;
if(array[index].state == OCCUPIED
&& [Link](array[index].object))
throw new IllegalArgumentException(
"Error: Duplicate key");
else if(array[index].state == EMPTY ||
(array[index].state == DELETED &&
[Link](array[index].object)))
return indexDeleted ==-1?index:indexDeleted;
else if(array[index].state == DELETED &&
indexDeleted == -1)
indexDeleted = index;
}
if(indexDeleted != -1) return indexDeleted;
throw new IllegalArgumentException(
"Error: Hash table is full");
}
55
Implementation of Open Addressing (Con’t.)
protected int findObjectIndex(Comparable obj){
int hashValue = h(obj);
int tableSize = getLength();

for(int i = 0; i < tableSize; i++){

int index = (hashValue + c(i)) % tableSize;
if(array[index].state == EMPTY
|| (array[index].state == DELETED
&& [Link](array[index].object)))
return -1;
else if(array[index].state == OCCUPIED
&& [Link](array[index].object))
return index;
}
return -1;
}

public Comparable find(Comparable obj){

int index = findObjectIndex(obj);
if(index >= 0)return array[index].object;
else return null; 56
}
Implementation of Open Addressing (Con’t.)
public void insert(Comparable obj){
if(count == getLength()) throw new ContainerFullException();
else {
int index = findIndexUnoccupied(obj);
// throws exception if an UNOCCUPIED slot is not found
array[index].state = OCCUPIED;
array[index].object = obj;
count++;
}
}

public void withdraw(Comparable obj){

if(count == 0) throw new ContainerEmptyException();
int index = findObjectIndex(obj);
if(index < 0)
throw new IllegalArgumentException("Object not found");
else {
array[index].state = DELETED;
// lazy deletion: DO NOT SET THE LOCATION TO null
count--;
}
} 57
Separate Chaining versus Open-addressing
Separate Chaining has several advantages over open addressing:
• Collision resolution is simple and efficient.
• The hash table can hold more elements without the large
performance deterioration of open addressing (The load factor can
be 1 or greater)
• The performance of chaining declines much more slowly than
open addressing.
• Deletion is easy - no special flag values are necessary.
• Table size need not be a prime number.
• The keys of the objects to be hashed need not be unique.
Disadvantages of Separate Chaining:
• It requires the implementation of a separate data structure for
chains, and code to manage it.
• The main cost of chaining is the extra space required for the
linked lists.
• For some languages, creating new nodes (for linked lists) is
expensive and slows down the system.
58
REHASHING
1. Rehashing is a collision resolution technique.

2. Rehashing is a technique in which the table is resized, i.e., the

size of table is doubled by creating a new table. It is preferable is
the total size of table is a prime number. There are situations in
which the rehashing is required.

• When table is completely full

• With quadratic probing when the table is filled half.

• When insertions fail due to overflow.

In such situations, we have to transfer entries from old table to the

new table by re computing their positions using hash functions.

59
REHASHING
Consider we have to insert the elements 37, 90, 55, 22, 17, 49, and
87. the table size is 10 and will use hash function:

H(key) = key mod tablesize

37 % 10 = 7

90 % 10= 0

55 % 10 = 5

22 % 10 = 2

17 % 10 = 7

49 % 10 = 9
60
REHASHING
Now this table is almost full and if we try to insert more elements
collisions will occur and eventually further insertions will fail.
Hence we will rehash by doubling the table size. The old table
size is 10 then we should double this size for new table, that
becomes 20. But 20 is not a prime number, we will prefer to make
the table size as 23. And new hash function will be

H(key)= key mod 23

37 % 23 = 14
90 % 23 = 21
55 % 23 = 9
22 % 23 = 22
17 % 23 = 17
49 % 23 = 3
87 % 23 = 18
Now the hash table is sufficiently large to accommodate new
insertions.
61
REHASHING
Advantages:
1. This technique provides the programmer a flexibility to enlarge
the table size if required.
2. Only the space gets doubled with simple hash function which
avoids occurrence of collisions.

62
EXTENDIBLE HASHING
• Extendible Hashing is a dynamic hashing method wherein directories, and
buckets are used to hash data. It is an aggressively flexible method in which the
hash function also experiences dynamic changes.

• Main features of Extendible Hashing: The main features in this hashing

technique are:
– Directories: The directories store addresses of the buckets in pointers. An id is
assigned to each directory which may change each time when Directory Expansion
takes place.
– Buckets: The buckets are used to hash the actual data.
– Basic Structure of Extendible Hashing:

63
EXTENDIBLE HASHING
Frequently used terms in Extendible Hashing:
• Directories: These containers store pointers to buckets. Each directory is given a unique
id which may change each time when expansion takes place. The hash function returns
this directory id which is used to navigate to the appropriate bucket. Number of
Directories = 2^Global Depth.
• Buckets: They store the hashed keys. Directories point to buckets. A bucket may
contain more than one pointers to it if its local depth is less than the global depth.
• Global Depth: It is associated with the Directories. They denote the number of bits
which are used by the hash function to categorize the keys. Global Depth = Number of
bits in directory id.
• Local Depth: It is the same as that of Global Depth except for the fact that Local Depth
is associated with the buckets and not the directories. Local depth in accordance with
the global depth is used to decide the action that to be performed in case an overflow
occurs. Local Depth is always less than or equal to the Global Depth.
• Bucket Splitting: When the number of elements in a bucket exceeds a particular size,
then the bucket is split into two parts.
• Directory Expansion: Directory Expansion Takes place when a bucket overflows.
Directory Expansion is performed when the local depth of the overflowing bucket is
equal to the global depth.

64
EXTENDIBLE HASHING
Basic Working of Extendible Hashing:

65
EXTENDIBLE HASHING
Basic Working of Extendible Hashing:
Step 1 – Analyze Data Elements: Data elements may exist in various forms eg. Integer,
String, Float, etc.. Currently, let us consider data elements of type integer. eg: 49.

Step 2 – Convert into binary format: Convert the data element in Binary form. For string
elements, consider the ASCII equivalent integer of the starting character and then convert
the integer into binary form. Since we have 49 as our data element, its binary form is
110001.

Step 3 – Check Global Depth of the directory. Suppose the global depth of the Hash-
directory is 3.

Step 4 – Identify the Directory: Consider the ‘Global-Depth’ number of LSBs in the binary
number and match it to the directory id.
Eg. The binary obtained is: 110001 and the global-depth is 3. So, the hash function will
return 3 LSBs of 110001 viz. 001.

Step 5 – Navigation: Now, navigate to the bucket pointed by the directory with directory-id
001.

Step 6 – Insertion and Overflow Check: Insert the element and check if the bucket
overflows. If an overflow is encountered, go to step 7 followed by Step 8, otherwise, go to
step 9.
66
EXTENDIBLE HASHING
Step 7 – Tackling Over Flow Condition during Data Insertion: Many times, while inserting
data in the buckets, it might happen that the Bucket overflows. In such cases, we need to
follow an appropriate procedure to avoid mishandling of data.
First, Check if the local depth is less than or equal to the global depth. Then choose one of
the cases below.
• Case1: If the local depth of the overflowing Bucket is equal to the global depth, then
Directory Expansion, as well as Bucket Split, needs to be performed. Then increment
the global depth and the local depth value by 1. And, assign appropriate pointers.
Directory expansion will double the number of directories present in the hash structure.
• Case2: In case the local depth is less than the global depth, then only Bucket Split takes
place. Then increment only the local depth value by 1. And, assign appropriate
pointers.

67
EXTENDIBLE HASHING
Step 8 – Rehashing of Split Bucket Elements: The Elements present in the overflowing
bucket that is split are rehashed w.r.t the new global depth of the directory.
Step 9 – The element is successfully hashed.

68
EXTENDIBLE HASHING

69
EXTENDIBLE HASHING

70
EXTENDIBLE HASHING

71
EXTENDIBLE HASHING

72
EXTENDIBLE HASHING

73
EXTENDIBLE HASHING

74
EXTENDIBLE HASHING

75
EXTENDIBLE HASHING

Hashing Collision Resolution Techniques
No ratings yet
Hashing Collision Resolution Techniques
20 pages
Hashing: Collision Handling Methods
No ratings yet
Hashing: Collision Handling Methods
52 pages
Hashing Techniques and Collision Resolution
No ratings yet
Hashing Techniques and Collision Resolution
55 pages
Hashing
No ratings yet
Hashing
27 pages
Hash Table Implementation and Techniques
100% (1)
Hash Table Implementation and Techniques
30 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
4 pages
Module 8
No ratings yet
Module 8
15 pages
Hashing Techniques and Collision Resolution
No ratings yet
Hashing Techniques and Collision Resolution
44 pages
Hashing Data Structures Notes
No ratings yet
Hashing Data Structures Notes
27 pages
Understanding Hashing and Its Functions
No ratings yet
Understanding Hashing and Its Functions
30 pages
Hash Functions and Collision Resolution
No ratings yet
Hash Functions and Collision Resolution
47 pages
Hashing Techniques and Collision Handling
No ratings yet
Hashing Techniques and Collision Handling
16 pages
Understanding Hashing Techniques and Functions
No ratings yet
Understanding Hashing Techniques and Functions
32 pages
Understanding Hash Tables and Collision Resolution
No ratings yet
Understanding Hash Tables and Collision Resolution
14 pages
Lazy Deletion in Hash Tables
No ratings yet
Lazy Deletion in Hash Tables
27 pages
Introduction to Hashing Techniques
No ratings yet
Introduction to Hashing Techniques
12 pages
Chapter 6 Hashing in Data Structure
No ratings yet
Chapter 6 Hashing in Data Structure
17 pages
DSA Hashing
No ratings yet
DSA Hashing
34 pages
Understanding Hash Tables in Data Structures
No ratings yet
Understanding Hash Tables in Data Structures
34 pages
Understanding Hashing and Collision Handling
No ratings yet
Understanding Hashing and Collision Handling
12 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
30 pages
Collision Resolution Techniques
No ratings yet
Collision Resolution Techniques
9 pages
Closed Hashing Techniques Explained
No ratings yet
Closed Hashing Techniques Explained
11 pages
Hashing Time Complexity Explained
No ratings yet
Hashing Time Complexity Explained
22 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
37 pages
Hashing
No ratings yet
Hashing
7 pages
Hash Table Functions and Techniques
No ratings yet
Hash Table Functions and Techniques
26 pages
Hash Table Search Complexity Explained
No ratings yet
Hash Table Search Complexity Explained
43 pages
Hash-Based Dictionary Implementation
No ratings yet
Hash-Based Dictionary Implementation
40 pages
Hashing
No ratings yet
Hashing
22 pages
Understanding Hashing in Data Structures
No ratings yet
Understanding Hashing in Data Structures
5 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
42 pages
Understanding Hashing Techniques
No ratings yet
Understanding Hashing Techniques
7 pages
Understanding Hashing Structures
No ratings yet
Understanding Hashing Structures
47 pages
Double Hashing in Collision Resolution
No ratings yet
Double Hashing in Collision Resolution
22 pages
Understanding Hashing Techniques and Functions
No ratings yet
Understanding Hashing Techniques and Functions
9 pages
Collision Resolution Techniques in Hashing
No ratings yet
Collision Resolution Techniques in Hashing
19 pages
Understanding Hashing and Collision Resolution
No ratings yet
Understanding Hashing and Collision Resolution
33 pages
Hashing and Priority Queues Overview
No ratings yet
Hashing and Priority Queues Overview
35 pages
Overview of Hash Functions and Techniques
No ratings yet
Overview of Hash Functions and Techniques
9 pages
Hashing Time Complexity Explained
No ratings yet
Hashing Time Complexity Explained
44 pages
Hashing Techniques Overview
No ratings yet
Hashing Techniques Overview
30 pages
Hashing Techniques in Algorithms
No ratings yet
Hashing Techniques in Algorithms
36 pages
Hashing Concepts and Techniques
No ratings yet
Hashing Concepts and Techniques
26 pages
Hash Table: Structure and Techniques
No ratings yet
Hash Table: Structure and Techniques
9 pages
Overview of Hashing Techniques
No ratings yet
Overview of Hashing Techniques
13 pages
Hashing Techniques in Data Structures
No ratings yet
Hashing Techniques in Data Structures
47 pages
Hashing Techniques in Algorithm Design
No ratings yet
Hashing Techniques in Algorithm Design
38 pages
Sparse Matrix and Hashing Techniques
No ratings yet
Sparse Matrix and Hashing Techniques
41 pages
Hashing in Data Structures Explained
No ratings yet
Hashing in Data Structures Explained
7 pages
Open Addressing Hash Table Operations
No ratings yet
Open Addressing Hash Table Operations
23 pages
Understanding Hash Tables and Hashing
No ratings yet
Understanding Hash Tables and Hashing
16 pages
Hashing V2
No ratings yet
Hashing V2
38 pages
Introduction to Hashing Techniques
No ratings yet
Introduction to Hashing Techniques
21 pages
Understanding Hashing Techniques and Functions
No ratings yet
Understanding Hashing Techniques and Functions
34 pages
Hashing Techniques in C++ Programming
No ratings yet
Hashing Techniques in C++ Programming
21 pages
Understanding Hash Tables and Collision Resolution
No ratings yet
Understanding Hash Tables and Collision Resolution
48 pages
Matrix Hashing with Collision Resolution
No ratings yet
Matrix Hashing with Collision Resolution
7 pages
Hashing Techniques and Applications
No ratings yet
Hashing Techniques and Applications
44 pages
Binary Ordering Algorithm for Matrices
No ratings yet
Binary Ordering Algorithm for Matrices
5 pages
Dynamic Memory Allocation Techniques
No ratings yet
Dynamic Memory Allocation Techniques
11 pages
C Programs for Sorting and Searching Arrays
No ratings yet
C Programs for Sorting and Searching Arrays
12 pages
Longest Common Subsequence Explained
No ratings yet
Longest Common Subsequence Explained
4 pages
Dynamic Programming for Coin Change
No ratings yet
Dynamic Programming for Coin Change
18 pages
BTEC HND Computing Assessment Verification
0% (2)
BTEC HND Computing Assessment Verification
142 pages
Stack Implementation in C: Array & Linked List
No ratings yet
Stack Implementation in C: Array & Linked List
4 pages
Machine Learning Exam Spring 2024
No ratings yet
Machine Learning Exam Spring 2024
14 pages
Advanced Graph Algorithms Explained
No ratings yet
Advanced Graph Algorithms Explained
3 pages
AI & ML LAB MANUAL - Updated
No ratings yet
AI & ML LAB MANUAL - Updated
24 pages
Sorting Algorithms and Knapsack Solutions
No ratings yet
Sorting Algorithms and Knapsack Solutions
18 pages
K-means Clustering Example with Two Clusters
No ratings yet
K-means Clustering Example with Two Clusters
11 pages
Optimized Slicing Algorithm for STL Models
No ratings yet
Optimized Slicing Algorithm for STL Models
9 pages
Data Structures Makeup Exam 2016
No ratings yet
Data Structures Makeup Exam 2016
2 pages
Neural Network Question Bank PDF
No ratings yet
Neural Network Question Bank PDF
6 pages
Chapter 6 Dynamic Programming
No ratings yet
Chapter 6 Dynamic Programming
87 pages
Design and Analysis of Algorithms Exam
No ratings yet
Design and Analysis of Algorithms Exam
22 pages
Merge Sort Algorithm Explained
No ratings yet
Merge Sort Algorithm Explained
14 pages
AI Lab Manual: Search Strategies
100% (1)
AI Lab Manual: Search Strategies
30 pages
Skiplists Paper
No ratings yet
Skiplists Paper
8 pages
LeetCode Patterns and Questions Guide
No ratings yet
LeetCode Patterns and Questions Guide
12 pages
K-Means Clustering Explained
No ratings yet
K-Means Clustering Explained
6 pages
Python Exception Handling & File Management
No ratings yet
Python Exception Handling & File Management
22 pages
Class 8 Algorithm Design Questions
No ratings yet
Class 8 Algorithm Design Questions
4 pages
Linked List Operations in C++ Code
No ratings yet
Linked List Operations in C++ Code
11 pages
Indexing and Hashing
No ratings yet
Indexing and Hashing
84 pages
Kaedah Berangka: Penyelesaian Persamaan
No ratings yet
Kaedah Berangka: Penyelesaian Persamaan
10 pages
Understanding NP-Completeness
No ratings yet
Understanding NP-Completeness
15 pages
Programming Lab: Data Structures & Algorithms
No ratings yet
Programming Lab: Data Structures & Algorithms
2 pages
Data Structures in C++ Explained
No ratings yet
Data Structures in C++ Explained
20 pages

Understanding Hashing in Data Structures

Uploaded by

Understanding Hashing in Data Structures

Uploaded by

Hashing:

Hashing in Data Structure-

Hash Key Value-

Types of Hash Functions-

• It depends on the user which hash function he wants to 5

• Introduction to Collision Resolution using Open

• Algorithms for insertion, searching, and deletion in Open

• There are two broad ways of collision resolution:

1. Separate Chaining:: An array of linked list implementation.

2. Open Addressing: Array-based implementation.

(i) Linear probing (linear search)

The performance of both the techniques depend on the kind

• Synonyms are chained in the same linked list.

• Example: The following class describes commodity items:

hash(onion) = (111 + 110 + 105 + 111 + 110) % 13 = 547 % 13 = 1

10 tomato mellon olive

• The most common probe sequences are of the form:

Collision resolution technique c(i)

where hp(key) is another hash function. 32

• Disadvantages of Open Addressing:

• With any open addressing method of collision resolution,

• Load factors between 0.6 and 0.7 are common.

• Load factors > 0.7 are undesirable.

• The probe sequences are then given by:

• For c(i) = a*i to satisfy Property 2, a and n must be relatively

insert(18), insert(26), insert(35), insert(9), find(15), find(48),

• The required probe sequences are given by:

Index Status Value

• The probe sequences are then given by:

• Note for Quadratic Probing:

• c(i) = i * hp(key) for i = 0, 1, . . . , tableSize – 1

• The probing sequence is:

• To guarantee Property 2, tableSize must be a prime number.

• Common definitions for hp are :

h0(18) = (18%13)%13 = 5 hi(key) = [h(key) + i*hp(key)]% 13

h0(18) = (18%13)%13 = 5 hi(key) = [h(key) + i*hp(key)]% 13

protected static final class Entry {

public OpenScatterTable(int size) {

for(int i = 0; i < tableSize; i++){

public Comparable find(Comparable obj){

public void withdraw(Comparable obj){

2. Rehashing is a technique in which the table is resized, i.e., the

• When table is completely full

• With quadratic probing when the table is filled half.

• When insertions fail due to overflow.

In such situations, we have to transfer entries from old table to the

H(key) = key mod tablesize

H(key)= key mod 23

• Main features of Extendible Hashing: The main features in this hashing

You might also like