HASHING IN
D ATA S T R U C T U R E
2
CONTENTS
o Introduction
o Hash table organizations
o Hashing functions
o Hashing techniques
o Summary
3
INTRODUCTION
• Hashing is an effective way to store and retrieve data in
some data structure.
• Hashing technique is designed to use a special function
called the head function which is used to map a given
value with a particular key for faster access of elements.
• The efficiency of mapping depends on the efficiency of the
hash function used
• Ex: In universities, each student is assigned a unique roll
number . In libraries, each book is assigned a unique
number that can be used to determine information about
the book, such as its exact position in the library or the
users it has been issued to etc.
4
• Hashing is implemented in two steps:
• An element is converted into an integer by using a hash
function. This element can be used as an index to store the
original element, which falls into the hash table.
• The element is stored in the hash table where it can be quickly
retrieved using hashed key.
hash = hashfunc(key)
index = hash % array_size
5
A D VA N TA G E S
• The main advantage of hash tables over other table data
structures is speed.
• This advantage is more apparent when the number of
entries is large (thousands or more).
• Hash tables are particularly efficient when the maximum
number of entries can be predicted in advance, so that the
bucket array can be allocated once with the optimum size
and never resized.
H A S H TA B L E
6
• Hash Table is a data structure which stores data in an
associative manner.
• In hash table, the data is stored in an array format where
each data value has its own unique index value.
• Access of data becomes very fast, if we know the index of
the desired data.
7
HASH FUNCTION
• A hash function is any function that can be used to map a data set of an
arbitrary size to a data set of a fixed size, which falls into the hash table.
• It is important to have a good hash function with the following basic
requirements:
• Easy to compute
• Uniform distribution
• Less Collision
There are different types of hash functions
• Division Method.
• Mid Square Method.
• Folding Method.
• Multiplication Method.
8
HASH TECHNIQUES
Collision resolution techniques
• If x1 and x2 are two different keys, it is possible that h(x1) = h(x2). This
is called a collision. It is the most important issue in hash table
implementations.
• Choosing a hash function that minimizes the number of collisions and
also hashes uniformly is another critical issue.
• Separate chaining (open hashing)
• Linear probing (open addressing or closed hashing)
• Quadratic Probing
• Double hashing
9
Separate Chaining (Open Hashing)
• Separate chaining is one of the most commonly used collision resolution
techniques.
• It is usually implemented using linked lists. In separate chaining, each
element of the hash table is a linked list.
• To store an element in the hash table we must insert it into a specific
linked list.
• If there is any collision (i.e. two different elements have same hash
value) then store both the elements in the same linked list.
10
11
Linear Probing
• In open addressing, instead of in linked lists, all entry records are stored in
the array itself.
• When a new entry has to be inserted, the hash index of the hashed value is
computed and then the array is examined.
• If the slot at the hashed index is unoccupied, then the entry record is
inserted in slot at the hashed index else it proceeds in some probe sequence
until it finds an unoccupied slot.
12
Linear probing is when the interval between successive probes is fixed
(usually to 1).
Let’s assume that the hashed index for a particular entry is index.
The probing sequence for linear probing will be:
index = index % hashTableSize
index = (index + 1) % hashTableSize
index = (index + 2) % hashTableSize
index = (index + 3) % hashTableSize
13 /*Create initial hash table*/
void initialize_hash_table(EMPLOYEE a[])
{
int i;
for(i=0;i<HASH_SIZE;i++)
{a[i].id=0;
}
}
/*Compute hash value using the function: H(K)=k%m*/
int H(int k)
{
return k%HASH_SIZE;
}
14
/*Insert an item into the empty slot using linear probing*/
void insert_hash_table(int id, char name[], EMPLOYEE a[])
{
int i,index,h_value;
h_value= H(id);
for(i=0;i<HASH_SIZE;i++)
{
index=(h_value+i)%HASH_SIZE;
if(a[index].id==0) //empty slot found
{
a[index].id=id; //insert employee id
strcpy(a[index].name,name); //insert employee name
break;
}
15
}
if(i==HASH_SIZE)
printf("Hash table full\n"); // NO empty slot
}
/*Display the hash table*/
void display_hash_table(EMPLOYEE a[],int n)
{
int k,i;
printf("H_Value\t Emp_id\t Emp_name\n");
for(i=0;i<n;i++)
{if(a[i].id!=0)
printf("%d\t %d\t %s\n",i,a[i].id,a[i].name);
}
}
THANK YOU