Understanding Hash Functions and Algorithms

Hash functions are mathematical functions that convert arbitrary-length input into fixed-length hash values, providing features like pre-image resistance, second pre-image resistance, and collision resistance. Popular hash functions include MD5, SHA family, RIPEMD, and Whirlpool, each with varying levels of security and applications such as password storage and data integrity checks. While MD5 is no longer recommended due to vulnerabilities, SHA-2 and SHA-3 are considered secure, and RIPEMD algorithms are less secure than SHA but still used in some applications.

Uploaded by

Malik Abu bakar

* HASH FUNCTION
• A mathematical function that converts a numerical input value into another compressed numerical value.
• The input to the hash function is of arbitrary length but output is always of fixed length.
• Values returned by a hash function are called message digest or simply hash values.
* Features of Hash Functions
• Fixed Length Output (Hash Value)
• Efficiency of Operation
* Fixed Length Output
• A hash function converts data of arbitrary length to a fixed length. This process is often referred to as hashing the data.
• In general, the hash is much smaller than the input data, hence hash functions are sometimes called compression functions.
• Since a hash is a smaller representation of larger data, it is also referred to as a digest.
• A hash function with an n-bit output is referred to as an n-bit hash function. Popular hash functions generate values between 128 and 512 bits (MD5 produces 128 bits, SHA-512 produces 512 bits).
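The fixed-length property can be checked directly. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `digest_bits` and the sample messages are assumptions made for illustration.

```python
import hashlib

def digest_bits(algorithm: str, message: bytes) -> int:
    """Return the digest length in bits for `message` under `algorithm`."""
    return len(hashlib.new(algorithm, message).digest()) * 8

# Inputs of very different sizes: empty, 5 bytes, and one million bytes.
messages = [b"", b"hello", b"a" * 1_000_000]

for algorithm in ("md5", "sha1", "sha256", "sha512"):
    sizes = {digest_bits(algorithm, m) for m in messages}
    print(algorithm, sizes)  # each set holds a single value: 128, 160, 256, 512
```

Whatever the input length, each algorithm always returns the same number of bits, which is the n in "n-bit hash function".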
* Efficiency of Operation
• Generally, for any hash function h with input x, computation of h(x) is a fast operation.
• Computationally, hash functions are generally faster than symmetric encryption.
* Properties of Hash Functions
• Pre-Image Resistance
• Second Pre-Image Resistance
• Collision Resistance
* Pre-Image Resistance
 This property means that it should be computationally hard
to reverse a hash function.
 In other words, if a hash function h produced a hash value z,
then it should be a difficult process to find any input value x
that hashes to z.
 This property protects against an attacker who only has a
hash value and is trying to find the input.
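To see why the output length matters for pre-image resistance, here is a small sketch that attacks a deliberately weak hash by exhaustive search. The function `toy_hash` (the first 16 bits of SHA-256) and the candidate inputs are hypothetical and chosen only so that the search finishes quickly.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes, bits: int = 16) -> int:
    """Hypothetical weak hash: the first `bits` bits of SHA-256."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_preimage(target: int, bits: int = 16) -> tuple[bytes, int]:
    """Try candidate inputs "1", "2", "3", ... until one hashes to `target`."""
    for attempts in count(1):
        candidate = str(attempts).encode()
        if toy_hash(candidate, bits) == target:
            return candidate, attempts

z = toy_hash(b"secret input")      # the attacker only sees z
x, attempts = find_preimage(z)     # any x with toy_hash(x) == z will do
print(f"found {x!r} after {attempts} attempts; matches: {toy_hash(x) == z}")
```

With a 16-bit output the search needs on the order of 2^16 attempts. The same approach against a full n-bit hash needs on the order of 2^n attempts, which is what makes reversing a 256-bit hash computationally hard.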
* Second Pre-Image Resistance
• This property means that, given an input and its hash, it should be hard to find a different input with the same hash.
• In other words, if a hash function h for an input x produces hash value h(x), then it should be difficult to find any other input value y such that h(y) = h(x).
• This property protects against an attacker who has an input value and its hash, and wants to substitute a different value as the legitimate value in place of the original input value.
* Collision Resistance
• This property means it should be hard to find two different inputs of any length that result in the same hash. A function with this property is also referred to as a collision-free hash function.
• In other words, for a hash function h, it is hard to find any two different inputs x and y such that h(x) = h(y).
• Since a hash function compresses arbitrary-length input into a fixed-length hash, it is impossible for a hash function not to have collisions. The collision-free property only requires that these collisions be hard to find.
• This property makes it very difficult for an attacker to find two input values with the same hash.
• Also, if a hash function is collision-resistant, then it is second pre-image resistant.
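Collisions are much cheaper to find than pre-images, because any two matching inputs will do. The sketch below runs a birthday-style search against a hypothetical 24-bit hash (`toy_hash`, a truncation of SHA-256 assumed here for illustration); the message format is likewise arbitrary.

```python
import hashlib

def toy_hash(data: bytes, bits: int = 24) -> int:
    """Hypothetical weak hash: the first `bits` bits of SHA-256."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int = 24) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two of them share a hash value."""
    seen: dict[int, bytes] = {}
    tried = 0
    while True:
        message = f"message-{tried}".encode()
        tried += 1
        h = toy_hash(message, bits)
        if h in seen:
            return seen[h], message, tried
        seen[h] = message

x, y, tried = find_collision()
print(f"{x!r} and {y!r} collide after {tried} messages")
```

For an n-bit hash the expected effort is on the order of 2^(n/2), so a few thousand messages are typically enough for 24 bits, while a 256-bit hash leaves the search at around 2^128 work.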
* Design of Hashing Algorithms
• At the heart of hashing is a mathematical function that operates on two fixed-size blocks of data to create a hash code.
• This hash function forms part of the hashing algorithm.
• The size of each data block varies depending on the algorithm. Typically the block sizes are from 128 bits to 512 bits.
• [Illustration: a hash function taking two fixed-size data blocks and producing a hash code]
* Design of Hashing Algorithms
• A hashing algorithm involves rounds of the above hash function, like a block cipher.
• Each round takes an input of a fixed size, typically a combination of the most recent message block and the output of the last round.
• This process is repeated for as many rounds as are required to hash the entire message.
• [Illustration: schematic of a hashing algorithm, with the output of each round feeding the next round]
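The round structure described above can be sketched as a toy chained construction. This is not the internals of any real algorithm: the block size, the all-zero initial value, the padding rule, and the use of SHA-256 as a stand-in for the fixed-size hash function are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 64   # bytes per message block (assumed for this sketch)
STATE_SIZE = 32   # bytes carried from one round to the next

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in for the fixed-size hash function: two fixed-size inputs, one output."""
    return hashlib.sha256(state + block).digest()

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 8-byte message length to fill whole blocks."""
    length = len(message).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK_SIZE
    return message + b"\x80" + b"\x00" * zeros + length

def toy_hash(message: bytes) -> bytes:
    """Run one round per block, feeding each round's output into the next."""
    state = b"\x00" * STATE_SIZE  # fixed initial value
    padded = pad(message)
    for offset in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[offset:offset + BLOCK_SIZE])
    return state

print(toy_hash(b"a short message").hex())
print(toy_hash(b"x" * 1000).hex())  # many rounds, same output length
```

The output of the last round is the hash of the whole message, so its length depends only on the state size and not on how many rounds were needed.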
* Design of Hashing Algorithms
• The hash value of the first message block becomes an input to the second hash operation, the output of which alters the result of the third operation, and so on. This is known as the avalanche effect of hashing.
• The avalanche effect results in substantially different hash values for two messages that differ by even a single bit of data.
• Understand the difference between a hash function and a hashing algorithm correctly. The hash function generates a hash code by operating on two blocks of fixed-length binary data.
• The hashing algorithm is a process for using the hash function, specifying how the message will be broken up and how the results from previous message blocks are chained together.
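The avalanche effect is easy to observe. The sketch below flips a single bit of a sample message (the message itself is arbitrary) and counts how many bits of the SHA-256 digest change.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"hash functions"
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # flip one bit of the first byte

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(flipped).digest()
print(f"input bits changed:  {bit_difference(original, flipped)}")
print(f"digest bits changed: {bit_difference(d1, d2)} of {len(d1) * 8}")
```

For a well-designed hash, roughly half of the digest bits are expected to change, so the two digests look unrelated.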
* Popular Hash Functions
Message Digest (MD)
• MD5 was the most popular and widely used hash function for many years.
• The MD family comprises the hash functions MD2, MD4, MD5 and MD6. MD5 is specified in RFC 1321 and is a 128-bit hash function.
• MD5 digests have been widely used in the software world to provide assurance about the integrity of a transferred file. For example, file servers often provide a pre-computed MD5 checksum for their files, so that a user can compare the checksum of the downloaded file to it.
• In 2004, collisions were found in MD5. An analytical attack was reported to succeed in about an hour using a computer cluster. This collision attack compromised MD5, and hence it is no longer recommended for use.

* MD5 Algorithm
• MD5 (Message Digest 5) is a cryptographic hash function that takes an input of any size and produces a 128-bit hash value.
• MD5 has been used in a variety of security applications, including digital signatures, message authentication codes (MACs), and file integrity checking.
• The MD5 algorithm works by padding the input message and breaking it into 512-bit blocks. Each block is processed in turn by a complex mathematical function that updates a 128-bit internal state, and the state after the last block is the 128-bit hash value. The hash value is then used to verify the integrity of the message.
• MD5 was designed to be a secure hash function, but it has been shown to be vulnerable to collision attacks.
• This means that it is possible to find two different messages that produce the same MD5 hash value.
• As a result, MD5 is no longer considered secure and should not be used for applications where security is critical.

* MD5 Algorithm
Here is an example of how MD5 is used to verify the integrity of a file:
• The file is divided into small blocks of data.
• Each block of data is processed by the MD5 algorithm together with the 128-bit result carried over from the previous block.
• The results are chained rather than concatenated: the value left after the final block is the single 128-bit hash value for the entire file.
• The hash value is stored with the file.

* MD5 Algorithm
• When the file is downloaded, the hash value is recalculated using the MD5 algorithm.
• If the two hash values match, then the file is taken to be unmodified.
• If the two hash values do not match, then the file has been altered or corrupted and should not be used.
• MD5 is still found in some legacy applications, such as storing passwords in databases, although this use is not recommended.
• It is important to be aware of the security vulnerabilities of MD5 and to use more secure hash functions for applications where security is critical.
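MD5 carries an internal state from one block to the next, so feeding the data in pieces gives the same 128-bit result as hashing it in one call. Here is a minimal sketch with Python's `hashlib`; the helper name `md5_in_chunks`, the chunk size, and the sample data are assumptions made for illustration.

```python
import hashlib

def md5_in_chunks(data: bytes, chunk_size: int = 64) -> str:
    """Feed `data` to MD5 piece by piece and return the final hex digest."""
    hasher = hashlib.md5()
    for offset in range(0, len(data), chunk_size):
        hasher.update(data[offset:offset + chunk_size])  # state carries over between calls
    return hasher.hexdigest()

data = b"The quick brown fox jumps over the lazy dog"
chunked = md5_in_chunks(data, chunk_size=8)
one_shot = hashlib.md5(data).hexdigest()
print(chunked)
print(one_shot)
print("same digest:", chunked == one_shot)
```

The digest is 32 hex characters (128 bits) regardless of how the input was split, which is what makes it practical to checksum large files without loading them into memory.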
* Secure Hash Function (SHA)
• The SHA (Secure Hash Algorithm) family comprises four algorithms: SHA-0, SHA-1, SHA-2, and SHA-3. Though from the same family, they are structurally different.
• The original version, SHA-0, a 160-bit hash function, was published by the National Institute of Standards and Technology (NIST) in 1993. It had a few weaknesses and did not become very popular. Later, in 1995, SHA-1 was designed to correct the alleged weaknesses of SHA-0.
• SHA-1 was for many years the most widely used of the SHA hash functions. It has been employed in several widely used applications and protocols, including Secure Socket Layer (SSL) security.
• In 2005, a method was found for uncovering collisions for SHA-1 with far less work than brute force, making the long-term employability of SHA-1 doubtful. A practical SHA-1 collision was publicly demonstrated in 2017.
* Secure Hash Function (SHA)
• The SHA-2 family has four further variants, SHA-224, SHA-256, SHA-384, and SHA-512, named for the number of bits in their hash value.
• No practical attacks have yet been reported on the full SHA-2 hash functions.
• SHA-2 is a strong hash function. Though significantly different, its basic design still follows the design of SHA-1. Hence, NIST called for new competitive hash function designs.
• In October 2012, NIST chose the Keccak algorithm as the new SHA-3 standard.
• Keccak offers many benefits, such as efficient performance and good resistance to attacks.
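The SHA variants named above are available in Python's standard `hashlib` module. The sketch below hashes one sample message with several of them; the helper name `sha_digests` and the choice of message are assumptions made for illustration.

```python
import hashlib

def sha_digests(message: bytes) -> dict[str, str]:
    """Hash `message` with several SHA variants and return the hex digests."""
    names = ("sha1", "sha224", "sha256", "sha384", "sha512", "sha3_256", "sha3_512")
    return {name: hashlib.new(name, message).hexdigest() for name in names}

for name, hex_digest in sha_digests(b"abc").items():
    # Each hex character encodes 4 bits of the digest.
    print(f"{name:9s} {len(hex_digest) * 4:3d} bits  {hex_digest[:16]}...")
```

SHA-256 and SHA3-256 both produce 256 bits, but their digests for the same message are unrelated, reflecting the structural difference between SHA-2 and the Keccak-based SHA-3.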
* Secure Hash Function (SHA)
• SHA algorithms are an important tool for protecting data security.
• By using SHA algorithms, organizations can detect whether their data has been tampered with and help keep their communications secure.
• Here is an example of how SHA-256 is used to verify the integrity of a file:
• The file is divided into small blocks of data.
• Each block of data is processed by the SHA-256 algorithm together with the 256-bit result carried over from the previous block.
• The results are chained rather than concatenated: the value left after the final block is the single 256-bit hash value for the entire file.
• The hash value is stored with the file.
* Secure Hash Function (SHA)
• When the file is downloaded, the hash value is recalculated using the SHA-256 algorithm.
• If the two hash values match, then the file has not been tampered with, provided the reference hash itself came from a trusted source.
• If the two hash values do not match, then the file has been altered or corrupted and should not be used.
• SHA algorithms are a powerful tool for protecting data security.
• By using SHA algorithms, organizations can detect whether their data has been tampered with and help keep their communications secure.
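The SHA-256 file check described above can be sketched as follows. The function names `sha256_of_file` and `verify_file`, the chunk size, and the idea that the expected digest arrives as a hex string are assumptions made for illustration.

```python
import hashlib
import hmac

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def verify_file(path: str, expected_hex: str) -> bool:
    """Recompute the file's digest and compare it with the published value."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A caller would run something like `verify_file("download.iso", published_digest)`, where both names are placeholders, and discard the file if the result is `False`.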
* RIPEMD
• “RIPEMD” stands for “RIPE Message Digest,” where “RIPE” stands for “RACE Integrity Primitives Evaluation” and where “RACE” stands for “Research and Development in Advanced Communications Technologies in Europe”, a nice example of a nested abbreviation.
• The RIPEMD-160 secure hash function may be best known these days for its role as part of the implementation of Bitcoin.
* RIPEMD
• RIPEMD is an acronym for RACE Integrity Primitives Evaluation Message Digest. This set of hash functions was designed by the open research community and is generally known as a family of European hash functions.
• The set includes RIPEMD, RIPEMD-128, and RIPEMD-160. There also exist 256-bit and 320-bit versions of this algorithm.
• The original RIPEMD (128-bit) is based upon the design principles used in MD4 and was found to provide questionable security. The RIPEMD 128-bit version came as a quick-fix replacement to overcome vulnerabilities in the original RIPEMD.
• RIPEMD-160 is an improved version and the most widely used version in the family. The 256-bit and 320-bit versions reduce the chance of accidental collision, but do not have higher levels of security as compared to RIPEMD-128 and RIPEMD-160 respectively.
* RIPEMD
• RIPEMD algorithms are used in a variety of security applications, including digital signatures, message authentication codes (MACs), and file integrity checking.
• For example, RIPEMD-160 is used to generate the hash values for Bitcoin addresses.
• RIPEMD algorithms are a tool for protecting data security.
• By using RIPEMD algorithms, organizations can detect whether their data has been tampered with and help keep their communications secure.
• RIPEMD algorithms are generally regarded as a weaker choice than the current SHA algorithms. The original RIPEMD and RIPEMD-128 produce only 128-bit digests, and the 160-bit digest of RIPEMD-160 gives a smaller margin against collision attacks than SHA-256 or SHA-512.
• As a result, SHA algorithms are more widely used in security applications than RIPEMD algorithms.
• Overall, RIPEMD algorithms are a legacy family of cryptographic hash functions that are still used in some applications.
* Whirlpool
• This is a 512-bit hash function.
• It is derived from a modified version of the Advanced Encryption Standard (AES). One of the designers was Vincent Rijmen, a co-creator of the AES.
• Three versions of Whirlpool have been released, namely WHIRLPOOL-0, WHIRLPOOL-T, and WHIRLPOOL.
• Whirlpool is a cryptographic hash function designed by Vincent Rijmen and Paulo S. L. M. Barreto.
• It was first published in 2000 and revised in 2001 and 2003.
• It is a block-cipher-based hash function, designed after the Square block cipher.
• It takes an input of less than 2^256 bits in length and converts it into a 512-bit hash.
* Whirlpool
• Whirlpool is designed to be a collision-resistant hash function, which means that it is computationally infeasible to find two different messages that produce the same hash value.
• It is also a one-way function, which means that it is computationally infeasible to reverse the hash function to recover the original message from the hash value.
• Whirlpool is used in a variety of applications, including digital signatures, file authentication, and message authentication codes.
• It is also recommended by the NESSIE project, a European Union-sponsored effort to put forward a portfolio of strong cryptographic primitives of various types.
* Whirlpool
 The main difference between Whirlpool and AES is that Whirlpool is
a hash function, while AES is a block cipher.
 A hash function is a one-way function that takes an input of any
length and produces an output of a fixed length.
 A block cipher is a symmetric encryption algorithm that takes an
input of a fixed length and produces an output of the same length.
 Whirlpool is a good choice for applications where it is important to be
able to verify the authenticity of a message or file.
 AES is a good choice for applications where it is important to encrypt
data securely.
* Applications of Hash Functions
1. Password Storage
2. Data Integrity Check
* Password Storage
• Hash functions provide protection to password storage.
• Instead of storing passwords in the clear, most logon processes store the hash values of passwords in a file.
• The password file consists of a table of pairs in the form (user id, h(P)).
• [Illustration: the logon process, in which the hash of the entered password is compared with the stored h(P)]
• An intruder can only see the hashes of the passwords, even if he has accessed the password file.
• He can neither log on using the hash nor derive the password from the hash value, since the hash function possesses the property of pre-image resistance.
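The (user id, h(P)) table can be sketched as below. Note one substitution: instead of a bare hash, this sketch uses a per-user random salt and PBKDF2-HMAC-SHA256 from Python's `hashlib`, which is an addition beyond the slide's plain h(P). The names `password_table`, `register`, `login`, and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

# The "password file": user id -> (salt, h(P)). No password is stored in the clear.
password_table: dict[str, tuple[bytes, bytes]] = {}

def hash_password(password: str, salt: bytes) -> bytes:
    """h(P): a salted, deliberately slow hash of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(user_id: str, password: str) -> None:
    salt = os.urandom(16)  # a fresh random salt per user
    password_table[user_id] = (salt, hash_password(password, salt))

def login(user_id: str, password: str) -> bool:
    """Hash the entered password and compare it with the stored hash."""
    record = password_table.get(user_id)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, hash_password(password, salt))

register("alice", "correct horse")
print(login("alice", "correct horse"))  # True
print(login("alice", "wrong guess"))    # False
```

The salt means two users with the same password get different table entries, and the slow hash raises the cost of guessing passwords for someone who has stolen the file.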
* Data Integrity Check
• The data integrity check is one of the most common applications of hash functions. It is used to generate checksums on data files. This application provides assurance to the user about the correctness of the data.
• The integrity check helps the user detect any changes made to the original file. It does not, however, provide any assurance about originality.
• The attacker, instead of modifying the file data, can replace the entire file, compute an altogether new hash, and send both to the receiver.
• This integrity check application is useful only if the user is sure about the originality of the file.
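The limitation described above can be shown in a few lines. In this sketch the function names and the sample messages are hypothetical; SHA-256 stands in for whichever hash function produces the checksum.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Checksum published alongside a file."""
    return hashlib.sha256(data).hexdigest()

def integrity_ok(data: bytes, received_checksum: str) -> bool:
    """Receiver's check: recompute the hash and compare."""
    return checksum(data) == received_checksum

original = b"pay 10 to alice"
published = checksum(original)

# Case 1: only the data is changed, so the check catches it.
forged = b"pay 1000 to mallory"
print(integrity_ok(forged, published))          # False

# Case 2: the attacker replaces the data AND the checksum, so the check passes.
forged_checksum = checksum(forged)
print(integrity_ok(forged, forged_checksum))    # True
```

The check only proves that the data matches the checksum it arrived with, so the checksum has to reach the receiver through a channel the attacker cannot alter.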
