• Huffman coding is a variable length coding and it achieves error-
free/lossless compression as it is reversible.
It exploits only coding and inter-pixel redundancy to achieve
compression.
• In terms of the noiseless coding theorem, the resulting code is
optimal for a fixed value of n, subject to the constraint that the source
symbols be coded one at a time.
• The length of the code should be inversely proportional to its
probability of occurrence.
❑ Steps of Huffman coding:
• 1. Sort the symbols according to decreasing order of their
probabilities.
• 2. Combine the lowest probable symbols to form a composite symbol
with probability equal to sum of the respective probabilities.
• 3. Repeat step 2 until you are left with only 2 symbols.
• 4. Extract the codewords from the resulting binary tree
representation of the code.
❑ Arithmetic (OR Range) coding
• It is also a type of lossless compression algorithm and we get the
same image at the output.
Lossless algorithms are always used where reliability/data
preservation is the main factor, for eg. Medical imaging.
• For arithmetic coding:
– Sequences of source symbols are encoded together.
– There is no one-to-one correspondence between source symbols and
code words.
❑ Arithmetic (OR Range) coding
• Arithmetic coding takes in a complete data stream and output one
specific codeword which is a floating point number between 0 and 1,
with as few digits as possible.
As the number of symbols in the message increases, the interval used
to represent it becomes smaller, which means longer the bit stream,
more is the precision of the output number.
• Huffmann coding encodes source symbols one at a time which might
not be efficient while Arithmetic coding encodes sequences of source
symbols to variable length codewords.
• Arithmetic coding is a non-block code which , means that arithmetic
code does not generate individual codes for each character.
• The efficiency = 100% can always be achieved using arithmetic
coding,
as entropy H will be equal to Lavg, and hence it also satisfies
Shannon’s first theorem which says that minimum coding bits should
be equal to entropy H, which is obtained by arithmetic coding.
But the difficulty for Arithmetic coding is practical
implementation of long messages is not possible, as arbitrary-precision
floating point library/registers are not there.
❑ Steps of Arithmetic coding:
• 1. Arrange the characters of string/numbers into ascending order.
• 2. Divide the numeric range 0 to 1 into the number of different
symbols present in the message.
• 3. Expand the first letter to be coded along with the range.
• 4. Further subdivide this range into number of symbols.
• 5. Repeat the procedure until the termination character is encoded.
• A sequence of source symbols is assigned a single arithmetic
code word which corresponds to a sub-interval in [0,1].
• Smaller intervals require more information units (i.e., bits) to
be represented.