Information Theory: Entropy & Coding
Information Theory: Entropy & Coding
The efficiency of a source encoder is defined as the ratio of the minimum average code word length to the actual average code word length, denoted as η = Lmin/¯L. An ideal source encoder achieves efficiency when η approaches unity, meaning the average code word length used is as small as the entropy of the source, H(S), without being smaller, which represents the fundamental limit .
According to the Shannon Source Coding Theorem, the entropy H(S) represents a lower bound on the average code word length ¯L required to represent symbols from a discrete memoryless source. The average code word length can be equal to the entropy but cannot be smaller. This relationship defines the efficiency of a source encoding scheme, seeking to achieve the average code word length as close to the entropy as possible for optimal encoding .
The average information content per symbol in a discrete memoryless source is calculated using the formula H(S) = ∑pk log2(1/pk), where the summation is over all symbols in the alphabet, and pk represents the probability of each symbol. This value, known as entropy, represents the average uncertainty or the average amount of information one expects to gain upon observing one symbol from the source .
The entropy value guides the design of an efficient source encoder by setting a fundamental limit on the minimum average code word length required for encoding symbols. An encoder should aim to achieve an average code word length close to this entropy to ensure efficiency, optimizing data compression while maintaining the ability to perfectly reconstruct the original source sequence. This is fundamental in communication systems for minimizing data rate without losing information integrity .
Entropy, denoted as H(S), is a measure of the average information content per source symbol in a discrete memoryless source. It quantifies the uncertainty in the random variable and is bounded by 0 ≤ H(S) ≤ log2(K), where K is the number of symbols in the alphabet. The entropy achieves its maximum value when all symbols are equally likely, indicating maximum uncertainty, and reaches zero when one symbol has a probability of one, implying no uncertainty .
A discrete memoryless source is characterized by the statistical independence of symbols emitted across successive intervals, meaning the selection of any symbol is independent of previous ones. The information content of each symbol is determined by its probability, with less probable symbols carrying more information. This independence implies that each symbol contributes uniquely to the total information, and the collective average information is captured by the entropy of the source .
For a source code to be uniquely decodable according to the Shannon Source Coding Theorem, the encoded binary sequence must allow for the perfect reconstruction of the original source sequence. This means each code word must be distinguishable so that no part of it resembles any other complete code word, ensuring that the original data can be uniquely interpreted from the encoded form .
In a discrete memoryless source, the information conveyed by a symbol is inversely related to its probability of occurring. Specifically, the information content I(sk) of a symbol sk is given by I(sk) = log(1/pk), where pk is the probability of the symbol. This means that a symbol with lower probability conveys more information, as it introduces more uncertainty .
The condition H(S) = log2(K) represents the scenario where a discrete memoryless source achieves maximum entropy, indicating maximum uncertainty about the emitted symbols. This occurs when all symbols in the source alphabet have equal probability, reflecting the highest possible diversity and unpredictability of symbol occurrence. It signifies optimal conditions for encoding from an information-theoretical perspective .
The bounds on the entropy of a discrete memoryless source, 0 ≤ H(S) ≤ log2(K), are important because they signify the range of uncertainty and information content in the source. The lower bound of zero occurs when there is no uncertainty, i.e., one symbol occurs with certainty. The upper bound indicates maximum uncertainty, which occurs when all symbols in the alphabet are equally likely, thereby maximizing the information content per symbol .