Lecture35-37 SourceCoding

Uploaded by

Mohammed Abdul Jaleel


Source Coding

1. Source symbols are encoded in binary.
2. The average codelength must be reduced.
3. Removing redundancy reduces the bit rate.

Consider a discrete memoryless source on the alphabet $S = \{s_0, s_1, \ldots, s_{K-1}\}$. Let the corresponding probabilities be $\{p_0, p_1, \ldots, p_{K-1}\}$ and the codelengths be $\{l_0, l_1, \ldots, l_{K-1}\}$. Then the average codelength (average number of bits per symbol) of the source is defined as

$$L = \sum_{k=0}^{K-1} p_k l_k$$

If $L_{\min}$ is the minimum possible value of $L$, then the coding efficiency of the source is given by

$$\eta = \frac{L_{\min}}{L}$$

For an efficient code, $\eta$ approaches unity.

The question: What is the smallest average codelength that is possible?
The answer: Shannon's source coding theorem.

Given a discrete memoryless source of entropy $H(s)$, the average codeword length $L$ for any distortionless source encoding scheme is bounded by

$$L \geq H(s)$$

Since $H(s)$ is the fundamental limit on the average number of bits per symbol, we can set $L_{\min} = H(s)$, so that

$$\eta = \frac{H(s)}{L}$$

Data Compaction:
1. Removal of redundant information prior to transmission.
2. Lossless data compaction: no information is lost.
3. A source code which represents the output of a discrete memoryless source should be uniquely decodable.
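The definitions above can be tried out numerically. The following is a minimal sketch, assuming a hypothetical four-symbol source and a hypothetical set of codeword lengths (neither comes from the lecture); it computes the average codelength $L$, the entropy $H(s)$ and the efficiency $\eta = H(s)/L$.

```python
import math

def average_length(probs, lengths):
    """Average codelength L = sum_k p_k * l_k (bits per symbol)."""
    return sum(p * l for p, l in zip(probs, lengths))

def entropy(probs):
    """Entropy H(s) = -sum_k p_k * log2(p_k) (bits per symbol)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source and code, chosen only for illustration.
probs = [0.4, 0.3, 0.2, 0.1]
lengths = [1, 2, 3, 3]          # e.g. the codewords 0, 10, 110, 111

L = average_length(probs, lengths)
H = entropy(probs)
eta = H / L                     # efficiency, taking L_min = H(s)

print(f"L = {L:.3f} bits/symbol")
print(f"H(s) = {H:.3f} bits/symbol")
print(f"efficiency = {eta:.3f}")
```

For this assumed source the average codelength stays above the entropy, as the source coding theorem requires, and the efficiency is slightly below unity.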

Source Coding Schemes for Data Compaction

Prefix Coding
1. A prefix code is a variable-length source coding scheme in which no codeword is the prefix of any other codeword.
2. A prefix code is uniquely decodable.
3. The converse is not true: not every uniquely decodable code is a prefix code.

Table 1: Illustrating the definition of prefix code

Symbol   Probability of occurrence   Code I   Code II   Code III
s0       0.5                         0        0         0
s1       0.25                        1        10        01
s2       0.125                       00       110       011
s3       0.125                       11       111       0111

Table 1 is reproduced from [Link] book on Communication Systems.

From Table 1 we see that Code I is not a prefix code (the codeword 0 for s0 is a prefix of the codeword 00 for s2). Code II is a prefix code. Code III is also uniquely decodable but is not a prefix code (the codeword 0 for s0 is a prefix of 01 for s1). Prefix codes also satisfy the Kraft-McMillan inequality, which is given by

$$\sum_{k=0}^{K-1} 2^{-l_k} \leq 1$$
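The claims about Table 1 can be checked mechanically. The sketch below uses the three codes from Table 1; the helper names `is_prefix_code` and `kraft_sum` are hypothetical and introduced only for this illustration.

```python
# Codewords from Table 1, in the order s0, s1, s2, s3.
CODES = {
    "Code I": ["0", "1", "00", "11"],
    "Code II": ["0", "10", "110", "111"],
    "Code III": ["0", "01", "011", "0111"],
}

def is_prefix_code(words):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a) for a in words for b in words)

def kraft_sum(words):
    """Left-hand side of the Kraft-McMillan inequality: sum of 2^(-l_k)."""
    return sum(2.0 ** -len(w) for w in words)

for name, words in CODES.items():
    print(name, "prefix code:", is_prefix_code(words),
          "Kraft sum:", kraft_sum(words))
```

Code I violates the inequality (its sum is 1.5), Code II meets it with equality, and Code III satisfies it (0.9375) even though it is not a prefix code. The inequality constrains only the codeword lengths, so satisfying it does not by itself make a given code a prefix code.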

The Kraft-McMillan inequality can be understood by mapping the codewords onto a binary tree, as shown in Figure 1.
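Decoding with such a tree amounts to starting at the initial state, following one branch per received bit, and emitting a symbol whenever a terminal state is reached. A minimal sketch for Code II of Table 1 follows; the function name `decode_prefix` and the test bit string are assumptions made for illustration.

```python
# Code II from Table 1, stored as codeword -> symbol.
CODE_II = {"0": "s0", "10": "s1", "110": "s2", "111": "s3"}

def decode_prefix(bits, codebook):
    """Decode a bit string with a prefix code, one bit at a time.

    Because no codeword is a prefix of another, the first match is
    always the correct one, so no look-ahead is needed.
    """
    symbols, current = [], ""
    for b in bits:
        current += b
        if current in codebook:      # reached a terminal state of the tree
            symbols.append(codebook[current])
            current = ""             # return to the initial state
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return symbols

decoded = decode_prefix("1011111000", CODE_II)
print(decoded)  # ['s1', 's3', 's2', 's0', 's0']
```

The decoder never has to wait for later bits before deciding on a symbol, which is why prefix codes are also described as instantaneous.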

Figure 1: Decision tree for Code II

Given a discrete memoryless source of entropy $H(s)$, a prefix code can be constructed with an average codeword length $L$ which is bounded as follows:

$$H(s) \leq L < H(s) + 1 \qquad (1)$$
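One way to see that the bound in Eq. 1 is achievable is to choose the lengths $l_k = \lceil -\log_2 p_k \rceil$. This choice (often called Shannon code lengths) is an assumption made here for illustration and is not stated in the lecture, and the probabilities below are hypothetical. The sketch checks that these lengths satisfy the Kraft-McMillan inequality, so a prefix code with these lengths exists, and that the resulting $L$ lies within the bound.

```python
import math

# Hypothetical source probabilities, chosen only for illustration.
probs = [0.4, 0.3, 0.2, 0.1]

# Assumed length assignment: l_k = ceil(-log2 p_k).
lengths = [math.ceil(-math.log2(p)) for p in probs]

kraft = sum(2.0 ** -l for l in lengths)            # must be <= 1
L = sum(p * l for p, l in zip(probs, lengths))     # average codeword length
H = -sum(p * math.log2(p) for p in probs)          # source entropy

print("lengths:", lengths)
print(f"Kraft sum = {kraft}")
print(f"H(s) = {H:.3f}, L = {L:.3f}, H(s) + 1 = {H + 1:.3f}")
```

Rounding each $-\log_2 p_k$ up costs less than one bit per symbol, which is where the term $+1$ on the right-hand side of Eq. 1 comes from.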

On the left-hand side of Eq. 1, equality is satisfied when every symbol $s_k$ is emitted with the probability

$$p_k = 2^{-l_k} \qquad (2)$$

where $l_k$ is the length of the codeword assigned to the symbol $s_k$. Hence, from Eq. 2, we have

$$\sum_{k=0}^{K-1} 2^{-l_k} = \sum_{k=0}^{K-1} p_k = 1 \qquad (3)$$

With this condition, the Kraft-McMillan inequality tells us that a prefix code can be constructed such that the length of the codeword assigned to source symbol $s_k$ is $-\log_2 p_k$. Therefore, the average codeword length is given by

$$L = \sum_{k=0}^{K-1} \frac{l_k}{2^{l_k}} \qquad (4)$$

and the corresponding entropy is given by

$$H(s) = \sum_{k=0}^{K-1} \frac{1}{2^{l_k}} \log_2\left(2^{l_k}\right) = \sum_{k=0}^{K-1} \frac{l_k}{2^{l_k}} \qquad (5)$$
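The probabilities in Table 1 are all negative powers of two, so they give a concrete instance of this special case. The sketch below evaluates Eq. 4 and Eq. 5 for those probabilities, with lengths $l_k = -\log_2 p_k$ (the lengths of Code II); the variable names are chosen here for illustration.

```python
import math

# Probabilities of s0..s3 from Table 1; each is a negative power of two.
probs = [0.5, 0.25, 0.125, 0.125]

# Matched lengths l_k = -log2 p_k, which are the lengths of Code II.
lengths = [round(-math.log2(p)) for p in probs]

L = sum(l / 2 ** l for l in lengths)                          # Eq. 4
H = sum((1 / 2 ** l) * math.log2(2 ** l) for l in lengths)    # Eq. 5

print("lengths:", lengths)          # [1, 2, 3, 3]
print("L =", L, " H(s) =", H)       # both 1.75 bits/symbol
```

Here the average codeword length equals the entropy exactly, so the efficiency of Code II for this source is unity.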

Hence, from Eq. 5, the equality condition on the left side of Eq. 1, $L = H(s)$, is satisfied. To prove the inequality condition we proceed as follows. Let $L_n$ denote the average codeword length of the extended prefix code. For a uniquely decodable code, $L_n$ is the smallest possible.
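A sketch of how this argument is usually completed, assuming that $L_n$ refers to a prefix code built for blocks of $n$ successive source symbols (the $n$th extension of the source, whose entropy is $nH(s)$ because the source is memoryless). Applying Eq. 1 to the extended source gives the first line, dividing by $n$ gives the second, and letting $n$ grow gives the third:

```latex
\begin{align*}
nH(s) &\leq L_n < nH(s) + 1 \\
H(s) &\leq \frac{L_n}{n} < H(s) + \frac{1}{n} \\
\lim_{n \to \infty} \frac{L_n}{n} &= H(s)
\end{align*}
```

Under these assumptions, the average number of bits per original source symbol, $L_n / n$, can be brought as close to $H(s)$ as desired by coding sufficiently long blocks.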

Huffman Coding

1. A Huffman code is a prefix code.
2. The length of the codeword for each symbol is roughly equal to the amount of information conveyed.
3. The code need not be unique (see Figure 2).

A Huffman tree is constructed as shown in Figure 2, where (a) and (b) represent two forms of Huffman tree for the same source. Both schemes have the same average length but different variances. Variance is a measure of the variability in the codeword lengths of a source code. It is defined as follows:

$$\sigma^2 = \sum_{k=0}^{K-1} p_k (l_k - L)^2 \qquad (6)$$

where $p_k$ is the probability of the $k$th symbol, $l_k$ is the codeword length of the $k$th symbol, and $L$ is the average codeword length. It is reasonable to choose the Huffman tree which gives the smaller variance, since its codeword lengths deviate less from the average.
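The usual Huffman construction repeatedly merges the two least probable entries until a single tree remains. The following is a minimal sketch of that procedure using a priority queue. The probabilities are an assumption (the lecture text does not list them), and the function name `huffman_code` is hypothetical. Because ties can be broken in more than one way, the codewords produced may differ from those in Figure 2 while giving the same average length.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code for a dict {symbol: probability}.

    Assumes at least two symbols. Returns {symbol: codeword}.
    """
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable group
        p1, _, group1 = heapq.heappop(heap)   # second least probable group
        # Merge the two groups, prepending one more bit to every codeword.
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Assumed probabilities for a five-symbol source (not given in the lecture).
probs = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}
code = huffman_code(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())

for sym in sorted(code):
    print(sym, code[sym])
print(f"average length = {avg_len:.2f}")
```

Changing how ties are broken changes the shape of the tree and therefore the variance of the codeword lengths, but not the average length.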

(a) Average length L = 2.2, variance = 0.160

Symbol   Code
s0       10
s1       00
s2       01
s3       110
s4       111

(b) Average length L = 2.2, variance = 1.036

Symbol   Code
s0       0
s1       10
s2       110
s3       1110
s4       1111

Figure 2: Huffman tree, forms (a) and (b)
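The average length and variance of the two codes can be recomputed from Eq. 6 once the symbol probabilities are known. The lecture text does not list them, so the sketch below assumes the probabilities 0.4, 0.2, 0.2, 0.1, 0.1 for s0 to s4; the helper name `mean_and_variance` is also hypothetical.

```python
# Assumed probabilities for s0..s4 (not stated in the lecture text).
PROBS = [0.4, 0.2, 0.2, 0.1, 0.1]

# Codewords for s0..s4 as listed for forms (a) and (b) of Figure 2.
CODE_A = ["10", "00", "01", "110", "111"]
CODE_B = ["0", "10", "110", "1110", "1111"]

def mean_and_variance(probs, words):
    """Average codeword length L and variance of the lengths (Eq. 6)."""
    mean = sum(p * len(w) for p, w in zip(probs, words))
    var = sum(p * (len(w) - mean) ** 2 for p, w in zip(probs, words))
    return mean, var

mean_a, var_a = mean_and_variance(PROBS, CODE_A)
mean_b, var_b = mean_and_variance(PROBS, CODE_B)
print(f"(a) L = {mean_a:.2f}, variance = {var_a:.3f}")
print(f"(b) L = {mean_b:.2f}, variance = {var_b:.3f}")
```

Under these assumed probabilities both codes give L = 2.2 and code (a) gives a variance of 0.16, matching the values quoted for Figure 2. Code (b) comes out at 1.36 rather than the quoted 1.036, so either the quoted value or the assumed probabilities may be off.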

Drawbacks:
1. Requires accurate knowledge of the source statistics.
2. Cannot exploit relationships between words, phrases, etc.
3. Does not consider the redundancy of the language.

Lempel-Ziv Coding
1. Overcomes the drawbacks of Huffman coding.
2. It is an adaptive and simple encoding scheme.
3. When applied to English text it achieves a compaction of approximately 55%, in contrast to Huffman coding, which achieves only approximately 43%.
4. Encodes patterns in the text.

The algorithm works by parsing the source data stream into segments that are the shortest subsequences not encountered previously (see Figure 3; the example is reproduced from [Link] book on Communication Systems).

Let the input sequence be 000101110010100101...
We assume that 0 and 1 are known and stored in the codebook.

Subsequences stored: 0, 1
Data to be parsed: 000101110010100101...

The shortest subsequence of the data stream encountered for the first time and not seen before is 00.

Subsequences stored: 0, 1, 00
Data to be parsed: 0101110010100101...

The second shortest subsequence not seen before is 01; accordingly, we go on to write

Subsequences stored: 0, 1, 00, 01
Data to be parsed: 01110010100101...

We continue in the manner described here until the given data stream has been completely parsed. The codebook is shown below:
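The parsing step described above can be sketched in a few lines. This is an illustration only: the function name `lz_parse` is hypothetical, and the input is the 18 bits shown explicitly in the example (the trailing dots are ignored).

```python
def lz_parse(bits, initial=("0", "1")):
    """Parse a bit string into the shortest subsequences not seen before.

    Returns the codebook (initial entries followed by each new
    subsequence) and any leftover tail that is already in the codebook.
    """
    book = list(initial)
    seen = set(book)
    phrase = ""
    for b in bits:
        phrase += b
        if phrase not in seen:       # shortest subsequence not yet stored
            book.append(phrase)
            seen.add(phrase)
            phrase = ""              # start looking for the next one
    return book, phrase

book, tail = lz_parse("000101110010100101")
print(book)   # ['0', '1', '00', '01', '011', '10', '010', '100', '101']
print(repr(tail))
```

Each new subsequence is an already stored subsequence extended by one bit, which is what allows it to be described compactly by a pointer plus that one extra bit.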

Numerical positions:         1    2    3     4     5     6     7     8     9
Subsequences:                0    1    00    01    011   10    010   100   101
Numerical representations:             11    12    42    21    41    61    62
Binary encoded blocks:                 0010  0011  1001  0100  1000  1100  1101

Figure 3: Lempel-Ziv Encoding
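The binary encoded blocks in Figure 3 are consistent with the following reading: each new subsequence is sent as the 3-bit position of its previously stored root (the subsequence without its last bit) followed by that last bit. The sketch below implements this reading; the function name `lz_encode` and the fixed pointer width of 3 bits are assumptions that fit this example and would need to grow for longer codebooks.

```python
def lz_encode(bits, pointer_bits=3):
    """Encode a bit string as fixed-length Lempel-Ziv blocks.

    Each block is the codebook position of the root subsequence,
    written with `pointer_bits` bits, followed by the one new bit.
    Positions are 1-based, with 0 and 1 pre-stored at positions 1 and 2.
    """
    position = {"0": 1, "1": 2}
    blocks = []
    phrase = ""
    for b in bits:
        phrase += b
        if phrase not in position:
            root, new_bit = phrase[:-1], phrase[-1]
            blocks.append(format(position[root], f"0{pointer_bits}b") + new_bit)
            position[phrase] = len(position) + 1   # store at the next position
            phrase = ""
    return blocks

blocks = lz_encode("000101110010100101")
print(blocks)
# ['0010', '0011', '1001', '0100', '1000', '1100', '1101']
```

For example, the subsequence 011 at position 5 has root 01 (position 4, binary 100) and new bit 1, giving the block 1001.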
