Ktu Data Compression Techniques Notes

Data compression reduces the number of bits needed to represent information, improving storage efficiency and transmission speed. It is categorized into lossless and lossy techniques, with lossless allowing exact data reconstruction and lossy removing less important information for higher compression. Key compression methods include Huffman coding, JPEG for images, MPEG for videos, and MP3 for audio, with applications in multimedia streaming, image storage, and mobile communication.

Uploaded by

aniabc2004
Copyright
© All Rights Reserved

KTU S8 CSE – Data Compression Techniques

(CST446)

Quick Revision Notes – All Modules

Data compression is the process of reducing the number of bits required to represent information. It helps
reduce storage space and transmission bandwidth. Compression techniques are classified into lossless
and lossy compression.
Module 1 – Fundamentals of Data Compression
• Data compression reduces redundancy in data representation.

• Compression improves storage efficiency and transmission speed.

• Two major categories: Lossless compression and Lossy compression.

• Lossless compression allows exact reconstruction of the original data.

• Lossy compression removes less important information for higher compression.

• Entropy gives the theoretical lower bound on the average number of bits per symbol achievable by lossless compression.

• Information theory forms the basis of many compression algorithms.

Entropy Formula:
H(X) = − Σ P(x) log₂ P(x)

Where P(x) is the probability of occurrence of symbol x and the sum runs over all symbols of the source.
Entropy measures the average information content per symbol, in bits.
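As a rough illustration of the formula, the sketch below estimates entropy from the symbol frequencies of a short message. The function name `entropy_bits_per_symbol` is hypothetical, and using observed frequencies as P(x) is an assumption that only holds approximately for short inputs.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(message):
    """Empirical Shannon entropy: H(X) = -sum of P(x) * log2 P(x)."""
    counts = Counter(message)
    total = len(message)
    # P(x) is estimated as (occurrences of x) / (message length).
    probabilities = [count / total for count in counts.values()]
    return sum(-p * math.log2(p) for p in probabilities)

# Two equally likely symbols need one bit each on average.
print(entropy_bits_per_symbol("aabb"))
# A message with a single repeated symbol carries no information per symbol.
print(entropy_bits_per_symbol("aaaa"))
```

A skewed distribution falls between these two extremes, which is the redundancy that the lossless coders of Module 2 exploit.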
Module 2 – Lossless Compression Techniques
• Huffman Coding: Variable-length coding technique based on symbol frequencies.

• Arithmetic Coding: Encodes the entire message as a single fraction in the interval [0, 1).

• LZ77 Algorithm: Uses a sliding window and replaces repeated patterns with references to earlier data in the window.

• LZ78 Algorithm: Builds a dictionary dynamically during compression.

• LZW Algorithm: A refinement of the LZ78 dictionary method; used in GIF images.

• Run Length Encoding (RLE): Replaces each run of repeated symbols with a count and the symbol.
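To make the first technique in the list concrete, here is a minimal Huffman coding sketch. The helper names (`huffman_codes`, `encode`, `decode`) are hypothetical, the bit stream is kept as a Python string of "0" and "1" for readability, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter

def huffman_codes(message):
    """Return a dict mapping each symbol to its Huffman bit string."""
    freq = Counter(message)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(message, codes):
    return "".join(codes[s] for s in message)

def decode(bits, codes):
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:   # codes are prefix-free, so the first match is correct
            out.append(inverse[current])
            current = ""
    return "".join(out)

message = "abracadabra"
codes = huffman_codes(message)
bits = encode(message, codes)
print(codes)
print(len(bits), "bits instead of", 8 * len(message))
print(decode(bits, codes) == message)
```

The most frequent symbol ("a" here) receives the shortest code, and decoding recovers the message exactly, which is what makes the scheme lossless.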

Advantages of Lossless Compression:


• No loss of information.

• Used for text, executable files, and medical images.

• Allows exact reconstruction of original data.
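Exact reconstruction can be checked with a round trip. The sketch below uses Run Length Encoding because it is the shortest technique to write down; the function names are hypothetical and the (count, symbol) tuple representation is an assumption, not a file format.

```python
def rle_encode(data):
    """Replace each run of repeated symbols with a (count, symbol) pair."""
    runs = []
    for symbol in data:
        if runs and runs[-1][1] == symbol:
            runs[-1] = (runs[-1][0] + 1, symbol)   # extend the current run
        else:
            runs.append((1, symbol))               # start a new run
    return runs

def rle_decode(runs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in runs)

original = "AAAABBBCCD"
encoded = rle_encode(original)
print(encoded)                          # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
print(rle_decode(encoded) == original)  # True
```

RLE only helps when the data contains long runs; on data without repeats the list of pairs is larger than the input.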


Module 3 – Image Compression
• Images contain spatial redundancy (neighbouring pixels tend to have similar values), which compression exploits.

• JPEG is the most widely used lossy compression standard for photographic images.

• JPEG uses Discrete Cosine Transform (DCT).

• Compression steps: Color conversion → DCT → Quantization → Entropy coding.

• JPEG-LS provides lossless or near-lossless compression.

• PNG uses lossless compression.

JPEG Compression Steps:


• Divide image into 8×8 blocks.

• Apply DCT to transform spatial domain into frequency domain.

• Quantize coefficients to reduce precision.

• Apply entropy coding such as Huffman coding.
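The DCT and quantization steps can be sketched on a single 8×8 block as follows. This is an illustration under stated assumptions, not a JPEG codec: the function names are hypothetical, one uniform step size of 16 stands in for JPEG's 8×8 quantization table, and block splitting of a full image and the entropy coding stage are omitted.

```python
import math

N = 8  # block size used by JPEG

def _scale(k):
    # Scaling factor of the orthonormal DCT-II.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))

def dct2(block):
    """2-D DCT-II of an N x N block: spatial domain to frequency domain."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse transform: frequency domain back to spatial domain."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coeffs, step):
    # Dividing and rounding discards precision; this is the lossy step.
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]

# A smooth gradient block, level-shifted by 128 as for 8-bit samples.
block = [[(100 + 2 * x + 3 * y) - 128 for y in range(N)] for x in range(N)]
levels = quantize(dct2(block), step=16)
nonzero = sum(1 for row in levels for q in row if q != 0)
restored = idct2(dequantize(levels, step=16))
max_error = max(abs(restored[x][y] - block[x][y]) for x in range(N) for y in range(N))
print("nonzero coefficients:", nonzero, "of", N * N)
print("max reconstruction error:", round(max_error, 2))
```

For a smooth block, most quantized coefficients are zero, and it is those long runs of zeros that the final entropy coding stage compresses well.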


Module 4 – Video Compression
• Video compression removes spatial and temporal redundancy.

• MPEG standards are widely used for video compression.

• Standards include MPEG-1, MPEG-2, MPEG-4 and H.264 (also known as MPEG-4 Part 10 or AVC).

• Motion estimation finds similar blocks between frames.

• Motion compensation predicts a frame from previously coded reference frames.

• Frame types: I-frame (intra), P-frame (predicted), B-frame (bidirectional).

Motion Compensation Concept:


Instead of transmitting full frames, the encoder sends motion vectors and residual errors between frames
to reduce redundancy.
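The idea can be sketched with block matching on two tiny grayscale frames. Everything named here is hypothetical: the notes do not specify a search strategy or matching criterion, so an exhaustive search with the sum of absolute differences (SAD) is assumed, and frames are plain lists of rows.

```python
def make_frame(width, height, square_x, square_y):
    """Black frame with a bright 2x2 square whose top-left corner is (square_x, square_y)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(square_y, square_y + 2):
        for x in range(square_x, square_x + 2):
            frame[y][x] = 200
    return frame

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Exhaustive search for the displacement (dx, dy) into ref with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width and
                      0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue   # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost

def residual(cur, ref, bx, by, mv, size=4):
    """Difference between the current block and its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(size)] for y in range(size)]

reference = make_frame(8, 8, 3, 3)   # previous frame
current = make_frame(8, 8, 4, 3)     # same square moved one pixel to the right
mv, cost = best_motion_vector(current, reference, bx=2, by=2)
print("motion vector:", mv, "SAD:", cost)
print(residual(current, reference, 2, 2, mv))
```

Here the vector points one pixel to the left in the reference frame and the residual is all zeros, so the block is described by two small integers instead of sixteen pixel values.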
Module 5 – Audio Compression
• Audio compression reduces data required for storing sound.

• Characteristics of human hearing are used to identify and remove inaudible sounds.

• MP3 is one of the most popular audio compression standards.

• Lossy audio compression uses psychoacoustic models, for example to detect sounds masked by louder ones.

• Lossless audio compression formats include FLAC and ALAC.

Applications of Data Compression:


• Multimedia streaming (YouTube, Netflix).

• Image storage and transmission.

• Video conferencing.

• Mobile communication systems.

• Cloud storage optimization.

Common questions


MPEG standards in video compression are integral to modern digital broadcasting and online streaming due to their ability to efficiently compress video data while maintaining quality and supporting a wide range of resolutions and bitrates. These standards, including MPEG-1, MPEG-2, MPEG-4, and H.264, use techniques like motion compensation and spatial-temporal redundancy removal to produce compressed video files that are manageable in size yet high in quality, facilitating smooth transmission over bandwidth-constrained networks. This versatility allows MPEG standards to meet diverse demands from high-definition television broadcasting to mobile device streaming, optimizing content delivery for platforms like YouTube and Netflix.

Motion compensation in video compression enhances efficiency by reducing the amount of data that needs to be encoded and transmitted for video playback. Instead of encoding each frame independently (which would result in significant redundancy, because consecutive frames usually change only gradually), motion compensation predicts frames by analyzing movement between them. It encodes only the motion vectors and the residuals, that is, the differences between the current frame and its prediction, thus significantly reducing redundant data. This results in decreased file sizes and improved transmission efficiency with little loss of perceptual video quality, making the process much more efficient than encoding every frame independently.

Huffman Coding optimizes data compression by assigning variable-length codes to input characters. More frequently occurring characters are assigned shorter codes, while less frequent characters are given longer codes. The resulting average code length is within one bit per symbol of the theoretical limit set by the data's entropy, thus optimizing storage and transmission efficiency. Because this method allows for exact reconstruction of the original data from the compressed version, it is classified under lossless compression techniques, ensuring no loss of information during the process.

Psychoacoustic models in audio compression exploit the characteristics of human hearing, which is not uniformly sensitive to all frequencies and amplitude levels. By identifying and eliminating sounds that are masked by louder tones, or that are inaudible because they lie outside the frequency range of human perception, these models enable more efficient file size reduction. Essentially, psychoacoustic models remove data that contributes little to perceived sound quality, thereby drastically reducing file size without perceptibly affecting audio quality, a technique prominently used in popular audio compression methods like MP3.

Entropy defines the theoretical limit of how much data can be compressed without losing information. According to information theory, entropy represents the average amount of information produced by a stochastic source of data, and serves as a lower bound for lossless compression. This means that the more random or less predictable the data, the higher the entropy and the less compression possible; conversely, more predictable data has lower entropy, allowing greater compression.

Data compression techniques benefit cloud storage optimization by reducing the amount of data that needs to be stored and transferred across networks, thereby saving costs and improving efficiency. By compressing data before it is uploaded to the cloud, businesses can significantly decrease storage requirements and shorten transfer times, which is critical for cost management and operational performance. This leads to better resource allocation, as compressed data occupies less physical storage space and requires less bandwidth for data migrations and access. Consequently, compression is crucial for scalable and economical cloud storage strategies.

The Discrete Cosine Transform (DCT) is crucial in JPEG image compression as it transforms the image from the spatial domain to the frequency domain, concentrating most of the image's significant visual information into a few low-frequency components. Applied to 8×8 blocks of pixels, the DCT makes it easier to identify and compress redundant information. This enables the quantization step, in which the precision of the less important high-frequency components can be reduced significantly without greatly affecting image quality, thus reducing the number of bits required for storage.

The LZ77 algorithm uses a sliding window technique to compress data by replacing repeated patterns with references to earlier occurrences within a fixed-size window, thereby implicitly using a "dictionary" that consists of previously seen data within the window. In contrast, the LZW algorithm explicitly builds a dictionary dynamically during the compression process, starting with individual symbols as the initial dictionary, and adding a new entry each time it meets a string that is not yet in the dictionary. While both techniques aim at reducing redundancy, LZ77 works with a more implicit and temporary memory of previously seen data, whereas LZW constructs a more permanent and evolving dictionary during the compression process.
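The explicit dictionary growth described for LZW can be sketched as follows. This is a minimal illustration with hypothetical function names: it assumes input characters with code points below 256, emits a list of integer codes instead of packed bits, and never resets or limits the dictionary as practical implementations do.

```python
def lzw_compress(text):
    """Emit a list of dictionary codes; the dictionary starts with all single characters."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                  # keep extending the match
        else:
            output.append(dictionary[current])   # emit the longest known string
            dictionary[candidate] = next_code    # learn the new string
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

def lzw_decompress(codes):
    """Rebuild the same dictionary while decoding, so it is never transmitted."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            entry = previous + previous[0]       # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code: %d" % code)
        result.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)

codes = lzw_compress("ABABABA")
print(codes)                  # [65, 66, 256, 258]
print(lzw_decompress(codes))  # ABABABA
```

Codes 256 and above stand for multi-character strings learned during compression, which is where the saving comes from on repetitive input.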

Lossless compression is particularly advantageous for medical images and data-sensitive applications because it ensures data integrity by allowing the exact reconstruction of the original file. This characteristic is crucial for medical imaging, where undistorted data can make the difference in a diagnosis, as any loss of information might affect interpretation and clinical decisions. It also benefits other sensitive applications by preserving the accuracy and reliability of data, particularly where legal compliance or data authenticity is mandated. Therefore, lossless compression supports both storage efficiency and the critical need for data precision in such environments.

Lossless compression techniques allow for the exact reconstruction of the original data without any information loss, making them ideal for applications where data integrity is critical, such as text files, executables, and medical images. These methods reduce redundancy in data representation without sacrificing any original details. In contrast, lossy compression techniques achieve higher compression ratios by removing less important information, resulting in some loss of data fidelity. This makes them suitable for applications like audio and video where perfect fidelity is not required, but not suitable for contexts where precise data representation is required. Consequently, lossless compression maintains data integrity, while lossy achieves greater compression at the cost of data quality.
