LOSSY AND LOSSLESS COMPRESSION TECHNIQUES
Data compression is a function of the presentation layer in the OSI reference model.
Compression is often used to maximize the use of bandwidth across a network or to optimize
disk space when saving data.
There are two general types of compression algorithms:
1. Lossless compression
2. Lossy compression
Lossless Compression
Lossless compression compresses the data in such a way that when data is
decompressed it is exactly the same as it was before compression i.e. there is no loss of data.
Lossless compression is used to compress file data such as executable code, text
files, and numeric data, because programs that process such file data cannot tolerate mistakes
in the data.
Lossless compression typically does not compress a file as much as lossy compression
techniques do, and it may take more processing power to accomplish the compression.
Lossless Compression Algorithms
The various algorithms used to implement lossless data compression are :
1. Run length encoding
2. Differential pulse code modulation
3. Dictionary based encoding
1. Run length encoding
• This method replaces the consecutive occurrences of a given symbol with only one copy of
the symbol along with a count of how many times that symbol occurs. Hence the name 'run
length'.
• For example, the string AAABBCDDDD would be encoded as 3A2B1C4D.
• A real life example where run-length encoding is quite effective is the fax machine. Most
faxes are white sheets with the occasional black text. So, a run-length encoding scheme can
take each line and transmit a code for white, then the number of pixels, then the code for black
and the number of pixels, and so on.
• This method of compression must be used carefully. If there is not a lot of repetition in the
data then it is possible the run length encoding scheme would actually increase the size of a
file.
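The scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a production codec: the function names are made up for this example, and the decoder assumes the symbols themselves are never digits.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated symbol with '<count><symbol>'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Invert rle_encode; assumes the symbols themselves are not digits."""
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # accumulate multi-digit run lengths
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


print(rle_encode("AAABBCDDDD"))  # 3A2B1C4D
print(rle_encode("ABCD"))        # 1A1B1C1D
```

The second call shows the caveat from the last bullet: with no repetition, the encoded string is twice as long as the input.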
2. Differential pulse code modulation
• In this method first a reference symbol is placed. Then for each symbol in the data, we place
the difference between that symbol and the reference symbol used.
• For example, using symbol A as the reference symbol, the string AAABBCDDDD would be
encoded as A0001123333, since A is the same as the reference symbol, B has a difference of 1
from the reference symbol, and so on.
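A small sketch of this reference-symbol scheme follows. The function names are illustrative, and the code assumes every difference fits in a single decimal digit (0 to 9), which holds for the example string but not for arbitrary data.

```python
def diff_encode(text: str, reference: str = "A") -> str:
    """Emit the reference symbol, then each symbol's distance from it.

    Assumes every difference is between 0 and 9 so it fits in one digit.
    """
    return reference + "".join(str(ord(ch) - ord(reference)) for ch in text)


def diff_decode(encoded: str) -> str:
    """Rebuild the text from the reference symbol and the digit differences."""
    reference = encoded[0]
    return "".join(chr(ord(reference) + int(d)) for d in encoded[1:])


print(diff_encode("AAABBCDDDD"))   # A0001123333
print(diff_decode("A0001123333"))  # AAABBCDDDD
```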
3. Dictionary based encoding
• One of the best known dictionary based encoding algorithms is the Lempel-Ziv (LZ)
compression algorithm.
• This method is also known as a substitution coder.
• In this method, a dictionary (table) of variable length strings (common phrases) is built.
• This dictionary contains almost every string that is expected to occur in data.
• When any of these strings occur in the data, then they are replaced with the corresponding
index to the dictionary.
• In this method, instead of working with individual characters in text data, we treat each
word as a string and output the index in the dictionary for that word.
• For example, let us say that the word "compression" has the index 4978 in one particular
dictionary; it is the 4978th word in /usr/share/dict/words. To compress a body of text, each time
the string "compression" appears, it would be replaced by 4978.
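The word-for-index substitution described above might look like the sketch below. It uses a fixed dictionary that sender and receiver are assumed to share; it is not the adaptive Lempel-Ziv algorithm, which builds its dictionary from the data as it goes. The word list and function names are hypothetical.

```python
def build_dictionary(words):
    """Assign each distinct word an index in first-seen order."""
    index = {}
    for word in words:
        index.setdefault(word, len(index))
    return index


def compress(text, index):
    """Replace every word in the text with its dictionary index."""
    return [index[word] for word in text.split()]


def decompress(codes, index):
    """Map indices back to words using the same shared dictionary."""
    by_code = {code: word for word, code in index.items()}
    return " ".join(by_code[code] for code in codes)


# Hypothetical shared word list standing in for a real dictionary file.
shared = build_dictionary(["lossless", "compression", "keeps", "data", "intact"])
codes = compress("compression keeps data", shared)
print(codes)                      # [1, 2, 3]
print(decompress(codes, shared))  # compression keeps data
```

A word that is missing from the shared dictionary raises a KeyError here; a real scheme needs an escape mechanism for such words.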
Lossy Compression
Lossy compression does not promise that the data received is exactly
the same as the data sent, i.e. some data may be lost.
This is because a lossy algorithm removes information that it cannot later restore.
Lossy algorithms are used to compress still images, video and audio.
Lossy algorithms typically achieve much better compression ratios than the lossless
algorithms.
Audio Compression
• Audio compression is used for speech or music.
• For speech, we need to compress a 64-kbps digitized signal; for music, we need to
compress a 1.411-Mbps signal.
• Two types of techniques are used for audio compression:
1. Predictive encoding
2. Perceptual encoding
Predictive encoding
• In predictive encoding, the differences between the samples are encoded instead of
encoding all the sampled values.
• This type of compression is normally used for speech.
• Several standards have been defined, such as GSM (13 kbps), G.729 (8 kbps), and G.723.1
(6.3 or 5.3 kbps).
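The core idea of encoding differences between samples can be sketched as plain delta coding. This is only an illustration of the principle: real speech codecs such as those listed above use far more elaborate predictors and quantize the differences, which is where the loss comes from. The sample values are made up.

```python
def delta_encode(samples):
    """Keep the first sample, then store each sample's difference from the previous one."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


samples = [100, 102, 105, 105, 103]   # hypothetical speech samples
print(delta_encode(samples))          # [100, 2, 3, 0, -2]
print(delta_decode([100, 2, 3, 0, -2]))  # [100, 102, 105, 105, 103]
```

Because neighbouring samples are usually close in value, the differences are small numbers that need fewer bits than the samples themselves.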
Perceptual encoding
• Perceptual encoding is used to create CD-quality audio, which in uncompressed form requires a
transmission bandwidth of 1.411 Mbps.
• MP3 (MPEG audio layer 3), a part of the MPEG standard, uses perceptual encoding.
• Perceptual encoding is based on the science of psychoacoustics, a study of how people
perceive sound.
• The perceptual encoding exploits certain flaws in the human auditory system to encode a
signal in such a way that it sounds the same to a human listener, even if it looks quite
different on an oscilloscope.
• The key property of perceptual coding is that some sounds can mask other sounds. For
example, imagine that you are broadcasting a live flute concert and all of a sudden someone
starts striking a hammer on a metal sheet. You will not be able to hear the flute any more. Its
sound has been masked by the hammer.
• The effect described above is called frequency masking: the ability of a loud sound in
one frequency band to hide a softer sound in another frequency band that would have been
audible in the absence of the loud sound.
• Masking can also occur over time. For example, even after the hammer stops
striking the metal sheet, the flute will be inaudible for a short period of time because the ears
turn down their gain when the hammering starts and take a finite time to turn it up again.
• Thus, a loud sound can numb our ears for a short time even after the sound has stopped.
This effect is called temporal masking.
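The 1.411 Mbps figure quoted for CD-quality audio can be reproduced from the standard CD parameters. Those parameters (44,100 samples per second, 16 bits per sample, two channels) are not stated in the text and are supplied here as an assumption.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed CD audio parameters (not given in the text above).
cd_rate = pcm_bit_rate(sample_rate_hz=44_100, bits_per_sample=16, channels=2)
print(cd_rate)                       # 1411200
print(f"{cd_rate / 1e6:.3f} Mbps")   # 1.411 Mbps
```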
MP3
• MP3 uses these two phenomena, i.e. frequency masking and temporal masking to compress
audio signals.
• In such a system, the technique analyzes and divides the spectrum into several groups. Zero
bits are allocated to the frequency ranges that are totally masked.
• A small number of bits are allocated to the frequency ranges that are partially masked.
• A larger number of bits is allocated to the frequency ranges that are not masked.
• Based on the range of frequencies in the original analog audio, MP3 produces three data
rates: 96 kbps, 128 kbps and 160 kbps.
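The bit allocation rule in the bullets above can be pictured with a toy sketch. Only the ordering (zero bits, a small number, a larger number) comes from the text; the specific budgets of 4 and 12 bits and all names are hypothetical, and a real MP3 encoder derives its allocation from a psychoacoustic model rather than from three fixed labels.

```python
from enum import Enum


class Masking(Enum):
    TOTAL = "totally masked"
    PARTIAL = "partially masked"
    NONE = "not masked"


# Hypothetical bit budgets; only the ordering 0 < small < large comes from the text.
BITS_PER_BAND = {Masking.TOTAL: 0, Masking.PARTIAL: 4, Masking.NONE: 12}


def allocate_bits(bands):
    """Return the number of bits assigned to each frequency band."""
    return [BITS_PER_BAND[masking] for masking in bands]


bands = [Masking.NONE, Masking.TOTAL, Masking.PARTIAL, Masking.NONE]
print(allocate_bits(bands))       # [12, 0, 4, 12]
print(sum(allocate_bits(bands)))  # 28
```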
Hardware for Multimedia
1 Input and Output Devices
o 1.1 Key devices for multimedia output
o 1.2 Monitors
o 1.3 Speakers and MIDI interfaces
o 1.4 Alphanumeric keyboards and optical character recognition
o 1.5 Digital cameras and scanners
o 1.6 Video Camera and Frame Grabbers
o 1.7 Microphones and MIDI keyboards
o 1.8 Mice, Trackballs, Joy sticks, Drawing tablets
o 1.9 CD-ROMs and Video Disks
Input and Output Devices
Key devices for multimedia output
Monitors for text and graphics (still and motion)
Speakers and MIDI interfaces for sound
Specialized helmets and immersive displays for virtual reality
Key devices for multimedia input
Keyboard and OCR for text
Digital cameras, scanners, and CD-ROMs for graphics
MIDI keyboards, CD-ROMs and microphones for sound
Video cameras, CD-ROMs, and frame grabbers for video
Mice, trackballs, joy sticks, virtual reality gloves and wands, for spatial data
Modems and network interfaces for network data
Monitors
Most important output device
Provides all the visual output to the user
Should be designed for the highest quality image, with least distortion
Large vacuum tube with electron gun at one end aimed at a large surface
(viewing screen) on the other end
Viewing screen is coated with chemicals that glow with different colors; three
different phosphors are used for color screens
Source of electron beam is electrically negative pole or cathode (hence the
name Cathode Ray Tube, or CRT)
Two different sets of colors are used in monitors, RGB and CMY, with either set
capable of the full color spectrum
Electron beam strikes the screen many times per second
Phosphors are re-excited at each electron strike for a brief instant
Refresh rate, measured in Hz
Preferred refresh rate is 75 Hz or more
Electron beam sweeps across the screen in a regular pattern
Required to refresh phosphors frequently and equally
Raster scan pattern
Beam is always on when going from left to right (trace), and turned off to
go from right to left (retrace)
Three separate electron beams for three colors, for better focus and higher
refresh rates
Screen divided into individual picture elements, or pixels
Each pixel is made of its own phosphor elements to give the color
Memory chip contains a map of what colors to display on each pixel
Bit map
Mostly used in context of binary images (black or white)
One bit per pixel to indicate whether pixel is black or white
Color maps, or pixmap
One byte for each color for every pixel (24-bit color)
Image changed in the memory map associated with screen
For realistic motion images and for a flicker-free screen, the bit-map must
be modified faster than the eye can perceive (30 frames/sec)
For a 640 × 480 screen, number of bits is: 640 × 480 × 24 = 7,372,800
To refresh the screen 30 times per second, the number of bits
transferred in a second is: 640 × 480 × 24 × 30 = 221,184,000, or about 221 Mb
Larger screen requires more data to be transferred
Transfer rate limitation can be overcome by using hardware accelerator
board to perform certain graphic display functions in hardware
Full-screen performance at 30 images per second may not be possible
even with a graphics accelerator board
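The memory-map arithmetic above is easy to check in code. This short sketch recomputes the figures for a 640 × 480 screen; the function names are illustrative.

```python
def frame_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Number of bits needed to hold one full screen image."""
    return width * height * bits_per_pixel


def refresh_bits_per_second(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Number of bits that must be transferred each second at the given frame rate."""
    return frame_bits(width, height, bits_per_pixel) * fps


print(frame_bits(640, 480, 1))                    # 307200 (one-bit black/white bit map)
print(frame_bits(640, 480, 24))                   # 7372800 (24-bit color)
print(refresh_bits_per_second(640, 480, 24, 30))  # 221184000
```

Doubling both screen dimensions multiplies every figure by four, which is why larger screens strain the transfer rate.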
Physical size of monitor
Important factor in the quality of multimedia presentation
Typically between 11 and 20 inches on diagonal
Another important factor is the number of pixels per inch
Too few pixels make the image look grainy
For best quality images, pixels should not be wider than 0.01 inches (0.28 mm) in diameter
Latter quantity is used for marketing the monitors (0.25 mm dot pitch)
Graphics display board
Used in addition to monitor to speed up graphics
Special hardware circuits for 2D and 3D graphics
Simple graphics boards just translate image data from RAM into a form
usable by the monitor
Complex boards can even speed up the refresh rate of screen
Qualities of a good multimedia monitor
Size, refresh rate, dot pitch
Other concerns about monitor include weight and ambient light
Liquid crystal display monitors
Flat screen displays
Crystals allow more or less light to pass through them, depending upon
the strength of an electric field
Often unsuitable for multimedia presentation, where the viewing angle is
extremely important and these displays can be viewed well only from a narrow angle
3D monitors in the future
Human factor concerns
Speakers and MIDI interfaces
Production of sound
1. Digitized representation of frequency and sound transmitted at appropriate time to the
loudspeaker (.WAV files): the common method
2. Commands for sound synthesis can be transmitted to a synthesizer at appropriate time
(MIDI files): used for the generation of music
Musical Instrument Digital Interface (MIDI)
Standard to permit interface for both hardware and control logic
between computers and music synthesizers
Adopted in 1982
Consists of two parts
1. Hardware standard
Specifies cables, circuits, connectors, and electrical signals to be used
2. Message standard
Types and formats of messages to be transmitted to/from synthesizers, control units
(keyboards), and computers
Messages consist of a device number, a control segment to tell the device the function
to be performed (turn on/off a specified circuit), and a data segment to provide the
information necessary for the action (volume of sound, or frequency of basic sound)
An entire piece of music can be described by a sequence of MIDI
messages
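As a concrete illustration of such messages, the sketch below builds the two most familiar MIDI channel messages, Note On and Note Off. In these messages the status byte carries the message type and the channel number, and two data bytes carry the note number and the velocity (loudness). The helper names and the short melody are made up for this example, and timing between messages is left out.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte Note On message: status byte, note number, velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """Build a 3-byte Note Off message with release velocity 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("channel must be 0-15; note must be 0-127")
    return bytes([0x80 | channel, note, 0])


# A hypothetical three-note melody on channel 0: C, E, G (notes 60, 64, 67).
melody = []
for note in (60, 64, 67):
    melody.append(note_on(0, note, 100))
    melody.append(note_off(0, note))

print(note_on(0, 60, 100).hex())  # 903c64
print(note_off(0, 60).hex())      # 803c00
print(len(melody))                # 6
```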
MIDI interface
Required in the computer to communicate with MIDI instruments
Circuit board to translate the signals
Alphanumeric keyboards and optical character recognition
Used for textual input
Pressing a key on a keyboard closes a circuit corresponding to the key to send
a unique code to the CPU
Printed text can be input using OCR software
OCR software analyzes an image to translate symbols into character
codes
Systematically checks the entire page, searching for patterns of dark
and light recognizable as alphabetic, numeric, or punctuation characters
Chooses the best match from a set of known patterns
Quality of scanned page as well as output
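The "best match from a set of known patterns" step can be sketched as simple template matching. This toy example uses 3 × 3 binary glyphs (1 for dark, 0 for light) and picks the template that differs from the scanned glyph in the fewest pixels. The templates and names are hypothetical, and real OCR software is considerably more sophisticated.

```python
# Hypothetical 3x3 templates: '1' is a dark pixel, '0' is a light pixel.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def distance(a, b) -> int:
    """Count the pixels in which two equally sized glyphs differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize(glyph) -> str:
    """Return the character whose template is closest to the scanned glyph."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))


noisy_l = ("100", "100", "110")   # an 'L' with one pixel lost in scanning
print(recognize(noisy_l))         # L
```

A poor-quality scan flips more pixels, which brings the distances to different templates closer together and makes wrong matches more likely.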
Digital cameras and scanners
Real image and Digital image (Representation of real image in terms of pixels)
Still image
Snapshot of an instant
Motion image
Sequence of images giving the impression of continuous motion
Graininess in real images (individual dots observed when a photograph taken by a
conventional camera is enlarged)
Digital image capture
Light is focused on photosensitive cells to produce electric current in response to
intensity and wavelength of light
Electric current is scanned for each point on the image and translated to binary codes
Codes correspond to pixel values and can be used to rebuild the original picture
Scanners scan an image from one end to the other
Scanning mechanism shines bright light on the image and codes and records the
reflected light for each point
Scanner does not store data but sends it to the computer, possibly after compression of
the same
Quality of images
Depends on the quality of optics and sharpness of focus
Perceived by sharpness of resulting image
Accuracy of encoding for each pixel depends on the precision of photosensitive cells
Resolution of scanner/camera (number of dots/inch)
Amount of storage available
Preferable to scan at the highest possible resolution under given hardware and storage
space constraints to get the most detail in the original image
Video Camera and Frame Grabbers
Standard video camera contains photosensitive cells, scanning one frame after another.
Output of the cells gets recorded as an analog stream of colors, or sent to digitizing circuitry to
generate a stream of digital codes
Video input card
Required for use of video camera to input video stream into computer
Digitizes the analog signal from camera
Output can be sent to a file for storage, CPU for processing, or monitor for
display (or all of them)
Frame grabber
Allows the capture of a single frame of data from video stream
Not as good resolution as a still camera
Typical frame grabbers process 30 frames per second for real time
performance
Microphones and MIDI keyboards
These are used to input original sounds (analog)
Microphone has a diaphragm that vibrates in response to sound waves
Vibrations modulate a continuous electric current analogous to sound waves
Modulated current can be digitized and stored in a standardized format for audio data,
such as a .WAV file
Microphone plugs into a sound input board
Developer can control the sampling rate for digitizing
Higher sampling rate gives better fidelity but requires more space
Sampling rate for music: 20,000 Hz
Sampling rate for speech: 10,000 Hz
Digital audio files can be edited (cut and paste) using audio software such as
Cool Edit or Audacity
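The trade-off between sampling rate and storage space can be estimated with a short calculation. The sampling rates are the ones listed above; the sample size (16 bits, i.e. 2 bytes) and the single channel are assumptions made for this example.

```python
def pcm_bytes(sample_rate_hz: int, seconds: float,
              bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Storage needed for uncompressed audio (assumes 16-bit mono by default)."""
    return int(sample_rate_hz * seconds * bytes_per_sample * channels)


print(pcm_bytes(20_000, 60))  # 2400000 bytes for one minute of music
print(pcm_bytes(10_000, 60))  # 1200000 bytes for one minute of speech
```

Halving the sampling rate halves the storage requirement, at the cost of fidelity.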
Mice, Trackballs, Joy sticks, Drawing tablets
These are used to enter positional information as 2D or 3D data from a standard reference
point
Latitude, longitude, altitude
Common to define a point on the computer screen
Mouse defines the movement in terms of two numbers, left/right and up/down on
the screen, with respect to one corner
Movement of mouse is tracked by software, which can also set the tracking speed
Trackball works the same way as the mouse
A joystick is a trackball with a handle
Pressing the button of mouse/trackball/joystick sends a signal to the computer asking
it to perform some function
Multimedia software should be able to determine the positional information as well as
the signal context (mouse press)
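How software might turn relative mouse movements into a screen position can be sketched as follows. The class, its field names, the 640 × 480 screen size, and the choice of the top-left corner as the reference point are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    """Screen position measured from the top-left corner (an assumed origin)."""
    x: int = 0
    y: int = 0
    width: int = 640
    height: int = 480
    speed: float = 1.0  # tracking speed multiplier set by software

    def move(self, dx: int, dy: int) -> tuple[int, int]:
        """Apply a relative movement, scaled by speed and clamped to the screen."""
        self.x = min(max(self.x + round(dx * self.speed), 0), self.width - 1)
        self.y = min(max(self.y + round(dy * self.speed), 0), self.height - 1)
        return self.x, self.y


pointer = Pointer()
print(pointer.move(10, 5))     # (10, 5)
pointer.speed = 2.0            # the same hand movement now travels twice as far
print(pointer.move(10, -100))  # (30, 0)
```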
CD-ROMs and Video Disks
This is a popular medium for storage and transport of data. Data is written on the disk by burning tiny
holes, which are interpreted as binary 0 and 1 by software. USB flash drives are also
gaining popularity.
Features
Read-only devices; data can be written only once
CD-ROMs can typically store about 600 MB of information
With time, the speed has improved (4X in 1995 to more than 50X now)
DVD-ROMs allow a few gigabytes of data on a single disk
Ideal media for distributing multimedia productions (low cost)
Audio
AAC
Advanced Audio Coding (similar to MP3) is a digital audio format designed for high
compression as well as high audio quality.
Like MP3s, Advanced Audio Coding (AAC) files are lossy audio files. However, AAC
files generally sound better than MP3s encoded at the same bit rate, so they are
similar in size to MP3s despite being a tad higher in quality.
They can also be created with a variable bit rate or constant bit rate. No royalties are
charged for distributing content in AAC format, although the format itself is
patented rather than open source.
.AAC files are most commonly associated with iTunes, though they can be used on other
player devices and gaming consoles.
AVI
Audio Video Interleaved is a Windows movie file with high video quality, but a large file
size. Approximately 25 GB is required for 60 minutes of video.
MP3
MPEG 1 Audio Layer 3 is a digital audio format that is designed for high compression of
audio files while maintaining high audio quality.
.MP3 files are the most common audio file around. MP3s feature lossy compression, which
means their quality will degrade over subsequent edits. MP3s are still relatively large in size
when compared to other audio file formats on this list.
MP3 files can be encoded at a constant bit rate or variable bit rate. A constant bit rate uses
the same number of bits per second throughout the audio file, but results in a larger file size. A variable bit rate
spends fewer bits on silent or near-silent moments of a file, resulting in a smaller
overall file size. Most smart phones and music players support the MP3 format.
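The effect of the two encoding modes on file size can be estimated with simple arithmetic: size in bytes is bit rate times duration divided by 8. The durations and the 32 kbps rate for quiet passages below are hypothetical, and container overhead is ignored.

```python
def cbr_size_bytes(bit_rate_kbps: int, seconds: float) -> int:
    """Approximate size of a constant-bit-rate file (ignores headers and tags)."""
    return int(bit_rate_kbps * 1000 * seconds / 8)


def vbr_size_bytes(segments) -> int:
    """Approximate size of a variable-bit-rate file.

    segments is a list of (bit_rate_kbps, seconds) pairs.
    """
    return sum(cbr_size_bytes(rate, seconds) for rate, seconds in segments)


# A hypothetical 4-minute track: 128 kbps throughout, versus 128 kbps for
# 200 s of music and 32 kbps for 40 s of near-silence.
print(cbr_size_bytes(128, 240))                # 3840000
print(vbr_size_bytes([(128, 200), (32, 40)]))  # 3360000
```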
MP3 VBR
MP3 encoded with a variable bit rate, which provides better quality and smaller files.
Audible 2, 3 and 4
Audio file format (.aa file extension) used for audio books or other voice recordings. Entire
books can be stored in a single file.
Apple Lossless
Uses the .m4a file extension, the same as AAC. Creates a larger file than AAC, but retains
more information and quality.
AIFF
Audio Interchange File Format, similar to WAV. AIFF provides original sound quality at the cost of a
large file size.
WAV
Wave provides the same sound quality and the same large file size as the original CD.
Video
H.264
This is a digital video codec noted for high data compression while maintaining high quality.
MPEG-2
A combination of audio and video compression for storage of movies.
Mov
QuickTime Movie Format
m4v
An MPEG-4 Video file.
MP4
MPEG-4 is a versatile file format that can include audio, video, images and animations.
DAT
Digital Data Storage. Data file format that can be used for text, graphics or binary data.
VOB
Video Object is an MPEG-2 DVD video movie file.
Distributed DBMS Architectures
DDBMS architectures are generally developed depending on three
parameters −
Distribution − It states the physical distribution of data across the different
sites.
Autonomy − It indicates the distribution of control of the database system
and the degree to which each constituent DBMS can operate independently.
Heterogeneity − It refers to the uniformity or dissimilarity of the data
models, system components and databases.
Architectural Models
Some of the common architectural models are −
Client - Server Architecture for DDBMS
Peer - to - Peer Architecture for DDBMS
Multi - DBMS Architecture
Client - Server Architecture for DDBMS
This is a two-level architecture where the functionality is divided into
servers and clients. The server functions primarily encompass data
management, query processing, optimization and transaction
management. Client functions mainly include the user interface. However,
clients also perform some functions such as consistency checking and transaction
management.
The two different client - server architectures are −
Single Server Multiple Client
Multiple Server Multiple Client (shown in the following diagram)
Peer- to-Peer Architecture for DDBMS
In these systems, each peer acts both as a client and a server for
imparting database services. The peers share their resources with other
peers and co-ordinate their activities.
This architecture generally has four levels of schemas −
Global Conceptual Schema − Depicts the global logical view of data.
Local Conceptual Schema − Depicts logical data organization at each site.
Local Internal Schema − Depicts physical data organization at each site.
External Schema − Depicts user view of data.
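One way to picture how the four schema levels relate is as a set of lookup tables, with a user view resolved step by step down to the physical storage at each site. Every relation, site, column, and storage name below is hypothetical, and a real DDBMS keeps this information in its catalog rather than in dictionaries.

```python
# Global conceptual schema: the logical view of all data in the system.
GLOBAL_CONCEPTUAL = {"customer": ["id", "name", "city"]}

# Local conceptual schemas: logical data organization at each site.
LOCAL_CONCEPTUAL = {
    "site_a": {"customer": ["id", "name", "city"]},
    "site_b": {"customer": ["id", "name", "city"]},
}

# Local internal schemas: physical data organization at each site.
LOCAL_INTERNAL = {
    "site_a": {"customer": "heap file on disk 1"},
    "site_b": {"customer": "B-tree on disk 2"},
}

# External schema: a user view defined over the global conceptual schema.
EXTERNAL = {"marketing_view": ("customer", ["name", "city"])}


def resolve_view(view_name: str) -> dict:
    """Trace an external view down to the sites and storage structures behind it."""
    relation, columns = EXTERNAL[view_name]
    missing = set(columns) - set(GLOBAL_CONCEPTUAL[relation])
    if missing:
        raise KeyError(f"columns not in global schema: {sorted(missing)}")
    return {
        site: LOCAL_INTERNAL[site][relation]
        for site, relations in LOCAL_CONCEPTUAL.items()
        if relation in relations
    }


print(resolve_view("marketing_view"))
```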
Multi - DBMS Architectures
This is an integrated database system formed by a collection of two or
more autonomous database systems.
Multi-DBMS can be expressed through six levels of schemas −
Multi-database View Level − Depicts multiple user views comprising
subsets of the integrated distributed database.
Multi-database Conceptual Level − Depicts integrated multi-database that
comprises global logical multi-database structure definitions.
Multi-database Internal Level − Depicts the data distribution across
different sites and multi-database to local data mapping.
Local database View Level − Depicts public view of local data.
Local database Conceptual Level − Depicts local data organization at each
site.
Local database Internal Level − Depicts physical data organization at
each site.
There are two design alternatives for multi-DBMS −
Model with multi-database conceptual level.
Model without multi-database conceptual level.