ASCII vs Unicode: Data Encoding Explained

The document explains data representation in computers, focusing on ASCII and Unicode encoding systems. ASCII, developed in the 1960s, uses 7 to 8 bits to represent English characters and is limited to a small set of symbols, while Unicode is a universal standard that supports characters from all languages and symbols using various encoding forms. Unicode provides a unique code point for each character, making it suitable for global communication.

Data Representation: ASCII and Unicode

1. What is Data Representation?

• Computers work only with binary data (0s and 1s).
• To represent text (letters, digits, punctuation, emojis), we need a standard encoding system that maps each character to a binary code.
• Two of the most important encoding systems are:
1. ASCII (American Standard Code for Information Interchange)
2. Unicode
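
The mapping from characters to binary codes can be sketched in a few lines of Python. This is only an illustration: the helper name to_bits is hypothetical, and UTF-8 is an assumed default encoding rather than something these notes prescribe.

```python
def to_bits(text: str, encoding: str = "utf-8") -> list[str]:
    """Encode `text` and return each resulting byte as an 8-digit binary string.

    `to_bits` is a hypothetical helper; the default encoding is an assumption.
    """
    return [format(byte, "08b") for byte in text.encode(encoding)]


# 'H' is code 72 and 'i' is code 105, so two bytes come back.
print(to_bits("Hi"))  # ['01001000', '01101001']
```

Passing a different encoding name (for example "ascii") changes which bytes are produced for characters outside the English range.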

2. ASCII (American Standard Code for Information Interchange)
Definition:

ASCII is a character encoding standard developed in the 1960s that represents English characters using 7 bits. Later 8-bit extensions added a further 128 codes.

How it Works:

• Each character (letter, digit, symbol, control code) is assigned a unique numeric code.
• For example:
o 'A' = 65 → 1000001 (binary)
o 'a' = 97 → 1100001 (binary)
o '0' = 48 → 0110000 (binary)
o Space = 32 → 0100000 (binary)
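
The codes listed above can be checked with Python's built-in ord and chr functions. The helper ascii_row is a hypothetical name introduced for this sketch.

```python
def ascii_row(ch: str) -> tuple[int, str]:
    """Return (decimal code, 7-bit binary string) for one ASCII character.

    `ascii_row` is a hypothetical helper used only for illustration.
    """
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    return code, format(code, "07b")


for ch in ["A", "a", "0", " "]:
    print(repr(ch), ascii_row(ch))

# chr() goes the other way, from code to character.
print(chr(65))  # A
```

Uppercase and lowercase letters differ by exactly 32 (for example, 97 - 65), which is why their binary codes differ in a single bit.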

Types of ASCII:

1. 7-bit ASCII (Standard ASCII):
o Can represent 128 characters (0–127).
o Includes: English letters (uppercase and lowercase), digits, punctuation, and control codes.
2. 8-bit ASCII (Extended ASCII):
o Can represent 256 characters (0–255).
o Adds extra symbols such as graphical characters and accented letters (ç, ñ, é). There is no single extended ASCII standard: the meaning of codes 128–255 depends on the code page in use (for example ISO 8859-1 or Windows-1252).
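
One consequence of the 8-bit extensions is that the same byte can stand for different characters under different code pages. The sketch below assumes Python's standard "latin-1" (ISO 8859-1) and "cp437" (original IBM PC) codecs as two example code pages; the notes themselves do not name specific ones.

```python
# Byte value 233 (0xE9) lies outside 7-bit ASCII, so its meaning
# depends on which 8-bit code page is assumed.
raw = bytes([0xE9])

as_latin1 = raw.decode("latin-1")  # ISO 8859-1 maps 0xE9 to 'é'
as_cp437 = raw.decode("cp437")     # the IBM PC code page maps it to a different character

print(as_latin1, as_cp437)

# Codes 0-127 are shared: both code pages agree with standard ASCII there.
print(b"A".decode("latin-1") == b"A".decode("cp437"))  # True
```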

Advantages of ASCII:

• Simple and widely used.
• Efficient for English-based systems: each character fits in a single byte.

Limitations of ASCII:

• Can only represent English and a few special characters.
• Not suitable for global languages (e.g., Hindi, Chinese, Arabic).
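
This limitation is easy to observe in practice. In the sketch below, can_encode_ascii is a hypothetical helper that reports whether a string fits in 7-bit ASCII; the sample strings are illustrative.

```python
def can_encode_ascii(text: str) -> bool:
    """Return True if every character of `text` has a 7-bit ASCII code.

    `can_encode_ascii` is a hypothetical helper used only for illustration.
    """
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # Raised when a character has no ASCII code.
        return False


print(can_encode_ascii("Hello"))   # True
print(can_encode_ascii("नमस्ते"))    # False (Hindi)
print(can_encode_ascii("中"))       # False (Chinese)
```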

3. Unicode
Definition:

Unicode is a universal character encoding standard designed to represent text from all
writing systems in the world (languages, symbols, emojis).

How it Works:

• Uses different encoding forms (UTF-8, UTF-16, UTF-32) to store characters as bytes.
• Provides a unique numeric code (called a code point) for every character.
• Example:
o 'A' = U+0041
o 'क' (Hindi letter) = U+0915
o '中' (Chinese character) = U+4E2D
o '😀' (emoji) = U+1F600
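
To see how a code point relates to the encoding forms, the following sketch prints the code point of a character and the number of bytes it occupies in UTF-8, UTF-16, and UTF-32. The helper describe is a hypothetical name; the "-le" codec variants are used so that no byte order mark is counted.

```python
def describe(ch: str) -> dict:
    """Return the code point and encoded sizes of a single character.

    `describe` is a hypothetical helper used only for illustration.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_bytes": len(ch.encode("utf-16-le")),  # little-endian, no BOM
        "utf32_bytes": len(ch.encode("utf-32-le")),  # little-endian, no BOM
    }


for ch in ["A", "क", "中", "😀"]:
    print(ch, describe(ch))
```

UTF-8 uses 1 byte for 'A', 3 bytes for 'क' and '中', and 4 bytes for '😀'. UTF-16 uses 2 bytes for the first three and 4 bytes for the emoji, while UTF-32 always uses 4 bytes.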
