Data Representation and File Organization

The document covers data representation in programming, focusing on user-defined data types, file organization, and floating-point number representation. It explains the necessity of user-defined types, composite and non-composite data types, various file organization methods, and the implications of floating-point approximations and rounding errors. Additionally, it discusses normalization, precision versus range, and potential issues like overflow and underflow in floating-point arithmetic.

Uploaded by

nmirza

01 Data Representation

Candidates should be able to:

Show understanding of why user-defined types are necessary

Define and use non-composite types

Define and use composite data types

Choose and design an appropriate user-defined data type for a given problem

Show understanding of the methods of file organization and select an appropriate method of file organization and file access for a given problem

Show understanding of methods of file access

Show understanding of hashing algorithms

Describe the format of binary floating-point real numbers

Convert binary floating-point real numbers into denary and vice versa

Normalize floating-point numbers

Show understanding of the consequences of a binary representation only being an approximation to the real number it represents (in certain cases)

Show understanding that binary representations can give rise to rounding errors

Definitions

Built-in data type: The programming language defines the range of possible values that can be assigned to and the operations that can be applied to a variable

User-defined data type: A data type for which the programmer has included the definition in the program

Composite data type: A data type that is derived from other data types

Non-composite data type: A data type defined without referencing another data type

Enumerated data type: A data type which provides an ordered list of values that a variable of this type can take on

Pointer data type: Used to reference a memory location

Sets: A data type that allows storing a finite number of different values that have no order

Class: Includes variables of given data types and methods

Serial file organization: Records are stored in the order in which they were added

Sequential file organization: Records are physically stored in the order of their key field value

Random file organization: Stores records of data in a file in any available position

Sequential access: Each record in the file is read, one by one, until the desired record is found

Random (direct) access: Jumps to a specific record in the file without accessing other records

Hashing algorithm: A mathematical formula used to perform a calculation on the key field of the record

Normalized two’s complement binary number: First and second bits of mantissa must be different for a floating point number to be normalized

Overflow: Number following a calculation is too big to be represented in the given format

Underflow: Number is too small to be represented in the given format

1.1 User-defined data types

1.1.1 Data Types
Built-in data types

The programming language defines the range of possible values that can be
assigned to and the operations that can be applied to a variable

User-defined data types

A data type for which the programmer has included the definition in the
program

It is used to create a new data type

It allows the programmer to extend the flexibility of the programming language

Non-Composite data types

A data type defined without referencing another data type

Example: Enumerated data type, Pointer

Enumerated data type

A data type which provides an ordered list of values that a variable of this type can take on

Example:
TYPE SchoolDay = (Monday, Tuesday, Wednesday, Thursday, Friday)

Note: Since this is a different data type than STRING, quotation marks are not
used to represent them
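As a rough Python counterpart of the SchoolDay definition above, the sketch below uses the standard `enum` module. The choice of `IntEnum` and the numeric values 1 to 5 are assumptions made here so that the ordering of the days can be compared; they are not part of the pseudocode.

```python
from enum import IntEnum

# Assumed Python counterpart of TYPE SchoolDay; IntEnum keeps the declared order.
class SchoolDay(IntEnum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5

today = SchoolDay.WEDNESDAY

# Members are values of the new type, not strings, and can be compared by order.
is_before_friday = today < SchoolDay.FRIDAY

# Moving to the next value in the ordered list.
next_day = SchoolDay(today + 1)
```

Here `next_day` is `SchoolDay.THURSDAY`, and `today` is not equal to the string "WEDNESDAY", which mirrors the note about quotation marks.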

Pointer data type

Used to reference a memory location

Example:

DECLARE IntPointer : ^INTEGER

DECLARE MyVar : INTEGER

DECLARE MyVar2 : INTEGER

MyVar ← 57
IntPointer ← @MyVar

MyVar2 ← IntPointer^

IntPointer^ ← 100

@ symbol before a variable identifier gives the address of the variable

^ is put before the data type to define a pointer data type

^ is placed after a pointer variable to dereference it, that is, to access the value stored at the address the pointer holds
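Python has no pointer type, so the following is only a toy model of the pseudocode example: memory is represented as a list and an address is an index into it. The addresses 3 and 4 are arbitrary assumptions.

```python
# Toy model: "memory" is a list and an "address" is an index into it.
memory = [0] * 8

MY_VAR = 3    # assumed address of MyVar
MY_VAR2 = 4   # assumed address of MyVar2

memory[MY_VAR] = 57                     # MyVar ← 57
int_pointer = MY_VAR                    # IntPointer ← @MyVar (store the address)
memory[MY_VAR2] = memory[int_pointer]   # MyVar2 ← IntPointer^ (dereference to read)
memory[int_pointer] = 100               # IntPointer^ ← 100 (dereference to write)

my_var, my_var2 = memory[MY_VAR], memory[MY_VAR2]
```

After these steps `my_var` is 100, because it was changed through the pointer, while `my_var2` still holds the copy of 57 taken before the change.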

Composite data types

A data type that is derived from other data types

They are used to extend the functionality of the programming language

Example: Records, Sets, Classes

Sets

A data type that allows storing a finite number of different values that have no
order

Example:
TYPE Sletter = SET OF CHAR
DEFINE vowel = ('a', 'e', 'i', 'o', 'u') : Sletter
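A minimal Python sketch of the same idea, using the built-in `frozenset` as an assumed stand-in for SET OF CHAR. It shows the two properties in the definition: values are distinct and have no order.

```python
# Assumed stand-in for: DEFINE vowel = ('a', 'e', 'i', 'o', 'u') : Sletter
vowels = frozenset({'a', 'e', 'i', 'o', 'u'})

# Membership test
is_vowel = 'e' in vowels

# Duplicates are discarded, so each value is stored only once.
duplicates_ignored = frozenset("aeiouaeiou") == vowels

# Order does not matter when comparing sets.
order_ignored = frozenset(['u', 'o', 'i', 'e', 'a']) == vowels
```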

Classes

Includes variables of given data types and methods

Used in Object Oriented Programming, discussed in later chapters

Skill Check 1

1. Describe the purpose of a user-defined data type. [2]

2. Define, using pseudocode, the following enumerated data types:

a. SchoolDay to hold data about the days students are usually in school.
[1]

b. WeekEnd to hold data about the days that are not school days. [1]

3. Define, using pseudocode, the composite data type ClubMeet. This will
hold data about club members that includes:

First name and last name

The two days they attend:

One on a school day

One not on a school day

Use the enumerated types you created in question 2. [4]

Solution

1. To create a new data type to extend the flexibility of the programming language

2. TYPE SchoolDay = (Monday, Tuesday, Wednesday, Thursday, Friday)

TYPE WeekEnd = (Saturday, Sunday)

3. TYPE ClubMeet
DECLARE FirstName : STRING
DECLARE LastName : STRING
DECLARE Schoolday : SchoolDay
DECLARE Weekend : WeekEnd
ENDTYPE
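For comparison, the ClubMeet record could be sketched in Python with a dataclass whose fields use two enumerated types. The Python names and the sample member are hypothetical and only illustrate how a composite type is built from other types.

```python
from dataclasses import dataclass
from enum import Enum

class SchoolDay(Enum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5

class WeekEnd(Enum):
    SATURDAY = 1
    SUNDAY = 2

# Composite type: each field refers to another data type.
@dataclass
class ClubMeet:
    first_name: str
    last_name: str
    school_day: SchoolDay
    weekend: WeekEnd

# Hypothetical sample record
member = ClubMeet("Ada", "Lovelace", SchoolDay.TUESDAY, WeekEnd.SATURDAY)
```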

1.2 File organization and access
1.2.1 File Organization
Serial File Organization
Records stored in the order they were added in

New records appended to end of file

Accessed only sequentially

Reorganization not needed when a new record is added

Sequential File Organization

Records are physically stored in the order of their key field value

Accessed both sequentially and by direct access with an index file

High hit rate (for example, every customer needs a statement)

Justification:

Suitable for batch processing

All customers need a statement

There’s a unique key field

Organized by unique key field
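The difference between serial and sequential organization can be sketched with two in-memory lists standing in for files. The records and key values are made up for illustration, and `bisect.insort` is used here only as a convenient way to insert at the position given by the key.

```python
import bisect

serial_file = []       # records kept in the order they arrive
sequential_file = []   # records kept in key field order

# Hypothetical (key, name) records arriving in this order
for key, name in [(305, "Chen"), (101, "Ali"), (207, "Bose")]:
    serial_file.append((key, name))                # serial: append to the end
    bisect.insort(sequential_file, (key, name))    # sequential: insert in key order

serial_keys = [key for key, _ in serial_file]
sequential_keys = [key for key, _ in sequential_file]
```

The serial list keeps the arrival order 305, 101, 207, while the sequential list ends up as 101, 207, 305.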

Random File Organization

Stores records of data in a file in any available position

The location of any record in the file is found by using a hashing algorithm on
the key field of a record

Low waiting time

Low hit rate (only one record will match your account number/username)

Justification:

Real time access

No need to search records.

1.2.2 File Access

Sequential Access
Each record in the file is read, one by one, until the desired record is found

Efficient when every record in the file needs to be processed

The search stops when the record is found, when the whole file has been read without finding it, or, in a sequential file, when the key field of the current record is greater than the key field being searched for
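The stopping conditions for sequential access can be sketched as follows. The function name, the `ordered_by_key` flag and the sample records are assumptions for illustration; the list stands in for a file read record by record.

```python
def sequential_search(records, target_key, ordered_by_key=False):
    """Return (record, reads); record is None when the key is not found."""
    reads = 0
    for record in records:
        reads += 1
        key = record[0]
        if key == target_key:
            return record, reads          # desired record found
        if ordered_by_key and key > target_key:
            break                         # sequential file: the key cannot appear later
    return None, reads                    # not found

# Hypothetical records held in key order
sequential_file = [(101, "Ali"), (207, "Bose"), (305, "Chen"), (412, "Diaz")]

found = sequential_search(sequential_file, 305, ordered_by_key=True)
missing_ordered = sequential_search(sequential_file, 150, ordered_by_key=True)
missing_unordered = sequential_search(sequential_file, 150)
```

With the early stop, the search for the missing key 150 gives up after 2 reads; without it, all 4 records are read.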

Direct Access
Jumps to a specific record in the file without accessing other records

Required when only an individual record from a file needs to be processed

For a sequential file, an index of all the key fields is kept and used to look up the address of the file location where a given record is stored

For random access files, a hashing algorithm is used on the key field to
calculate the address of the file location where a given record is stored

1.2.3 Hashing Algorithms

A mathematical formula used to perform a calculation on the key field of the
record

The result of the calculation gives the address where the record should be
found

To write a record:

Key field is hashed to produce a location address

If location is free, add the data there

Otherwise, use an overflow method to find a free location

If no free location, data cannot be stored

To read a record:

Key field is hashed to produce a location address

The record at the address is checked to see if it matches the desired record

If it does not match, then the following records need to be read until a
match is found (if open hash is used), or the overflow area needs to be
searched for a match (if closed hash is used)
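The write and read steps above can be sketched with a small table. This is a minimal sketch under stated assumptions: the hashing algorithm is taken to be key MOD table size, the table has 7 locations, and the overflow method is "try the following locations" (linear probing). Deletions are not handled.

```python
TABLE_SIZE = 7  # assumed number of record locations

def hash_key(key):
    """Assumed hashing algorithm: key MOD table size."""
    return key % TABLE_SIZE

def write_record(table, key, data):
    """Store the record; return the location used, or None if the table is full."""
    start = hash_key(key)
    for step in range(TABLE_SIZE):
        slot = (start + step) % TABLE_SIZE
        if table[slot] is None:           # location is free
            table[slot] = (key, data)
            return slot
    return None                           # no free location

def read_record(table, key):
    """Return the data for key, or None if no matching record exists."""
    start = hash_key(key)
    for step in range(TABLE_SIZE):
        slot = (start + step) % TABLE_SIZE
        if table[slot] is None:           # an empty location ends the search
            return None
        if table[slot][0] == key:         # record matches the desired key
            return table[slot][1]
    return None

table = [None] * TABLE_SIZE
first_slot = write_record(table, 15, "Ali")    # 15 MOD 7 = 1
second_slot = write_record(table, 22, "Bose")  # 22 MOD 7 = 1 as well, so location 2 is used
```

Keys 15 and 22 hash to the same address, so the second record is placed in the next free location and is found again by reading onward from the hashed address.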

Skill Check 2

1. Compare sequential and serial methods of file organization. [4]

2. State the most suitable method of file access when a record is referenced by a unique address on a disk-type storage medium. [1]

3. State the most suitable method of file access when a bank stores its
data records in ascending order of account number. [1]

Solution

1. In both serial and sequential files, records are stored one after the
other and need to be accessed one after the other. Serial files are
stored in chronological order, whereas sequential files are stored
with ordered records and stored in the order of the key field. In serial
files, new records are added in the next available space. In
sequential files, new records are inserted in the correct position.

2. Direct Access

3. Sequential Access

1.3 Floating point numbers, representation and manipulation
Allows for representation of fractional values in binary number system

The number is in the form $M \times 2^{E}$, where M is the mantissa and E is the exponent

1.3.1 Converting binary floating-point numbers into denary
For the mantissa, the first bit has a place value of −1, the second bit 0.5, and each subsequent bit half of the previous one

Add up the place values where a 1 bit appears to get the value of M

For the exponent, treat it as an 8-bit two’s complement integer whose rightmost bit represents 1 (and whose leftmost bit represents −128)

Add up the place values where a 1 bit appears to get the value of E

Use $M \times 2^{E}$ to get the denary output

For example:

0.1011010 00000100

Mantissa: $0.1011010 = \frac{1}{2} + \frac{1}{8} + \frac{1}{16} + \frac{1}{64} = \frac{45}{64}$

Exponent: $00000100 = 4$

Hence, the number is $\frac{45}{64} \times 2^{4} = 11.25$
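The conversion can be checked with a short sketch. The function names are assumptions, bit strings are written without the binary point, and `Fraction` is used so that the result is exact.

```python
from fractions import Fraction

def twos_complement(bits):
    """Integer value of a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == '1':
        value -= 1 << len(bits)
    return value

def to_denary(mantissa, exponent):
    """Value of M × 2^E, with the mantissa's binary point after its first bit."""
    m = Fraction(twos_complement(mantissa), 1 << (len(mantissa) - 1))
    e = twos_complement(exponent)
    return m * Fraction(2) ** e

# The worked example: mantissa 0.1011010, exponent 00000100
value = to_denary("01011010", "00000100")
```

For the worked example `value` is the fraction 45/4, which is 11.25.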

1.3.2 Converting denary numbers into binary floating-point numbers
Convert the denary number to two’s complement binary with a binary point, padding missing bits on the right with 0s

Move the binary point so that it sits immediately after the first bit

Set the exponent to the number of places the binary point was moved to the left

For example:
4.5 → 0100.1000 → 0.1001000 = Mantissa
00000011 = Exponent (the binary point moved 3 places to the left)
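A minimal sketch of the conversion for positive values only, assuming an 8-bit mantissa and an 8-bit exponent. The function name is hypothetical; bits that do not fit in the mantissa are simply truncated, and the exponent range is not checked.

```python
def to_floating_point(value, mantissa_bits=8, exponent_bits=8):
    """Encode a positive denary value as (mantissa, exponent) bit strings."""
    if value <= 0:
        raise ValueError("this sketch handles positive values only")
    exponent = 0
    # Move the binary point until the value lies in [0.5, 1), so the mantissa starts 0.1
    while value >= 1:
        value /= 2
        exponent += 1
    while value < 0.5:
        value *= 2
        exponent -= 1
    # Keep the bits that fit; anything beyond the mantissa width is truncated.
    mantissa = int(value * (1 << (mantissa_bits - 1)))
    # Masking gives the two's complement bit pattern for negative exponents.
    exponent_field = exponent & ((1 << exponent_bits) - 1)
    return (format(mantissa, f"0{mantissa_bits}b"),
            format(exponent_field, f"0{exponent_bits}b"))

encoded = to_floating_point(4.5)
```

For 4.5 this returns the mantissa 01001000 (that is, 0.1001000) and the exponent 00000011.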

1.3.3 Potential rounding errors and approximations

Many real numbers (for example 0.1) cannot be represented exactly in binary, hence an approximate value is stored

This approximation becomes more accurate as more bits are allowed to represent the number

Repeated calculations that reuse previously rounded values accumulate rounding error

The accumulated inaccuracy may become large enough to affect the result
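The effect can be observed in Python, whose `float` is a 64-bit binary floating-point format rather than the 16-bit format used in these notes. The sketch adds 0.1 ten times and compares the total with 1.0.

```python
from fractions import Fraction

total = 0.0
for _ in range(10):
    total += 0.1          # each addition works with an already rounded value

# The stored value of 0.1 is only an approximation of one tenth.
stored_tenth_is_exact = Fraction(0.1) == Fraction(1, 10)

# The accumulated total is close to, but not exactly, 1.0.
total_is_exact = total == 1.0
error = abs(total - 1.0)
```

Both comparisons are false, although the error is tiny. Comparing floating-point results with a small tolerance instead of `==` is the usual way to cope with this.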

Normalization
First and second bits of mantissa must be different for a floating point number
to be normalized

Reasons:

To store maximum range of numbers with the smallest number of bits

To maximize the precision of the number for the given number of bits

To prevent multiple representations of the same number

To minimize the number of leading zeroes/ones

To normalize, simply make the mantissa start with 1.0 (for a negative number) or 0.1 (for a positive number) by shifting bits left

For example:

0.0011100 00000101
Shift bits left to get 0.1110000

Each left shift doubles the mantissa, so reduce the exponent by the number of left shifts, which is 2 in this case

Reducing exponent by 2 gives 00000011

Hence, the normalized form is: 0.1110000 00000011
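The shifting procedure can be sketched as below. The function name is an assumption, bit strings are written without the binary point, and the exponent range is not checked after it is reduced.

```python
def normalize(mantissa, exponent):
    """Shift the mantissa left until its first two bits differ."""
    if set(mantissa) == {'0'}:
        raise ValueError("zero cannot be normalized")
    exp_bits = len(exponent)
    e = int(exponent, 2)
    if exponent[0] == '1':                # two's complement exponent
        e -= 1 << exp_bits
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + '0'     # one left shift doubles the mantissa
        e -= 1                            # so the exponent drops by one
    return mantissa, format(e & ((1 << exp_bits) - 1), f"0{exp_bits}b")

# The worked example: 0.0011100 00000101
normalized = normalize("00011100", "00000101")
```

For the worked example this gives the mantissa 01110000 and the exponent 00000011. A negative mantissa such as 1.1100000 is shifted in the same way until it starts with 1.0.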

1.3.4 Precision versus range

There is a trade-off between range and precision

More bits in mantissa means greater precision and vice versa

More bits in exponent means a greater range and vice versa

For a fixed total number of bits, giving more bits to one part leaves fewer for the other, hence the trade-off

For a binary number with an 8-bit mantissa and an 8-bit exponent (using two’s
complement):

The maximum positive number which can be stored is:

$01111111\ 01111111 = \frac{127}{128} \times 2^{127}$

The smallest positive number which can be stored is:

$01000000\ 10000000 = \frac{1}{2} \times 2^{-128}$

The smallest magnitude negative number which can be stored is:

$10111111\ 10000000 = -\frac{65}{128} \times 2^{-128}$

The largest magnitude negative number which can be stored is:

$10000000\ 01111111 = -1 \times 2^{127}$
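To make the trade-off concrete, the sketch below compares two ways of splitting 16 bits: 8 mantissa bits with 8 exponent bits, and 12 mantissa bits with 4 exponent bits. The function names are assumptions; both parts are taken to be two's complement.

```python
from fractions import Fraction

def largest_positive(mantissa_bits, exponent_bits):
    """Largest positive value: mantissa 0.111...1 with the largest exponent."""
    scale = 1 << (mantissa_bits - 1)
    max_mantissa = Fraction(scale - 1, scale)
    max_exponent = (1 << (exponent_bits - 1)) - 1
    return max_mantissa * 2 ** max_exponent

def mantissa_step(mantissa_bits):
    """Gap between adjacent mantissa values: the weight of the last mantissa bit."""
    return Fraction(1, 1 << (mantissa_bits - 1))

range_8_8 = largest_positive(8, 8)     # 127/128 × 2^127
range_12_4 = largest_positive(12, 4)   # 2047/2048 × 2^7
step_8 = mantissa_step(8)              # 1/128
step_12 = mantissa_step(12)            # 1/2048
```

The 8/8 split reaches far larger values, while the 12/4 split has a much finer mantissa step, so it is more precise over a smaller range.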

1.3.5 Floating-point problems

Overflow: Number following a calculation is too big to be represented in the given format

Underflow: Number is too small to be represented in the given format

The number zero cannot be stored in normalized form
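A simplified sketch of how overflow and underflow arise, assuming an 8-bit two's complement exponent. It looks only at the exponent needed for a result and ignores any renormalization of the mantissa; the function name is hypothetical.

```python
def classify_exponent(exponent, exponent_bits=8):
    """Report whether a required exponent fits in the exponent field."""
    lowest = -(1 << (exponent_bits - 1))        # -128 for 8 bits
    highest = (1 << (exponent_bits - 1)) - 1    # 127 for 8 bits
    if exponent > highest:
        return "overflow"     # result too big for the format
    if exponent < lowest:
        return "underflow"    # result too small for the format
    return "ok"

# Multiplying two numbers adds their exponents.
big_product = classify_exponent(100 + 100)
tiny_product = classify_exponent(-100 + -100)
ordinary = classify_exponent(4)
```

An exponent of 200 cannot be held in 8 bits, so the first product overflows; an exponent of −200 is below −128, so the second underflows.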

Skill Check 3

1. Numbers are stored in a computer using floating-point representation with:

12 bits for the mantissa

4 bits for the exponent

two’s complement form for both the mantissa and exponent

a) Write the normalized floating-point representation of the following unsigned binary number using this system. [2]

1011100.011001

b) State the consequence of storing the binary number in part (a) as a floating-point number in this system. Justify your answer. [2]

2. Explain the reason why binary numbers are stored in normalized form.
[3]

Solution

1. a. Mantissa: 010111000110
Exponent: 0111
b. The accuracy of the number would be reduced because the least significant bits of the original number have been lost.

2. To store the maximum range of numbers in the minimum number of bits. Normalization minimizes the number of leading zeros or ones, which maximizes the number of significant bits and so the precision of the stored number. It also avoids the possibility of a number having multiple representations.

Points To Note

Examples of non-composite user-defined data types include enumerated and pointer data types

Record, set and class are examples of composite user-defined data types

File organization allows for serial, sequential or direct access

Floating-point representation for a real number allows a wider range of values to be represented

A normalized floating-point representation achieves the best precision for the value stored

Stored floating-point values rarely give an exact representation of the denary equivalent
