FILE
UNIT-IV
Concept of File
• Computers can store information on various
storage media such as, magnetic disks,
magnetic tapes, optical disks.
• The physical storage is converted into a logical
storage unit by operating system.
• The logical storage unit is called FILE.
Concept of File
• A file is a collection of similar records.
• A record is a collection of related fields that
can be treated as a unit by some application
program.
• A field is some basic element of data.
• Any individual field contains a single value.
Example
File attributes
• Name : A file is named for the convenience of the
user and is referred by its name. A name is
usually a string of characters.
• Identifier : This unique tag, usually a
number ,identifies the file within the file system
• Type : Files are of so many types. The type
depends on the extension of the file.
• Location : This information is a pointer to a
device and to the location of the file on that
device.
File attributes
• Size : The current size of the file (in bytes,
words, blocks).
• Protection : Access control information
determines who can do reading, writing,
executing and so on.
• Time, Date, User identification : This
information may be kept for creation, last
modification,last use.
File types
File operation
• Creating a file
• Open
• Close
• Writing a file
• Reading a file
• Repositioning within a file
• Deleting a file
• Truncating a file
Access methods
Three types
• Sequential file access
• Direct access
• Indexed Sequential File access
Sequential file access
• Information in the file is processed in order i.e.
one record after the other. Magnetic tapes are
supporting this type of file accessing
Direct access
• Direct access is also called relative access.
Here records can read/write randomly without
any order.
• The direct access method is based on a disk
model of a file, because disks allow random
access to any file block.
Simulation of Sequential Access on a Direct-
access File
Indexed Sequential File access
• The main disadvantage in the sequential file is,
it takes more time to access a Record
• Records are organized in sequence based on a
key field.
• Each record in the index file consisting of 2
field, A key field and a pointer field.
Directory(Introduction)
• Sometimes the file system consisting of
millions of files, at that situation it is very hard
to manage the files.
• To manage these files grouped these files and
load one group into one partition.
• Each partition is called a directory .
• A directory structure provides a mechanism
for organizing many files in the file system
Directory structure
OPERATION ON THE DIRECTORIES
• 1. Search for a file : Search a directory structure for
required file.
• 2. create a file : New files need to be created, added to
the directory.
• 3. Delete a file : When a file is no longer needed, we
want to remove it from the directory.
• 4. List a directory : We can know the list of files in the
directory.
• 5. Rename a file : When ever we need to change the
name of the file, we can change the name.
• 6. Traverse the file system : We need to access every
directory and every file with in a directory structure we
Directory Structure
• A collection of nodes containing information about all files
Directory
Files
F1 F2 F4
F3
Fn
Both the directory structure and the files reside on disk
Directory structure
• Single level directory
• Two level directory
• Tree structured directory
• Acyclic graph directory
• General graph directory
Single level directory
• The directory system having only one directory,it
consisting of all files some times it is said to be root
directory.
• Simple, easy to quickly locate files
• Inconvenient naming (uniqueness), no grouping
Two level directory
• The problem in single level directory is
different user may be accidentally use the
same name for their files
• Names chosen by one user don't interfere
with names chosen by a different user.
• Root directory is the first level directory and
the next will be user level directory.
• Introduces the notion of a path name
• Efficient searching
Tree structured directory
• Directories can now contain files and
subdirectories
• Efficient searching, allows grouping
• To access a file, the user should either:
– Go to the directory where file resides, or
– Specify the path where the file is
• Path names are either absolute or relative
– Absolute: path of file from the root directory
– Relative: path from the current working directory
• Most OSes have two special entries in each directory:
– “.” for current directory and “..” for parent
Acyclic graph directory
• Allow sharing of subdirectories and files
• Two different names (aliasing)
• If dict deletes list dangling pointer
Solutions:
– Backpointers, so we can delete all pointers
Variable size records a problem
– Backpointers using a daisy chain organization
– Reference count for each file
• New directory entry type
– Link – another name (pointer) to an existing file
– Resolve the link – follow pointer to locate the file
General graph directory
• When we add links to an existing tree
structured directory, the tree structure is
destroyed, resulting is a simple graph
structure
• How do we guarantee no cycles?
– Allow only links to files not subdirectories
– Garbage collection
– Every time a new link is added use a cycle
detection algorithm to determine whether it is OK
UNIT-IV
FILE SYSTEM
File system structure
File-System Implementation
• Boot control block contains info needed by system to
boot OS from that volume
UFS UNIX-boot control block, NTFS windows-partition
• Partition control block (superblock, master file table)
contains volume details, such as no of blocks, size,
free block, FCB pointers
• Directory structure organizes the files
• File Control Block (FCB) contains many details about
the file
A Typical File Control Block
In-Memory File System Structures
UNIT-IV
FILE ALLOCATION
METHODS
• An allocation method refers to how disk blocks
are allocated for files:
• Contiguous allocation
• Linked allocation
• Indexed allocation
• Each file occupies a set of contiguous blocks
on the disk
• Simple – only starting location (block #) and
length (number of blocks) are required
• Random access
• Wasteful of space (dynamic storage-allocation
problem)
• Suffers from external fragmentation
• Compaction used but leads to offline(down
time)
• Sequential & direct access can be done
Contiguous Allocation of Disk Space
Linked Allocation
• Each file is a linked list of disk blocks: blocks may
be scattered anywhere on the disk.
• Simple – need only starting address
• Free-space management system – no waste of
space
• No random access
• Effectively used only for sequential access
• It occupies more space for pointers
• It overcomes external fragmentation
Linked Allocation
File-Allocation Table
• A table has one entry for each disk block and
is indexed by block number.
• Directory entry contains the block number of
first block. It continues until finds EOF
• Unused blocks are indicated by 0 on the table
Indexed Allocation
• Brings all pointers together into the index block
• It solves external fragmentation and size
declaration problem
• Need index table
• Random access
• Each file has it own index block, which is an array
of disk block address
• When file created, all pointers are set to nil.
• It supports direct access
• Pointer overhead of index block is greater than
the pointer overhead of liked allocation
Indexed Allocation
DIRECTORY IMPLEMENTATION
• Linear List
• Simplest method of implementing directory
• All the files in a directory are maintained as
singly lined list.
• Each file contains the pointers to the data
blocks which are assigned to it and the next
file in the directory
• Time consuming to execute
• Linear search
Hash table
• Linear list stores the directory entries but a
hash data structures is also used
• A key-value pair for each file in the directory
gets generated and stored in the hash table
• It decrease the search time and becomes
efficient
• Problem is collisions-situations where file
names hash to same location
• It is fixed size and dependence of hash
function
Free-Space Management
• File system maintains free-space list to track
available blocks/clusters Linked list (free list)
• o Cannot get contiguous space easily
• o No waste of space
• o No need to traverse the entire list
Bitmap
• A Bitmap or Bit Vector is series or collection of
bits where each bit corresponds to a disk
block
• 0 indicates that the block is allocated and 1
indicates a free block
• Simple to understand.
• Finding the first free block is efficient
Bitmap example
16 bit map
blocks-1,2,3,4,8,9,10,11,12,13,16
bits-0000111000000110
Linked
• The free disk blocks are linked together i.e.
a free block contains a pointer to the next free
block.
• The block number of the very first disk block is
stored at a separate location on disk and is
also cached in memory.
• This technique is not sufficient to traverse the
list because we have to read each disk block
that requires I/O time
Grouping
• An advantage of this approach is that the
addresses of a group of free disk blocks can be
found easily
Counting
• Because space is frequently contiguously used
and freed, with contiguous- allocation
• Keep address of first free block and count of
following free blocks