Database
Management
Systems
U N I T: 1
INT RODUCT ION TO DBMS
1. What is Computer?
❖Computer is an electronic machine or device, which can perform arithmetic and logical operations at
very fast speed. It can store the data, process the data and produces information.
❖The word ‘Computer’ is derived from the word called – Compute.
❖Compute means doing any type of Calculation, like Addition, Multiplication, Division, Subtraction etc.
❖Today, lot many activities we are carried out from the computer, in which there is no computation is
involved.
❖Then, How can we define our modern Computers?
❖Our Modern computers can be defined as – Data Processors.
1. What is Computer?
2. What is Data and Information?
❖Data means unstructured raw material and unstructured facts which can be recordable, which
will provide necessary inputs to the computer system.
❖Data can be available in different formats like Text, Number, Image, Audio or Video.
❖Processed (meaningful) data is called Information.
❖The term data and information can be different from the point of view of different users.
❖In short, our modern Computer can be defined as a data processors, which takes data in by
using various input devices, process it with the help of CPU and main memory and generate
information (output), which can be received by the users using different types of output devices.
3. Databases and DBMS
❖Collection of similar (related) data is called Database.
❖Examples: Air-Line Reservation system, Railway Reservation system, Inventory Management
System, Banking database system, E-Commerce database system, YouTube, Instagram etc.
❖DBMS (DataBase Management System) is software product which allow user to create, manage,
manipulate or delete the data.
❖Examples: MS-Access, Oracle, MS-SQL Server, MySQL etc.
3.1 Application Structure
3.2 Understanding DBMS:
4. File Organization:
❖There are three principal methods of organizing files, of which only two provide the direct access
necessary in on-line systems.
❖Data files are organized so as to facilitate access to records and to ensure their efficient storage. A tradeoff
between these two requirements generally exists: if rapid access is required, more storage is required to
make it possible.
❖Access to a record for reading it is the essential operation on data. There are two types of access:
1. Sequential Access: is performed when records are accessed in the order they are stored. Sequential
access is the main access mode only in batch systems, where files are used and updated at regular
intervals.
2. Direct Access: on-line processing requires direct access, whereby a record can be accessed without
accessing the records between it and the beginning of the file. The primary key serves to identify the
needed record.
4. File Organization:
There are three methods are of file organization:
1. Sequential File Organization
2. Index-Sequential File Organization
3. Direct File Organization
4.1 Sequential File Organization:
In sequential organization records are physically stored in a specified order according to a
key field in each record.
Advantages of sequential access:
1. It is fast and efficient when dealing with large volumes of data that need to be processed
periodically (batch system).
Disadvantages of sequential access:
1. Requires that all new transactions be sorted into the proper sequence for sequential
access processing.
2. Locating, storing, modifying, deleting, or adding records in the file requires rearranging
the file.
3. This method is too slow to handle applications requiring immediate updating or
responses.
4.2 Index-Sequential File
Organization:
In the indexed-sequential files method, records are physically stored in sequential
order on a magnetic disk or other direct access storage device based on the key
field of each record. Each file contains an index that references one or more key
fields of each data record to its storage location address.
4.3 Direct File Organization:
Direct file organization provides the fastest direct access to records. When using
direct access methods, records do not have to be arranged in any particular
sequence on storage media. Characteristics of the direct access method include:
1. Computers must keep track of the storage location of each record using a
variety of direct organization methods so that data can be retrieved when
needed.
2. New transactions' data do not have to be sorted.
3. Processing that requires immediate responses or updating is easily
performed.
5. Problems with the Files:
❖Data Redundancy
❖Data Inconsistency
❖Difficulty in data access
❖Data isolation
❖Security problem
❖Atomicity problem
❖Concurrent data anomalies
❖Integrity problem
6. Hierarchy of Data:
❖Data are the principal resources of an organization. Data stored in computer systems form a hierarchy
extending from a single bit to a database, the major record-keeping entity of a firm. Each higher stage of this
hierarchy is organized from the components below it.
1. Bit: Bit is the smallest unit of memory measure. Data of all formats, stored in computer memory in the
forms of strings of binary bits.
2. Byte (Character) : Eight bits make a byte which can represent a character or a special symbol in a
character code. That can be alphabets, numbers or special symbols.
3. Fields / Attribute: A Field is a combination of one or more character. For Example, Roll number of a
student, Mobile number or Email address of the customer, ISBN number of a book are example of
fields. To maintain integrity, you need to assign a data type when you are defining field. Datatypes
include Text, Numbers, Boolean (Yes/No), Date etc. For example, if you have chosen date datatype to
the field ‘DateOfBirth’, then DBMS will not allow any Invalid date in this field at the time of data entry.
6. Hierarchy of Data:
4. Record / Tuple: A record is a group of related fields. For example: data of (RollName, Name,
DateOfBirth, Address) of any one student is a record of that particular student.
5. Data Table / Data File / Entity / Relation: Collection of related records is called Data Table or
Data File. For example, if the class has 60 students, then Data table should have 60 students. Each
record represents a data of one student that might have (RollName, Name, DateOfBirth, Address)
fields.
6. Database: is an integrated collection of logically related records or files. A database consolidates
records previously stored in separate files into a common pool of data records that provides data
for many applications. The data is managed by systems software called database management
systems (DBMS). The data stored in a database is independent of the application programs using it
and of the types of secondary storage devices on which it is stored.
7. Characteristics of database:
1. Self-describing nature of database system. 8. Restriction of un unauthorized access.
2. Insulation between program and data. 9. Data Independence.
3. Support for multiple views of data. 10. Transaction processing.
4. Sharing of data and multiuser system.
5. Control of data redundancy.
6. Data Sharing
7. Enforcement of Integrity constraints.
8. Some Key Terms:
8.1 Data Types
❖A data type is a classification of data which tells the compiler or interpreter how the programmer
intends to use the data.
❖Most programming languages support various types of data, including integer, real, character or
string, and Boolean.
❖Oracle data-base supports:
1. Character datatypes Like: char, nchar, varchar2, nvarchar2, clob, nclob etc
2. Number datatype
3. Date datatype
4. Binary datatype: blob, bfile, raw, long raw
8. Some Key Terms:
8.1 Data Abstraction
❖Data abstraction is the reduction of a particular body of data to a simplified
representation of the whole.
❖Abstraction, in general, is the process of taking away or removing
characteristics from something in order to reduce it to a set of essential
characteristics.
❖Data abstraction is usually the first step in database design. A
complete database is too much complex system to be developed without first
creating a simplified framework.
9. Database Users:
❖The users of a database system can be classified in the following groups, depending upon their
degree of expertise or the mode of their interactions with the DBMS.
1. Naïve Users
2. Online Users
3. Applications Programmers
4. Database Administrator
9.1 Naive Users
❖User who need not be aware of the presence of the database system or any other system
supporting their usage are considered naïve users.
❖Example: A user of an automatic teller machine
❖The operations that can be performed by this class users are very limited and affect a precise
portion of the database.
❖Some other naïve users of the database work with the database through menu-oriented
application programs.
9.2 Online Users:
❖These are users who may communicate with the database directly via an online terminal or
indirectly via a user interface and application program.
❖These users are aware of the presence of the database system and may have acquired a certain
amount of expertise in the limited interaction.
❖They are permitted to the database through the intermediary of the application program.
❖These users may also use a data manipulation language to manipulate the database directly.
9.3 Application Programmers
❖Professional programmers who are responsible for developing application programs or user
interfaces utilized by the naïve and online users fall into this category.
❖The application program can be written in any general-purpose programming language like: C-
Language, C#.NET, C++, Java, PHP or Python.
9.4 Database Administrators
(DBA)
❖Centralized control of the database is exerted by a person or group of person under the
supervision of a high-level administration, is/are called Database Administrator(s) [DBA].
❖The are most familiar with the database and responsible for creating, modifying and maintaining
all three levels of database.
❖ The DBA is the custodian of the data and controls the database structure.
❖DBA grant permissions to different users.
❖DBA is responsible to protect data from unauthorized access, taking backup of data and
recovering data in the case of failure.
10. Three-Level Architecture of
DBMS
❖The generalized architecture of a database system called ANSI/SPARC1 model.
❖Most commercial systems and research database models fit this framework.
❖The DBMS architecture shown in the following figure is divided into three levels:
1. External Level
2. Conceptual Level
3. Internal Level
❖The view at each of these level is described by schema. It an outline or a plan that describe the
records and relationships existing in the view.
1. American Standard Code for Information Interchange / Standards Planning and Requirements Committee
Three-Level Architecture of DBMS
10.1 External or User View
❖The external or user view is at the highest level of database abstraction where only those portion
of the database of concern to a user or application program are included.
❖Any number of user views may exists for a given global or conceptual view.
❖Each external view is described by means of a scheme called an external schema.
❖The external schema consists of the definition of the logical records and the relationships in the
external view.
10.2 Conceptual or Global View
❖At this level of database abstraction all the database entities and the relationships among then
are included.
❖One conceptual view represents the entire database.
❖This conceptual view is defined by the conceptual schema.
10.3 Internal View
❖We find this view at the lowest level of abstraction, closest to the physical storage.
❖If indicates how the data will be stored and described the data structures and access methods to
be used by the database.
❖The internal view is expressed by internal schema, which contains the definition of the stored
record, the method of representing the data, and access aids used.
11 Data Independence
❖Three levels of abstraction, along with the mapping from internal to conceptual and from
conceptual to external, provides two distinct levels of independence: logical data independence
and physical data independence.
❖Logical data independence indicates that the conceptual schema can not be changed without
affecting the existing external schemas. The change would be absorbed by the mapping between
the external and conceptual levels.
❖Physical data independence is achieved by the presence of the internal level of the database and
the mapping or transformation from the conceptual level of the database to the internal level. If
there is a need to change the file organization or the type of physical device used as a result of
growth in the database or new technology, a change is required in the transformation functions
between conceptual level and internal level.
12. Data Dictionary
❖Information pertaining to the structure and usage of data contained in the database, the
metadata, is maintained in a Data Dictionary.
❖The term system catalog also describes this metadata.
❖The Data Dictionary, which is a database itself, documents the data.
❖Each database user can consult the data dictionary to learn what each piece of data and the
various synonyms of the data fields mean.
❖In an integrated system, the data dictionary stores information concerning the external,
conceptual and internal levels of database.
12. Data Dictionary
[1] tblEmployee:
tblEmployee stores personal information of each Employee entity.
Sr No Field Name DataType Constraints Description
1 EmpCode Number(3) Primary Key Unique Identification of each Employee
2 Name Varchar2(25) Not Null Name of the Employee
3 DOB Date/Time Date of Birth
4 Address Varchar(200)
5 City Varchar(30)
6 Phone Number(10) Mobile number of the Employee
7 Email Varchar(30) Email address of the employee
12. Data Dictionary
[2] tblSalary:
tblSalary stores month wise salary details provided to each Employee.
Sr No Field Name DataType Constraints Description
1 EmpCode Number(3) Primary Key Unique Identification of each
Foreign Key (tblEmployee) Employee
2 Month Varchar2(3) Primary Key Month of Salary
3 Year Number(4) Primary Key Year of Salary
4 Basic Number(5) Not Null Basic salary
5 HRA Number(4) Default value 0 House Rent Allowance
6 DA Number(4) Dearness allowance
7 IT Number(4) Default value 1000 Income tax deduction
13. Advantages of DBMS
❖Reduce data redundancy
❖Shared data
❖Integrity
❖Security
❖Conflict resolution
❖Data independence
❖Centralized control
13. Limitations of DBMS
❖Problem associated with Centralized control
❖Cost of implementation (Hardware and Software)
❖Complexity of backup and recovery
❖Need technical staff for handling of database.
14. Database Language
❖Most databases are supporting a language called SQL (Structured Query Language).
❖SQL is a 4th Generation Language, which is easy to learn.
❖Commands in SQL (Query) are similar to the English statements.
For Example:
1. Select * from tblEmployee;
2. Select * from tblEmployee where EmpCode=1;
3. Insert into tabEmployee (EmpCode, Name, City) values (5, ‘John’, ‘Ahmedabad’);
14. Database Language
1. DDL (Data Definition Language)
2. DML (Data Manipulation Language)
3. DCL (Data Control Language)
4. TCL (Transaction Control Language)
14.1 DDL (Data Definition
Language)
•DDL stands for Data Definition Language. Here are some tasks that come under DDL:
It is used to define database structure or •Create: It is used to create objects in the
pattern. database.
•It is used to create schema, tables, indexes, •Alter: It is used to alter the structure of the
constraints, etc. in the database. database.
•Using the DDL statements, you can create •Drop: It is used to delete objects from the
the skeleton of the database. database.
•Data definition language is used to store the •Truncate: It is used to remove all records from a
information of metadata like the number of table.
tables and schemas, their names, indexes, •Rename: It is used to rename an object.
columns in each table, constraints, etc. •Comment: It is used to comment on the data
dictionary.
14.2 DML (Data Manipulation
Language)
DML stands for Data Manipulation Language.
Here are some tasks that come under DML:
It is used for accessing and manipulating •Select: It is used to retrieve data from a database.
data in a database. It handles user requests. •Insert: It is used to insert data into a table.
•Update: It is used to update existing data within a
table.
•Delete: It is used to delete all records from a table.
•Merge: It performs UPSERT operation, i.e., insert
or update operations.
•Call: It is used to call a structured query language
or a Java subprogram.
•Explain Plan: It has the parameter of explaining
data.
•Lock Table: It controls concurrency.
14.3 DCL (Data Control Language)
Here are some tasks that come under DCL:
•DCL stands for Data Control Language. It
is used to retrieve the stored or saved data. •Grant: It is used to give user access privileges to
a database.
•The DCL execution is transactional.
•Revoke: It is used to take back permissions from
the user.
14.3 TCL (Transaction Control
Language)
Here are some tasks that come under TCL:
•TCL is used to run the changes made by the
DML statement. TCL can be grouped into a •Commit: It is used to save the transaction on the
logical transaction. database.
•Rollback: It is used to restore the database to
original since the last Commit.
15. ACID Properties
❖A – Atomicity (Transaction Management Component is part of Database, which ensures
Atomicity). Example of Mobile Recharge
❖C – Consistency (If the database is consistent in its initial state, then it must be consistent after
performing any transaction).
❖I – Isolation (Logical Isolation, Concurrency Control Component is a part of Database, which
ensures Isolation). Example: Order in Restaurant
❖D – Durability (Transaction recorded in the database, must be permeant. Recovery Management
Component is a part of Database, which ensures Durability. Once the ticket is booked then it
should not cancelled unless user is not doing it.