Chapter 2: Database System Architecture
2.1. Database Architecture
A major aim of a database system is to provide users with an abstract view of data, hiding certain
details of how data is stored and manipulated. Since a database is a shared resource, each user
may require a different view of the data held in the database. To this effect, a standard
terminology and general architecture of database system should be used.
The American National Standards Institute/Standards Planning and Requirements Committee
(ANSI-SPARC) introduced the three-level architecture of the database based on their degree of
abstraction.
The purpose and objective of the Three-Level database architecture
All users should be able to access same data. This is important since the database is having
a shared data feature where all the data is stored in one location and all users will have their
own customized way of interacting with the data.
A user's view is unaffected or immune to changes made in other views. Since the
requirement of one user is independent of the other, a change made in one user’s view
should not affect other users.
Users should not need to know physical database storage details. As there are naïve users of
the system, hardware level or physical details should be a black-box for such users.
DBA should be able to change database storage structures without affecting the users'
views. A change in file organization, access method should not affect the structure of the
data which in turn will have no effect on the users.
Internal structure of database should be unaffected by changes to physical aspects of
storage
DBA should be able to change conceptual structure of database without affecting all users.
In any database system, the DBA will have the privilege to change the structure of the
database, like adding tables, adding and deleting an attribute, changing the specification of
the objects in the database.
Three-level ANSI-SPARC Architecture of a Database
External Level:
Users' view of the database.
It describes part of database that is relevant to a particular user.
Different users have their own customized view of the database independent of other
users.
Conceptual Level:
Community view of the database.
Describes what data is stored in database and relationships among the data.
It is the logical structure of the entire database as seen by the database administrator.
Represents
All entities, attributes and their relationships
The constraints on the data
Semantic information about the data
Security and integrity information
Internal Level:
Physical representation of the database on the computer.
Describes how the data is stored in the database.
It is about the data structure and file organization used to store the data on storage
devices.
It is concerned with such things as:
Storage space allocation
Record description for storage
Record placement
Data compression, data encryption …
2.2. Over view of data models, Schemas, and Instances
Data Model is the modeling of the data description, data semantics, and consistency constraints
of the data. It provides the conceptual tools for describing the design of a database at each level
of data abstraction. Therefore, there are following four data models used for understanding the
structure of the database:
It is a set of concepts to describe the structure of a database, and certain constraints that the
database should obey. It is a description of the way that data is stored in a database. Data model
helps to understand the relationship between entities and to create the most effective structure to
hold data.
Data model is an integrated collection of concepts for describing data, relationships between
data, and constraints on the data in an organization.
Data Model is a collection of tools or concepts for describing
� Data
� Data relationships
� Data constraints
Data model comprises three parts:
1. A structural part: set of rules to construct the data base
2. A manipulation part: type of operations to be performed
3. Integrity rules: which insures that the data is accurate
The main purpose of Data Model is to represent the data in an understandable way.
Categories of data models include:
1. Object-based data model
Uses concepts such as entities, attributes and relationships
An entity is a distinct object (student, client, policy)
Attribute: a property that describes some aspect of the entity
Relationship: association between entities
Common examples of object-based data models are:
Entity-relationship(ER) data model
Object-oriented data model
2. Record-based data model
The database consist of a number of fixed format records
Each record type defines a fixed number of fields
Each field is typically of a fixed length
Common examples are:
Relational data model
Network data model
Hierarchical data model
Schemas and Instances
When a database is designed using a Relational data model, all the data is represented in a form
of a table. In such definitions and representation, there are two basic components of the database.
The two components are the definition of the Relation or the Table and the actual data stored in
each table. The data definition is what we call the Schema or the skeleton of the database and the
Relations with some information at some point in time is the Instance or the flesh of the
database.
Schemas
� Schema describes how data is to be structured, defined at setup/Design time (also called
"metadata")
� since it is used during the database development phase, there is rare tendency of changing
the schema unless there is a need for system maintenance which demands change to the
definition of a relation.
� Database Schema: specifies name of relation and the collection of the attributes
(specifically the Name of attributes).
refer to a description of database/intension of the database
specified during database design
should not be changed unless during maintenance
There are three different kinds of schema in the database defined according to the 3 level ANSI-
SPARC architecture:
Internal/physical schema: at the internal level to describe physical storage structures and access
paths. Typically uses a physical data model.
Conceptual/logical schema: at the conceptual level to describe the structure and constraints for
the whole database for a community of users. It uses a conceptual or an implementation data
model.
External schema: at the external level to describe the various user views. Usually uses the same
data model as the conceptual level.
Instances
� Instance: is the collection of data in the database at a particular point of time (snap-shot).
Also called State or Snap Shot or Extension of the database
Refers to the actual data in the database at a specific point in time
State of database is changed any time we add, delete or update an item.
� Since Instance is actual data of database at some point in time, it changes rapidly.
2.3. Architecture and Data Independence
The DBMS design depends upon its architecture. The basic client/server architecture is used
to deal with a large number of PCs, web servers, database servers and other components that
are connected with networks.
The client/server architecture consists of many PCs and a workstation which are connected
via the network.
DBMS architecture depends upon how users are connected to the database to get their
request done.
Types of DBMS Architecture
Database architecture can be seen as a single tier or multi-tier. But logically, database
architecture is of two types like: 2-tier architecture and 3-tier architecture.
1-Tier Architecture
In this architecture, the database is directly available to the user. It means the user can
directly sit on the DBMS and uses it.
Any changes done here will directly be done on the database itself. It doesn't provide a
handy tool for end users.
The 1-Tier architecture is used for development of the local application, where
programmers can directly communicate with the database for the quick response.
2-Tier Architecture
The 2-Tier architecture is same as basic client-server. In the two-tier architecture,
applications on the client end can directly communicate with the database at the server side.
For this interaction, API's like: ODBC, JDBC are used.
The user interfaces and application programs are run on the client-side.
The server side is responsible to provide the functionalities like: query processing and
transaction management.
To communicate with the DBMS, client-side application establishes a connection with the
server side.
Fig: 2-tier Architecture
3-Tier Architecture
The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
The application on the client-end interacts with an application server which further
communicates with the database system.
End user has no idea about the existence of the database beyond the application
server. The database also has no idea about any other user beyond the application.
The 3-Tier architecture is used in case of large web application.
Fig: 3-tier Architecture
Data Independence
The major objective of ANSI-SPARC architecture is to provide data independence, i.e. upper
levels are unaffected by changes to lower levels.
Data independence is the ability to modify a schema definition in one level without affecting
schema definition in the next higher level. It is the strength of higher levels to changes to lower
levels. There are two types of data independence
1. Logical Data Independence:
Refers to immunity of external schemas to changes in conceptual schema
Conceptual schema changes e.g. addition/removal of entities should not require changes
to external schema or rewrites of application programs.
The capacity to change the conceptual schema without having to change the external
schemas and their application programs
2. Physical Data Independence
Refers to immunity of conceptual schema to changes in the internal schema
The ability to modify the physical schema without changing the logical schema
Internal schema changes e.g. using different file organizations, storage structures/devices
should not require change to conceptual or external schemas
The capacity to change the internal schema without having to change the conceptual
schema
� In +
General, the interfaces between the various levels and components should be well defined so
that changes in some parts do not seriously influence others.
2.4. Classification of DBMS
A DBMS has appropriate languages and interfaces to express database queries and updates.
Database languages can be used to read, store and update the data in the database.
Types of Database Languages
1. Data Definition Language (DDL)
DDL stands for Data Definition Language. It is used to define database structure or pattern.
It is used to create schema, tables, indexes, constraints, etc. in the database.
Using the DDL statements, you can create the skeleton of the database.
Data definition language is used to store the information of metadata like the number of
tables and schemas, their names, indexes, columns in each table, constraints, etc.
Here are some tasks that come under DDL:
Create: It is used to create objects in the database.
Alter: It is used to alter the structure of the database.
Drop: It is used to delete objects from the database.
Truncate: It is used to remove all records from a table.
Rename: It is used to rename an object.
Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema that's why they come under Data
definition language.
2. Data Manipulation Language (DML)
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in
a database. It handles user requests.
Here are some tasks that come under DML:
Select: It is used to retrieve data from a database.
Insert: It is used to insert data into a table.
Update: It is used to update existing data within a table.
Delete: It is used to delete all records from a table.
Merge: It performs UPSERT operation, i.e., insert or update operations.
Call: It is used to call a structured query language or a Java subprogram.
Explain Plan: It has the parameter of explaining data.
Lock Table: It controls concurrency.
3. Data Control Language (DCL)
DCL stands for Data Control Language. It is used to retrieve the stored or saved data.
The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does not have the feature
of rolling back.)
Here are some tasks that come under DCL:
Grant: It is used to give user access privileges to a database.
Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of Revoke:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.
4. Transaction Control Language (TCL)
TCL is used to run the changes made by the DML statement. TCL can be grouped into a logical
transaction. Here are some tasks that come under TCL:
o Commit: It is used to save the transaction on the database.
o Rollback: It is used to restore the database to original since the last Commit.
Activity
1. Define the following terms: data model, database schema, database state, internal schema, conceptual
schema, external schema, data independence, DDL, DML, SDL, VDL, query language, host language,
data sublanguage, database utility, catalog, client/server architecture, three-tier architecture, and n-tier
architecture?
2. What is the difference between a database schema and a database state?
3. Describe the three-schema architecture. Why do we need mappings among schema levels? How do
different schema definition languages support this architecture?
4. What is the difference between logical data independence and physical data independence? Which one
is harder to achieve? Why?
5. What is the difference between procedural and nonprocedural DMLs?
6. With what other computer system software does a DBMS interact?