Chapter one
Introduction to database and Database Management System
What is Database?
Definitions are important, especially in technical subjects because definition describes very
comprehensively the purpose and the core idea behind the thing. Databases have been defined differently
in literature. We are discussing different definitions here, if we concentrate on these definitions, we find
that they support each other and as a result of the understanding of these definitions, we establish a better
understanding of use, working and to some extent the components of a database.
In general database can be defined as:
A shared collection of logically related data, designed to meet the information needs of multiple
users in an organization. The term database is often erroneously referred to as a synonym for a
“database management system (DBMS)”. They are not equivalent and it will be explained in the
next section.
A collection of data: part numbers, product codes, customer information, etc. It
Usually refers to data organized and stored on a computer that can be searched and retrieved by a
computer program. This computer program is called Database management system (DBMS)
A data structure that stores metadata, i.e. data about data. More generally we can say an
organized collection of information.
A collection of information organized and presented to serve a specific purpose. (A telephone
book is a common database.) A computerized database is an updated, organized file of machine
readable information that is rapidly searched and retrieved by computer.
An organized collection of information in computerized format.
A collection of related information about a subject organized in a useful manner
That provides a base or foundation for procedures such as retrieving information, drawing
conclusions, and making decisions.
A Computerized representation of any organizations flow of information and storage of data.
Definition and function of Database Management System (DBMS)
What is DBMS?
Database Management System (DBMS) is a Software package used for providing EFFICIENT,
CONVENIENT and SAFE MULTI-USER storage and access to MASSIVE amounts of
PERSISTENT (data outlives programs that operate on it) data. A DBMS also provides a
systematic method for creating, updating, storing, retrieving data in a database. DBMS also
provides the service of controlling data access, enforcing data integrity, managing concurrency
control, and recovery. Having this in mind, a full scale DBMS should at least have the following
services to provide to the user.
1
Function (Benefits) of DBMS
Data Processing: means storage, retrieval and update in the database.
A user accessible catalogue: Catalog is an object or a place in the DBMS which stores almost all
of the information of the database, including schema Information, user information right of the
users, and many more things about the database. Modern relational DBMS require that the
administrative users of the database
Transaction support service: DBMS is responsible for providing transaction
Support transaction is an act ion that is used to perform some manipulation on the data stored in
the database. DBMS is responsible for supporting al l the required operations on the database, and
also manages the transaction execution so that only the authorized and al lowed actions are
performed.
Concurrency Control Services: access and update on the database by different users
simultaneously should be implemented correctly.
Recovery Services Recovery services mean that in case a database gets an inconsistent State to
get corrupted due to any invalid action of someone, the DBMS should be able to recover itself to
a consistent state, ensuring that the data loss during the recovery process of the database remains
minimum.
Authorization Services (Security): The database is intended to be used by a number of users, who
will perform a number of actions on the database and data stored in the database, The DBMS is
used to al low or restrict different database users to interact with the database. It is the
responsibility of the database to check whether a user intending to get access to database is
authorized to do so or not. If the user is an authorized one than what actions can he/she per form
on the data?
Integrity Services: rules about data and the change that took place on the data, correctness and
consistency of stored data, and quality of data based on business constraints.
Services to promote data independency between the data and the application.
Utility services: sets of utility service facilities like Importing data, Statistical analysis support
etc.
User Interfaces: The data in a database may be accessed by numerous people all with different
levels of expertise. It is important that the system provides an adequate variety of user interfaces
so that it may be used as efficiently and effectively by all those who access it. The DBMS must
allow the same data to be viewed in different ways.
Flexibility: Because programs and data are independent, programs do not have to be modified
when types of unrelated data are added to or deleted from the database, or when physical storage
changes. Only the data are change from the storage area.
2
Fast response to information requests: Because data are integrated into a single database, complex
requests can be handled much more rapidly then if the data were located in separate, non-
integrated files. In many businesses, faster response means better customer service. Possible fast
data retrieve from the data store.
Less storage area: Theoretically, all occurrences of data items need be stored only once, thereby
eliminating the storage of redundant data. System developers and database designers often use
data normalization to minimize data redundancy.
Data duplication is reduced: As data is integrated, present on different locations so chances of
data duplication are much reduced and date is updated form.
Data is easy to understand: As data is managed according to the needs of the user and it is in very
easy format so that you have no difficulty in using the data through database management system
Components of Database Management system:
Data: is the unprocessed fact.
DBMS: is a collection of software (tool) that is used to manage the database and its user.
Hardware: It consists of secondary storage disks on which the database resides.
People: this component is composed of the people in organization that are responsible or play a
role designing, implementing, managing, administrating and using the resource in the database.
Hardware: are component that one touch and feel.
Procedure: this is rules and regulation on how to use and design the database.
3
Benefits of the Database
Data can be shared: two or more users can access and use same data instead of storing data in
redundant manner for each user.
Improved accessibility of data: by using structured query languages, the users can easily access
data without programming experience.
Redundancy can be reduced: isolated data is integrated in database to decrease the redundant data
stored at different applications.
Quality data can be maintained: the different integrity constraints in the database approach will
maintain the quality leading to better decision making
Inconsistency can be avoided: controlled data redundancy will avoid inconsistency of the data in
the database to some extent.
Transaction support can be provided: basic demands of any transaction support systems are
implanted in a full scale DBMS.
Integrity can be maintained: data at different applications will be integrated together with
additional constraints to facilitate shared data resource.
Security majors can be enforced: the shared data can be secured by having different levels of
clearance and other data security mechanisms.
Improved decision support: the database will provide information useful for decision making.
Standards can be enforced: the different ways of using and dealing with data by different unite of
an organization can be balanced and standardized by using database approach.
Compactness: since it is an electronic data handling method, the data is stored compactly (no
voluminous papers).
Speed: data storage and retrieval is fast as it will be using the modern fast computer systems.
Less labor: unlike the other data handling methods, data maintenance will not demand much
resource.
Centralized information control: since relevant data in the organization will be stored at one
repository, it can be controlled and managed at the central level.
Database Languages
So far, we have got to know about database, database management system descriptions and the
components of a database system. In this section, it is explained how 'a data gets into a database system'
and 'how the information gets to the users'. More correctly formulated the following questions will be
answered:
How does an application interact with a database management system?
How does a user look at a database system?
How can a user query a database system and view the results in his/her application?
4
Data Definition Language (DDL)
For describing data and data structures a suitable description tool, a data definition language (DDL), is
needed. With this help a data scheme can be defined and also changed later.
Typical DDL operations (with their respective keywords in the structured query language SQL):
Creation of tables and definition of attributes (CREATE TABLE ...)
Change of tables by adding or deleting attributes (ALTER TABLE …)
Deletion of whole table including content (!) (DROP TABLE …) etc
Data Manipulation Language (DML)
Additionally a language for the descriptions of the operations with data like store, search, read, change,
etc. the so-called data manipulation, is needed. Such operations can be done with a data manipulation
language (DML). Within such languages keywords like insert, modify, update, delete, select, etc. are
common.
Typical DML operations (with their respective keywords in the structured query language SQL):
Add data (INSERT)
Change data (UPDATE)
Delete data (DELETE)
Query data (SELECT) Etc….
Data Control Language (DCL)
A Data Control Language (DCL) is a computer language and a subset of SQL, used to control access to
data in a database.
Examples of DCL commands include:
GRANT used to allow specified users to perform specified tasks.
REVOKE used to cancel previously granted or denied permissions
Data Definition
A data dictionary is a "centralized repository of information about data such as meaning, relationships to
other data, origin, usage, and format. The term may have one of several closely related meanings
pertaining to database and database management system:
a document describing a database or collection of databases
an integral component of a DBMS that is required to determine its structure
a piece of middleware that extends or supplants the native data dictionary of a DBMS
Application of Database
Application and format of database varies from organization to organization and from person to person.
Database can be applied:
Bank, airlines,
insurance
5
Metrology
Geographical Analysis etc.
Database Designer, administrator and User
As people are one of the components in database system environment, there are group of roles played by
different stakeholders of the designing and operation of a database system.
1. Database Administrator (DBA)
Responsible to oversee, control and manage the database resources (the
database itself, the DBMS and other related software)
Authorizing access to the database
Coordinating and monitoring the use of the database
Responsible for determining and acquiring hardware and software resources
Accountable for problems like poor security, poor performance of the system
Involves in all steps of database development
We can have further classifications of this role in big organizations having huge amount of data and user
requirement.
Data Administrator (DA): is responsible on management of data resources. DA involves
in database planning, development and maintenance of standards policies and procedures
at the logical design.
Database Administrator (DBA): is more technically oriented role. DBA responsible on
the physical realization of the database. Involves in physical design, implementation,
security and integrity control of the database.
2. Database Designer (DBD)
Identifies the data to be stored and choose the appropriate structures to represent and
store the data.
Should understand the user requirement and should choose how the user views the
database.
Involve on the design phase before the implementation of the database system.
We have two distinctions of database designers, one involving in the logical and conceptual design and
another involving in physical design.
1. Logical and Conceptual DBD
Identifies data (entity, attributes and relationship) relevant to the organization
Identifies constraints on each data.
Understand data and business rules in the organization.
Sees the database independent of any data model at conceptual level and consider one
specific data mode l at logical design phase.
6
2. Physical DBD
Take logical design specification as input and decide how it should be physically
realized.
Map the logical data model on the specified DBMS with respect to tables and
integrity constraints. (DBMS dependent designing).
Select specific storage structure and access path to the database.
Design security measures required on the database.
3. Application Programmer and Systems Analyst
System analyst determines the user requirement and how the user wants to view the
database.
The application programmer implements these specifications as programs; code, test,
debug, document and maintain the application program.
Determines the interface on how to retrieve, insert, update and delete data in the
database
The application could use any high level programming language according to the
availability, the facility and the required service.
4. End Users
Workers, whose job requires accessing the database frequently for various purposes, in general user
of the database can be deleting, updating, inserting or modifying data from the database.
There are different group of users in this category.
1. Naïve Users:
Sizable proportion of users.
Unaware of the DBMS.
Only access the database based on their access level and demand
Use standard and pre-specified types of queries.
2. Sophisticated Users
Are users familiar with the structure of the Database and facilities of the DBMS.
Have complex requirements
Have higher level queries
Are most of the time engineers, scientists, business analysts, etc
3. Casual Users
Users who access the database occasionally.
Need different information from the database each time.
Use sophisticated database queries to satisfy their needs.
Casual user most of the time middle to high level managers.
7
Flat-file database versus relational database
A lot of the sites that we visit on the web today are generated by a script of some description, and a great
deal of them will use a database in one form or another. Like it or loathe it, building pages dynamically
from databases is a technique that is here to stay. There are two main types of database; flat-file and
relational. Which is the best one to use for a particular job will depend on factors such as the type and the
amount of data to be processed; not to mention how frequently it will be used.
Flat-file Database
The flat-file style of database are ideal for small amounts of data that needs to be human readable or
edited by hand. Essentially all they are made up of is a set of strings in one or more files that can be
parsed to get the information they store; great for storing simple lists and data values, but can get
complicated when you try to replicate more complex data structures. That's not to say that it is impossible
to store complex data in a flat-file database; just that doing so can be more costly in time and processing
power compared to a relational database. The methods used for storing the more complex data types, are
also likely to render the file unreadable and un-editable to anyone looking after the database.
The typical flat-file database is split up using a common delimiter. If the data is simple enough, this could
be a comma, but more complex strings are usually split up using tabs, new lines or a combination of
characters not likely to be found in the record itself.
One of the main problems with using flat files for even a semi-active database is the fact that it is very
prone to corruption. There is no inherent locking mechanism that detects when a file is being used or
modified, and so this has to be done on the script level. Even if care is taken to lock and unlock the file on
each access, a busy script can cause a "race condition" and it is possible for a file to be wiped clean by
two or more processes that are fighting for the lock; the timing of your file locks will become more and
more important as a site gets busy.
Relational Database
The relational databases such as MySQL, Microsoft SQL Server and Oracle, have a much more logical
structure in the way that it stores data. Tables can be used to represent real world objects, with each fields
acting like an attribute. For example, a table called books could have the columns title, author and ISBN,
which describe the details of each book where each row in the table is a new book.
The "relation" comes from the fact that the tables can be linked to each other, for example the author of a
book could be cross-referenced with the authors table (assuming there was one) to provide more
information about the author. These kind of relations can be quite complex in nature, and would be hard
to replicate in the standard flat-file format.
One major advantage of the relational model is that, if a database is designed efficiently, there should be
no duplication of any data; helping to maintain database integrity. This can also represent a huge saving
in file size, which is important when dealing with large volumes of data. Having said that, joining large
8
tables to each other to get the data required for a query can be quite heavy on the processor; so in some
cases, particularly when data is read only, it can be beneficial to have some duplicate data in a relational
database.
Relational databases also have functions "built in" that help them to retrieve, sort and edit the data in
many different ways. These functions save script designers from having to worry about filtering out the
results that they get, and so can go quite some way to speeding up the development and production of
web applications.