
Introduction to Database Management Systems

The document introduces Database Management Systems (DBMS), explaining data, databases, and the role of DBMS in managing data efficiently. It outlines the characteristics, advantages, and disadvantages of DBMS, compares it with traditional file systems, and describes different types of databases including centralized, distributed, relational, NoSQL, cloud, object-oriented, hierarchical, and network databases. Additionally, it covers the concept of RDBMS and its foundational principles based on the relational model proposed by E.F. Codd.

Uploaded by

rockboyaim

INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

What is Data?

Data is a collection of distinct small units of information. It can take a variety of forms, such as text,
numbers, media, or bytes, and it can be stored on paper or in electronic memory.

The word 'data' originates from the Latin word 'datum', which means 'a single piece of information'; 'data' is
its plural.

In computing, data is information that has been translated into a form suitable for efficient movement and
processing. Data is interchangeable.

What is a Database?

A database is a collection of interrelated data organized so that the data can be retrieved, inserted, and
deleted efficiently. The data is organized in the form of tables, schemas, views, reports, etc.

For example: A college database organizes data about the admin, staff, students, and faculty.

Using the database, you can easily retrieve, insert, and delete this information.

Database Management System

o A database management system (DBMS) is software used to manage a database. For
example, MySQL and Oracle are popular database management systems used in many different
applications.
o A DBMS provides an interface for operations such as creating a database, creating tables in it, storing
data, updating data, and much more.
o It provides protection and security for the database. When there are multiple users, it also maintains
data consistency.

A DBMS allows users to perform the following tasks:

o Data Definition: Creating, modifying, and removing the definitions that specify the
organization of data in the database.
o Data Update: Inserting, modifying, and deleting the actual data in the
database.
o Data Retrieval: Retrieving data from the database so that applications can use it
for various purposes.
o User Administration: Registering and monitoring users, maintaining data integrity,
enforcing data security, handling concurrency control, monitoring performance, and recovering
information corrupted by unexpected failures.
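
The first three tasks above can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the student table and its rows are assumed for the example, and SQLite has no user accounts, so user administration is not shown.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that organizes the data.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")

# Data update: insert, modify, and delete actual rows.
conn.execute("INSERT INTO student (id, name, course) VALUES (1, 'Ajeet', 'B.Tech')")
conn.execute("INSERT INTO student (id, name, course) VALUES (2, 'Aryan', 'C.A')")
conn.execute("UPDATE student SET course = 'MCA' WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 1")

# Data retrieval: read the data back for use by an application.
rows = conn.execute("SELECT id, name, course FROM student ORDER BY id").fetchall()
print(rows)
```

Only the row that was inserted, then updated, and never deleted remains in the result.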

Characteristics of DBMS

o It uses a digital repository established on a server to store and manage information.
o It provides a clear and logical view of the processes that manipulate data.
o It includes automatic backup and recovery procedures.
o It supports ACID properties, which keep data in a consistent state in case of failure.
o It can reduce the complexity of relationships between data.
o It supports manipulation and processing of data.
o It provides data security.
o It allows the database to be viewed from different viewpoints according to the requirements of the user.

Advantages of DBMS

o Controls data redundancy: A DBMS can control data redundancy because it stores all the data in a
single database rather than in many separate files.
o Data sharing: In a DBMS, the authorized users of an organization can share the data among multiple
users.
o Easy maintenance: It is easy to maintain due to the centralized nature of the database
system.
o Reduced time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up data
and restore it after hardware or software failures.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces
and application program interfaces.

Disadvantages of DBMS

o Cost of hardware and software: Running DBMS software requires a high-speed processor and a large
amount of memory.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because in most organizations all the
data is stored in a single database; if that database is damaged by a power
failure or database corruption, the data may be lost forever.
DIFFERENCE BETWEEN FILE SYSTEM AND DBMS
File System Approach

File-based systems were an early attempt to computerize manual record keeping. This is also called the
traditional approach. It was decentralized: each department stored and controlled
its own data with the help of a data processing specialist. The main role of the data processing specialist was
to create the necessary computer file structures, manage the data within those structures, and design
application programs that create reports based on the file data.

In the above figure:

Consider the example of a student file system. The student file contains information about the
student (i.e., roll no, student name, course, etc.). Similarly, a subject file contains information
about the subjects, and a result file contains information about the results.

Some fields are duplicated in more than one file, which leads to data redundancy. To overcome this
problem, we need a centralized system, i.e., the DBMS approach.

DBMS:

In the database approach, a well-organized collection of data that are related in a meaningful way is stored
only once in the system but can be accessed by different users. The operations performed by a
DBMS include insertion, deletion, selection, sorting, etc.

In the above figure, duplication of data is reduced due to centralization of data.

The differences between the DBMS approach and the file system approach are as follows:

Basis | DBMS Approach | File System Approach
------|---------------|---------------------
Meaning | A DBMS is a collection of data in which the user is not required to write the procedures for managing it. | A file system is a collection of data in which the user has to write the procedures for managing the data.
Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed across many files, possibly in different formats, so it isn't easy to share data.
Data Abstraction | A DBMS gives an abstract view of data that hides the details. | The file system exposes the details of data representation and storage.
Security and Protection | A DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system.
Recovery Mechanism | A DBMS provides a crash recovery mechanism, i.e., it protects the user from system failure. | The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file will be lost.
Manipulation Techniques | A DBMS contains a wide variety of sophisticated techniques to store and retrieve data. | The file system can't store and retrieve data efficiently.
Concurrency Problems | A DBMS handles concurrent access to data using some form of locking. | In the file system, concurrent access causes many problems, such as one user redirecting the file while another is deleting or updating some information.
Where to use | The database approach is used in large systems that interrelate many files. | The file system approach is used in small systems with few interrelated files.
Cost | The database system is expensive to design. | The file system approach is cheaper to design.
Data Redundancy and Inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | The files and application programs are created by different programmers, so there is a lot of duplicated data, which may lead to inconsistency.
Structure | The database structure is complex to design. | The file system approach has a simple structure.
Data Independence | Data independence exists, and it is of two types: logical data independence and physical data independence. | In the file system approach, there is no data independence.
Integrity Constraints | Integrity constraints are easy to apply. | Integrity constraints are difficult to implement in a file system.
Examples | Oracle, SQL Server, Sybase, etc. | COBOL, C++, etc.


TYPES OF DATABASES
There are various types of databases used for storing different varieties of data:

1) Centralized Database
It is the type of database that stores data in a single, centralized database system. It lets users access the
stored data from different locations through several applications. These applications include an
authentication process so that users access data securely. An example of a centralized database is a central
library that holds the central database for every library in a college/university.

Advantages of Centralized Database


o It decreases the risk in data management, i.e., manipulation of data will not affect the core data.
o Data consistency is maintained because data is managed in a central repository.
o It provides better data quality, which enables organizations to establish data standards.
o It is less costly because fewer vendors are required to handle the data sets.

Disadvantages of Centralized Database


o The size of a centralized database is large, which increases the response time for fetching data.
o It is not easy to update such an extensive database system.
o If the server fails, the entire data set may be lost, which could be a huge loss.

2) Distributed Database

Unlike a centralized database system, a distributed system distributes data among different database
systems of an organization. These database systems are connected via communication links. Such links help
end users access the data easily. Examples of distributed databases are Apache Cassandra, HBase,
Ignite, etc.

We can further divide a distributed database system into:


o Homogeneous DDB: The database systems run on the same operating system, use
the same application process, and run on the same hardware devices.
o Heterogeneous DDB: The database systems run on different operating systems, under
different application procedures, and on different hardware devices.

Advantages of Distributed Database


o Modular development is possible in a distributed database, i.e., the system can be expanded by
including new computers and connecting them to the distributed system.
o One server failure will not affect the entire data set.

3) Relational Database

This database is based on the relational data model, which stores data in rows (tuples) and
columns (attributes) that together form a table (relation). A relational database uses SQL for storing,
manipulating, and maintaining the data. E.F. Codd proposed the relational model in 1970. Each table in the
database has a key that makes each row unique. Examples of relational databases are
MySQL, Microsoft SQL Server, Oracle, etc.

Properties of Relational Database

Transactions in a relational database have four commonly known properties, known as the ACID properties,
where:

A means Atomicity: A data operation completes either entirely or not at all. It
follows the 'all or nothing' strategy. For example, a transaction will either be committed or aborted.

C means Consistency: Any operation over the data must take it from one valid state to another.
For example, in a transfer between accounts, the total balance before and after the transaction should be the
same, i.e., it should remain conserved.

I means Isolation: Several users can access data in the database at the same time, so
their operations should remain isolated from one another. For example, when multiple transactions occur at
the same time, the effects of one transaction should not be visible to the other transactions until it completes.

D means Durability: Once an operation completes and the data is committed, the changes
should remain permanent.
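
Atomicity and consistency can be illustrated with a small transfer between two accounts. The sketch below uses Python's built-in sqlite3 module; the account table, the names 'A' and 'B', the balances, and the transfer helper are all assumptions made for the example. A CHECK constraint forbids negative balances, and the connection's context manager rolls the whole transaction back if any statement fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

Calling transfer(conn, 'A', 'B', 500) fails and leaves both balances untouched, while transfer(conn, 'A', 'B', 30) succeeds and keeps the total unchanged. Isolation and durability depend on concurrent connections and on-disk storage, so they are not shown here.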

4) NoSQL Database
NoSQL (Non-SQL/Not Only SQL) is a type of database used for storing a wide range of data sets. It is not a
relational database, as it stores data not only in tabular form but in several different ways. It came into
existence when the demand for building modern applications increased. Thus, NoSQL offers a wide
variety of database technologies in response to these demands. We can further divide NoSQL databases into
the following four types:

a. Key-value storage: The simplest type of database storage, in which every item is stored as a
key (or attribute name) together with its value.
b. Document-oriented database: Stores data as JSON-like documents. It
helps developers store data in the same document-model format used in the application
code.
c. Graph databases: Store vast amounts of data in a graph-like structure. Social networking
websites most commonly use graph databases.
d. Wide-column stores: Similar to the data representation in relational databases, except that data is stored
in large columns together instead of in rows.
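
The four data shapes can be pictured with plain Python structures. This is only a rough sketch: the dictionaries and lists below are stand-ins and not the API of any real NoSQL product, and every key, name, and value is hypothetical.

```python
import json

# Key-value: each item is an opaque value looked up by its key.
key_value = {"session:42": "user=ajeet;theme=dark"}

# Document: a JSON-like nested record, shaped like the application object.
document = {"_id": 1, "name": "Ajeet", "courses": [{"code": "DB101", "grade": "A"}]}

# Graph: nodes plus labeled edges between them.
graph_edges = [("Ajeet", "FRIEND_OF", "Aryan"), ("Aryan", "FRIEND_OF", "Mahesh")]

# Wide-column: each row key maps to its own, possibly different, set of columns.
wide_column = {
    "student:1": {"name": "Ajeet", "age": 24},
    "student:2": {"name": "Aryan", "city": "Pune"},
}


def friends_of(person):
    """Return the people directly connected to person by a FRIEND_OF edge."""
    return [dst for src, rel, dst in graph_edges if src == person and rel == "FRIEND_OF"]


print(json.dumps(document))
print(friends_of("Ajeet"))
```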

Advantages of NoSQL Database


o It enables good productivity in application development because data does not have to be stored in a
structured format.
o It is a better option for managing and handling large data sets.
o It provides high scalability.
o Users can quickly access data from the database by key.

5) Cloud Database

A cloud database is a type of database in which data is stored in a virtual environment and runs on a cloud
computing platform. It provides users with various cloud computing services (SaaS, PaaS, IaaS, etc.) for
accessing the database. There are numerous cloud platforms; some of the options are:

o Amazon Web Services (AWS)
o Microsoft Azure
o Kamatera
o phoenixNAP
o ScienceSoft
o Google Cloud SQL, etc.
6) Object-oriented Databases
An object-oriented database uses the object-based data model approach for storing data in the database system.
The data is represented and stored as objects, similar to the objects used in object-oriented
programming languages.

7) Hierarchical Databases

It is the type of database that stores data in the form of parent-children relationship nodes. Here, it organizes
data in a tree-like structure.

Data is stored in the form of records that are connected via links. Each child record in the tree has
only one parent. On the other hand, each parent record can have multiple child records.

8) Network Databases

It is the type of database that follows the network data model. Here, data is represented as
nodes connected via links. Unlike the hierarchical database, it allows each record to
have multiple child and parent nodes, forming a generalized graph structure.
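
The difference between the hierarchical and network models can be sketched with plain Python dictionaries. The record names below (College, CS Dept, Chess Club, and so on) are hypothetical, and the structures only imitate the shape of the data, not a real hierarchical or network DBMS.

```python
# Hierarchical model: each child record has exactly one parent.
parent_of = {"CS Dept": "College", "Math Dept": "College", "Ajeet": "CS Dept"}


def path_to_root(record):
    """Follow the single parent link upward until the root is reached."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


# Network model: a record may have several parents (owners).
parents_of = {
    "Ajeet": ["CS Dept", "Chess Club"],
    "CS Dept": ["College"],
    "Chess Club": ["College"],
}

print(path_to_root("Ajeet"))
print(parents_of["Ajeet"])
```

In the hierarchical structure there is exactly one path from any record to the root; in the network structure a record such as "Ajeet" can be reached through more than one parent.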
RDBMS (RELATIONAL DATABASE MANAGEMENT SYSTEM)
RDBMS stands for Relational Database Management System.

Many widely used database management systems, such as MS SQL Server, IBM DB2, Oracle, MySQL, and
Microsoft Access, are based on RDBMS. SQL is the language used to query them.

It is called Relational Database Management System (RDBMS) because it is based on the relational model
introduced by E.F. Codd.

How it works

Data is represented in terms of tuples (rows) in RDBMS.

A relational database is the most commonly used database. It contains several tables, and each table has its
primary key.

Due to a collection of an organized set of tables, data can be accessed easily in RDBMS.

Brief History of RDBMS

Between 1970 and 1972, E.F. Codd published papers proposing the relational database model.

RDBMS is originally based on E.F. Codd's relational model invention.

Following are the various terminologies of RDBMS:

What is table/Relation?

Everything in a relational database is stored in the form of relations. An RDBMS uses tables to
store data. A table is a collection of related data entries and contains rows and columns to store data. Each
table represents a real-world object, such as a person, place, or event, about which information is
collected. The organized collection of data in relational tables is known as the logical view of the
database.
Properties of a Relation:

o Each relation has a unique name by which it is identified in the database.
o A relation does not contain duplicate tuples.
o The tuples of a relation have no specific order.
o All attributes in a relation are atomic, i.e., each cell of a relation contains exactly one value.

A table is the simplest example of data stored in RDBMS.

Let's see the example of the student table.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC

What is a row or record?

A row of a table is also called a record or tuple. It contains the specific information of each entry in the
table. It is a horizontal entity in the table. For example, the above table contains 5 records.

Properties of a row:

o No two tuples are identical to each other in all their entries.
o All tuples of the relation have the same format and the same number of entries.
o The order of the tuples is irrelevant. They are identified by their content, not by their position.

Let's see one record/row in the table.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech

What is a column/attribute?

A column is a vertical entity in the table which contains all information associated with a specific field in a
table. For example, "name" is a column in the above table which contains all information about a student's
name.

Properties of an Attribute:

o Every attribute of a relation must have a name.
o Null values are permitted for the attributes.
o A default value can be specified for an attribute; it is inserted automatically if no other value is
specified for that attribute.
o The attributes that uniquely identify each tuple of a relation form the primary key.

Name
Ajeet
Aryan
Mahesh
Ratan
Vimal

What is data item/Cells?

The smallest unit of data in the table is the individual data item. It is stored at the intersection of tuples and
attributes.

Properties of data items:

o Data items are atomic.
o The data items for an attribute should be drawn from the same domain.

In the example below, the data items in the student table include Ajeet, 24, B.Tech, etc.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech

Degree:

The total number of attributes that comprise a relation is known as the degree of the table.

For example, the student table has 4 attributes, and its degree is 4.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC

Cardinality:

The total number of tuples at any one time in a relation is known as the table's cardinality. The relation
whose cardinality is 0 is called an empty table.

For example, the student table has 5 rows, and its cardinality is 5.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC
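
Degree and cardinality can be computed directly from a table. The sketch below loads the student table into an in-memory SQLite database with Python's sqlite3 module; the column types are assumptions, and the course of the first row is taken to be B.Tech, following the text's mention of "Btech".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, course TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Ajeet", 24, "B.Tech"),
        (2, "Aryan", 20, "C.A"),
        (3, "Mahesh", 21, "BCA"),
        (4, "Ratan", 22, "MCA"),
        (5, "Vimal", 26, "BSC"),
    ],
)

cursor = conn.execute("SELECT * FROM student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)
print(degree, cardinality)
```

The degree changes only when the table definition changes, whereas the cardinality changes with every inserted or deleted row.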

Domain:

The domain refers to the possible values each attribute can contain. It can be specified using standard data
types such as integers, floating-point numbers, etc. For example, an attribute named Marital_Status may be
limited to the values married or unmarried.

NULL Values

A NULL value specifies that a field was left blank when the record was created. It is different
from a value of zero or a field that contains spaces.
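
The distinction between NULL and zero can be checked with a short sqlite3 sketch. The table and rows are assumed for illustration: one student is stored without an age, the other with an age of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ajeet')")          # age left blank, so NULL
conn.execute("INSERT INTO student (id, name, age) VALUES (2, 'Aryan', 0)")  # age is the number zero

# NULL is matched with IS NULL, never with "= 0".
null_ids = [row[0] for row in conn.execute("SELECT id FROM student WHERE age IS NULL")]
zero_ids = [row[0] for row in conn.execute("SELECT id FROM student WHERE age = 0")]
print(null_ids, zero_ids)
```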

Data Integrity

The following categories of data integrity exist in every RDBMS:

Entity integrity: It specifies that there should be no duplicate rows in a table.

Domain integrity: It enforces valid entries for a given column by restricting the type, the format, or the
range of values.

Referential integrity: It specifies that rows which are referenced by other records cannot be deleted.

User-defined integrity: It enforces some specific business rules defined by users. These rules are different
from the entity, domain, or referential integrity.
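
The first three categories map onto common SQL constraints. The sketch below uses Python's sqlite3 module; the tables, the age range of 16 to 100, and the rejected helper are assumptions for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,                      -- entity integrity
        age INTEGER CHECK (age BETWEEN 16 AND 100),  -- domain integrity
        course TEXT REFERENCES course(code)          -- referential integrity
    )
""")
conn.execute("INSERT INTO course VALUES ('BCA')")
conn.execute("INSERT INTO student VALUES (1, 21, 'BCA')")
conn.commit()


def rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

With this setup, inserting a second student with id 1, a student aged 5, or a student enrolled in a course that does not exist is rejected, and so is deleting the course 'BCA' while a student still references it.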
DIFFERENCE BETWEEN DBMS AND RDBMS

No. | DBMS | RDBMS
----|------|------
1) | DBMS applications store data as files. | RDBMS applications store data in tabular form.
2) | In DBMS, data is generally stored in either a hierarchical form or a navigational form. | In RDBMS, the tables have an identifier called the primary key, and the data values are stored in the form of tables.
3) | Normalization is not present in DBMS. | Normalization is present in RDBMS.
4) | DBMS does not apply any security with regard to data manipulation. | RDBMS defines integrity constraints for the purpose of the ACID (Atomicity, Consistency, Isolation, and Durability) properties.
5) | DBMS uses a file system to store data, so there is no relation between the tables. | In RDBMS, data values are stored in the form of tables, so the relationships between these data values are stored in the form of tables as well.
6) | DBMS has to provide some uniform methods to access the stored information. | RDBMS supports a tabular structure of the data and relationships between them to access the stored information.
7) | DBMS does not support distributed databases. | RDBMS supports distributed databases.
8) | DBMS is meant for small organizations and deals with small amounts of data. It supports a single user. | RDBMS is designed to handle large amounts of data. It supports multiple users.
9) | Examples of DBMS are file systems, XML, etc. | Examples of RDBMS are MySQL, PostgreSQL, SQL Server, Oracle, etc.
DBMS ARCHITECTURE
o The design of a DBMS depends on its architecture. The basic client/server architecture is used to deal
with a large number of PCs, web servers, database servers, and other components that are connected
by networks.
o The client/server architecture consists of many PCs and workstations that are connected via the
network.
o DBMS architecture depends on how users connect to the database to have their requests handled.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, however, database architecture is of
two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user works directly on
the DBMS.
o Any changes made here are applied directly to the database itself. It doesn't provide a handy tool for
end users.
o The 1-Tier architecture is used for developing local applications, where programmers can
communicate directly with the database for a quick response.

2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier architecture,
applications on the client end communicate directly with the database on the server side. For this
interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and transaction
management.
o To communicate with the DBMS, the client-side application establishes a connection with the server
side.
Fig: 2-tier Architecture

3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In this architecture,
the client can't communicate directly with the server.
o The application on the client end interacts with an application server, which in turn communicates
with the database system.
o The end user has no idea about the existence of the database beyond the application server. The database
also has no idea about any other user beyond the application.
o The 3-Tier architecture is used for large web applications.
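
The separation of the three tiers can be sketched in a single Python file, with each tier reduced to a few lines. This is only an illustration: in a real deployment the tiers run as separate processes or machines, and the function names get_student_name and render_profile are hypothetical.

```python
import sqlite3

# Data tier: the database itself (an in-memory SQLite database in this sketch).
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
_db.execute("INSERT INTO student VALUES (1, 'Ajeet')")


# Application tier: the only code that talks to the database.
def get_student_name(student_id):
    row = _db.execute("SELECT name FROM student WHERE id = ?", (student_id,)).fetchone()
    return row[0] if row else None


# Presentation tier: the client calls the application tier and never sees SQL.
def render_profile(student_id):
    name = get_student_name(student_id)
    return f"Student: {name}" if name else "Student not found"


print(render_profile(1))
```

Because the client only calls render_profile, the database could be replaced or moved without any change on the client side.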
THREE SCHEMA ARCHITECTURE
o The three schema architecture is also called ANSI/SPARC architecture or three-level architecture.
o This framework is used to describe the structure of a specific database system.
o The three schema architecture is also used to separate the user applications and physical database.
o The three schema architecture contains three levels. It breaks the database down into three different
categories.

The three-schema architecture is as follows:

In the above diagram:

o It shows the DBMS architecture.
o Mapping is used to transform requests and responses between the various levels of the
architecture.
o Mapping is not well suited to a small DBMS because it takes more time.
o In External / Conceptual mapping, the request is transformed from the external level to the
conceptual schema.
o In Conceptual / Internal mapping, the DBMS transforms the request from the conceptual to the internal level.

Objectives of Three schema Architecture

The main objective of the three-level architecture is to enable multiple users to access the same data with a
personalized view while storing the underlying data only once. Thus it separates the user's view from the
physical structure of the database. This separation is desirable for the following reasons:

o Different users need different views of the same data.
o The way in which a particular user needs to see the data may change over time.
o The users of the database should not have to worry about the physical implementation and internal workings
of the database, such as data compression and encryption techniques, hashing, optimization of the
internal structures, etc.
o All users should be able to access the same data according to their requirements.
o DBA should be able to change the conceptual structure of the database without affecting the user's
o The internal structure of the database should be unaffected by changes to the physical aspects of storage.

1. Internal Level

o The internal level has an internal schema which describes the physical storage structure of the
database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It defines how the data will be stored in blocks.
o The physical level describes complex low-level data structures in detail.

The internal level is generally concerned with the following activities:

o Storage space allocation.
  For example: B-Trees, hashing, etc.
o Access paths.
  For example: specification of primary and secondary keys, indexes, pointers, and sequencing.
o Data compression and encryption techniques.
o Optimization of internal structures.
o Representation of stored fields.

2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level. Conceptual level is
also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also describes what
relationship exists among those data.
o At the conceptual level, internal details such as the implementation of data structures are hidden.
o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas, sometimes called subschemas. A
subschema describes a particular view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is interested in and hides
the rest of the database from that user group.
o The view schema describes the end user's interaction with the database system.
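
In SQL systems, an external schema is commonly realized with views. The sketch below, using Python's sqlite3 module, defines two views over one stored table; the view names registrar_view and directory_view are hypothetical, as is the choice of which columns each user group may see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, course TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Ajeet', 24, 'B.Tech')")

# Two external views over the same stored table.
conn.execute("CREATE VIEW registrar_view AS SELECT id, name, course FROM student")
conn.execute("CREATE VIEW directory_view AS SELECT name FROM student")

registrar = conn.execute("SELECT * FROM registrar_view").fetchall()
directory = conn.execute("SELECT * FROM directory_view").fetchall()
print(registrar, directory)
```

Each user group sees only the columns exposed by its view, while the data itself is stored once.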
DATA INDEPENDENCE
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence

o Logical data independence refers to the ability to change the conceptual schema
without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user view of the data is not
affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence

o Physical data independence can be defined as the capacity to change the internal schema without
having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the conceptual structure
of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
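
Both kinds of independence can be observed in a small sqlite3 sketch. The table, the view student_names, and the index name are assumptions for illustration: adding an index stands for a change at the internal level, and adding a column stands for a change at the conceptual level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ajeet', 'B.Tech')")
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")

query = "SELECT id, name FROM student_names WHERE name = 'Ajeet'"
before = conn.execute(query).fetchall()

# Physical change: add an index (internal level). The query text is unchanged.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_index = conn.execute(query).fetchall()

# Logical change: add a column to the table (conceptual level). The view is unchanged.
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")
after_column = conn.execute(query).fetchall()

print(before == after_index == after_column)
```

The same query against the same view returns the same rows after both changes, which is the practical meaning of data independence for an application.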
DATA MODEL SCHEMA AND INSTANCE
o The data which is stored in the database at a particular moment of time is called an instance of the
database.
o The overall design of a database is called schema.
o A database schema is the skeleton structure of the database. It represents the logical view of the
entire database.
o A schema contains schema objects like table, foreign key, primary key, views, columns, data types,
stored procedure, etc.
o A database schema can be represented using a visual diagram. That diagram shows the database
objects and their relationships with each other.
o A database schema is designed by the database designers to help programmers whose software will
interact with the database. The process of designing a database is called data modeling.

A schema diagram can display only some aspects of a schema, such as the names of record types, data types, and
constraints. Other aspects can't be specified through the schema diagram. For example, the given figure
shows neither the data type of each data item nor the relationships among the various files.
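
The difference between schema and instance can be made concrete with sqlite3. In the sketch below, the helper names schema_columns and instance are hypothetical; the schema is read with SQLite's PRAGMA table_info, and the instance is simply the set of rows stored at that moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def schema_columns():
    """Schema: the column names and declared types, independent of stored rows."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def instance():
    """Instance: the rows stored at this particular moment."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()


schema_before = schema_columns()
instance_before = instance()
conn.execute("INSERT INTO student VALUES (1, 'Ajeet')")
schema_after = schema_columns()
instance_after = instance()
print(schema_before == schema_after, instance_before, instance_after)
```

Inserting a row changes the instance but leaves the schema exactly as it was.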
