Unit - 1
Unit - 1
A Database Management System (DBMS) is a collection of interrelated data and a set ofprograms
to access those data. The collection of data, usually referred to as the database, contains
information relevant to an enterprise.
The primary goal of a DBMS is to provide a way to store and retrieve database information that
is both convenient and efficient.
A database is a collection of related information stored, so that it is available to many users for
different purposes.
Management system is a collection of programs that enables users to create and maintain the
database.
DBMS can also be defined as an interface between the application program and the operating
system to access or manipulate the database.
DBMS is a software system that allows access to data contained in the [Link] objective is to
provide a convenient and effective method of defining, storing and retrieving the information
contained in the database.
The DBMS interfaces with application programs so that the data contained in the database can be
used by multiple application programs and users.
The primary goal of a DBMS is to provide an environment that is congenial and efficient to
retrieve and store [Link] allows users to store, update and retrieve data in abstract terms.
APPLICATIONS OF DBMS
----------------------------------------------------------------
2. Reduction in Inconsistency
3. Sharing of Data
4. Enforcement of Standards
8. Efficient System
----------------------------------------------------------------
1.6 DISADVANTAGES OF DBMS
2. Cost of Software
3. Cost of Hardware
----------------------------------------------------------------
Key features:
Simple Data Storage: Data is stored in files, and you access it by opening the
file.
File Organization: Files are organized into folders (directories), making it easy to
store and retrieve them.
File Formats: Data is saved in different formats like text, images, or documents.
Structured Data Storage: Data is stored in tables, rows, and columns, making it
easy to organize and access.
Data Integrity: DBMS ensures consistency and accuracy of data across tables
and records.
Advanced Querying: You can use SQL (Structured Query Language) to perform
complex queries and retrieve specific data.
Multi-User Access: DBMS supports concurrent access from multiple users,
ensuring smooth data handling in large organizations.
Simplicity: File systems are easy to use and understand. They are great for
simple data storage and retrieval tasks, especially when no complex queries are
required.
Low Cost: Setting up and using a file system is relatively inexpensive, as no
specialized software (other than the operating system) is needed.
Flexibility: You can store any type of file in a file system, from text files to
multimedia.
The three-schema architecture of DBMS hides the details of the database from
the user. Its other name is Three Levels of Abstraction. The database
administrator should be able to change the structure of the database according to
need without affecting the user’s view. This effective three-schema architecture
has three layers, which are given below
This is the highest level of database abstraction. The View / External level of
database abstraction refers to how data is presented to specific users or groups
of users. Instead of showing the entire database, it provides tailored views that
display only the information relevant to the user.
In simple words:
DDL stands for Data Definition Language. It is used to create and update the
schema in the database. It is used to update the database schema.
Create: This command is used to create the structure of new objects in the
database.
Alter: Alter command is used to change the structure of the database.
Drop: The drop command is used to delete objects from the database.
Truncate: Truncate command is used to remove all records from a table.
Rename: Rename command is used to rename an object.
2. Data Manipulation Language (DML)
DCL stands for Data Control Language. DCL is used to retrieve the saved data.
The DCL execution is transactional. It also has rollback parameters.
Transaction control language is used to update the changes made by the data
manipulation (DML) statement.
A Database User is defined as a person who interacts with data daily for updating,
reading, and modifying the given data. Database users can access and retrieve data from
the database through the Database Management System (DBMS) applications and
interfaces.
Database users are categorized based on their interaction with the database. There are
seven types of database users in DBMS. Below mentioned are the types of database
users:
Naïve users are normal users who do not have knowledge of database operations.
They use the database through predefined applications.
They do not write SQL queries.
Examples:
Bank teller
Railway reservation clerk
ATM user
Characteristics:
Limited access
Easy user interface
Perform routine tasks
2. Application Programmers
Examples:
Software developers
Web application developers
Responsibilities:
Examples:
Data analysts
Engineers
Researchers
4. Specialized Users
Examples:
The Database Administrator (DBA) is the person who has complete control over the
database system.
Definition:
Responsibilities of DBA
1. Database Design
3. Data Integrity
5. Performance Monitoring
6. Concurrency Control
7. Database Maintenance
Lets say your account is A and your friend’s account is B, you are transferring 10000
from A to B, the steps of the transaction are:
Transaction Operations
A - Atomicity
C - Consistency
I - Isolation
D - Durability
Transaction States
There are six major types of Transaction states, which are given below
1. Hardware
It includes:
2. Software
It includes:
DBMS software (e.g., MySQL, Oracle, PostgreSQL, SQL Server)
Operating System (Windows, Linux, UNIX)
Application programs that interact with the database
Utility programs for backup, recovery, and performance tuning
3. Data
Data is the core component of any DBMS. It represents real-world information stored in
the database.
Data includes:
User data (actual stored information like student records, employee details)
Metadata (data about data, such as table structure, data types, constraints)
Indexes for fast data access
Without data, a DBMS has no purpose.
4. Procedures
Procedures are the rules and instructions that guide users and administrators on how to
use and manage the database.
They include:
Procedures help maintain consistency, security, and smooth operation of the DBMS.
6. Database Users
Different types of users interact with the DBMS for different purposes.
Functions:
o The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
o The 3-Tier architecture is used in case of large web application.
Components of a DSS
Database Management System (DBMS)
data may come from internal or external database.
Model Management System
use to make decisions. Such models are used for designing manufacturing
facility, analyzing the financial health of an organization, forecasting demand of
a product or service, etc.
Support Tools
interfaces, graphical analysis, error correction mechanism, facilitates the user
interactions with the system.
a. Weak Entity
An entity that depends on another entity called a weak entity. The weak entity
doesn't contain any key attribute of its own. The weak entity is represented by a
double rectangle.
b. Composite Attribute
c. Multivalued Attribute
An attribute can have more than one value. These attributes are known as a
multivalued attribute. The double oval is used to represent multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is
known as one to one relationship.
For example, A female can marry to one male, and a male can marry to one
female
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of
an entity on the right associates with the relationship then this is known as a one-
to-many relationship.
For example, Scientist can invent many inventions, but the invention is done by
the only specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of
an entity on the right associates with the relationship then it is known as a many-
to-one relationship.
For example, Student enrolls for only one course, but a course can have many
students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one
instance of an entity on the right associates with the relationship then it is known
as a many-to-many relationship.
For example, Employee can assign by many projects and project can have many
employees.
o A mapping constraint is a data constraint that expresses the number of entities to which
another entity can be related via a relationship set.
o It is most useful in describing the relationship sets that involve more than two entity sets.
o For binary relationship set R on an entity set A and B, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and
an entity in E2 is associated with at most one entity in E1.
In one-to-many mapping, an entity in E1 is associated with at most one entity in E2, and
an entity in E2 is associated with any number of entities in E1.
In many-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with any number of entities in E1.
Participation Constraints
Total Participation
participation is represented by double lines.
Partial participation
participation is represented by single lines.
However, users often mislead the concept of the elements and the design process of the
ER diagram. Thus, it leads to a complex structure of the ER diagram and certain issues
that does not meet the characteristics of the real-world enterprise model.
Here, we will discuss the basic design issues of an ER database schema in the following
points:
The use of an entity set or attribute depends on the structure of the real-world
enterprise that is being modelled and the semantics associated with its attributes. It
leads to a mistake when the user use the primary key of an entity set as an attribute of
another entity set. Instead, he should use the relationship to do so. Also, the primary key
attributes are implicit in the relationship set, but we designate it in the relationship sets.
It is difficult to examine if an object can be best expressed by an entity set or relationship set. To
understand and determine the right use, the user need to designate a relationship set for
describing an action that occurs in-between the entities. If there is a requirement of
representing the object as a relationship set, then its better not to mix it with the entity set.
Generally, the relationships described in the databases are binary relationships. However, non-
binary relationships can be represented by several binary relationships. For example, we can
create and represent a ternary relationship 'parent' that may relate to a child, his father, as well
as his mother. Thus, it is possible to represent a non-binary relationship by a set of distinct
binary relationships.
The cardinality ratios can become an affective measure in the placement of the
relationship attributes. So, it is better to associate the attributes of one-to-one or one-
to-many relationship sets with any participating entity sets, instead of any relationship
set.
o Keys play an important role in the relational database.
o It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique for each
student. In the PERSON table, passport_number, license_number, SSN are keys since
they are unique for each person.
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset
of a candidate key.
o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o Except for the primary key, the remaining attributes are considered a candidate key. The
candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of
the attributes, like SSN, Passport_Number, License_Number, etc., are considered a
candidate key.
o It is the key used to identify one and only one instance of an entity uniquely. An entity
can contain multiple keys, as we saw in the PERSON table. The key which is most suitable
from those lists becomes a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In
the EMPLOYEE table, we can even select License_Number and Passport_Number as
primary keys since they are also unique.
o For each entity, the primary key selection is based on requirements and developers.
o Foreign keys are the column of the table used to point to the primary key of another
table.
o Every employee works in a specific department in a company, and employee and
department are two different entities. So we can't store the department's information in
the employee table. That's why we link these two tables through the primary key of one
table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in
the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.
For example, employee relation has two attributes, Employee_Id and PAN_No, that act
as candidate keys. In this relation, Employee_Id is chosen as the primary key, so the
other candidate key, PAN_No, acts as the Alternate key.
Whenever a primary key consists of more than one attribute, it is known as a composite
key. This key is also known as Concatenated Key.
For example, the primary key, which is composed of Emp_ID, Emp_role, and Proj_ID, is
large in employee relations. So it would be better to add a new virtual attribute to
identify each tuple in the relation uniquely.
E-R Diagram:
Data Model is the modeling of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction. Therefore, there are following four data
models used for understanding the structure of the database:
1) Relational Data Model: This type of model designs the data in the form of rows and
columns within a table. Thus, a relational model uses tables for representing data and
in-between relationships. Tables are also called relations. This model was initially
described by Edgar F. Codd, in 1969. The relational data model is the widely used model
which is primarily used by commercial data processing applications.
4) Semistructured Data Model: This type of data model is different from the other
three data models (explained above). The semistructured data model allows the data
specifications at places where the individual data items of the same type may have
different attributes sets. The Extensible Markup Language, also known as XML, is widely
used for representing the semistructured data. Although XML was initially designed for
including the markup information to the text document, it gains importance because of
its application in the exchange of data.
Database Schema
A database schema is the skeleton structure that represents the logical view of the
entire database. It defines how the data is organized and how the relations among
them are associated. It formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a
descriptive detail of the database, which can be depicted by means of schema
Database Instance
It is important that we distinguish these two terms individually. Database schema is the
skeleton of database. It is designed when the database doesn't exist at all. Once the
database is operational, it is very difficult to make any changes to it. A database
schema does not contain any data or information.
A database instance is a state of operational database with data at any given time. It
contains a snapshot of the database. Database instances tend to change with time. A
DBMS ensures that its every instance (state) is in a valid state, by diligently following
all the validations, constraints, and conditions that the database designers have
imposed
o Logical data independence refers characteristic of being able to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we do any changes in the conceptual view of the data, then the user view of the data
would not be affected.
o Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then the
Conceptual structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.
o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.