Introduction to Database Management Systems
A Database Management System (DBMS) is a software solution designed to efficiently manage, organize,
and retrieve data in a structured manner. It serves as a critical component in modern
computing, enabling organizations to store, manipulate, and secure their data effectively. From small
applications to enterprise systems, a DBMS plays a vital role in supporting data-driven decision-making and
operational efficiency.
What is a DBMS?
A DBMS is a system that allows users to create, modify, and query databases while ensuring data
integrity, security, and efficient data access. Unlike traditional file systems, a DBMS minimizes data
redundancy, prevents inconsistencies, and simplifies data management with features like concurrent access
and backup mechanisms. It organizes data into tables, views, schemas, and reports, providing a structured
approach to data management.
Example:
A university database can store and manage student information, faculty records, and administrative data,
allowing seamless retrieval, insertion, and deletion of information as required.
Key Features of DBMS
1. Data Modeling: Tools to create and modify data models, defining the structure and relationships within
the database.
2. Data Storage and Retrieval: Efficient mechanisms for storing data and executing queries to retrieve it
quickly.
3. Concurrency Control: Ensures multiple users can access the database simultaneously without conflicts.
4. Data Integrity and Security: Enforces rules to maintain accurate and secure data, including access
controls and encryption.
5. Backup and Recovery: Protects data with regular backups and enables recovery in case of system
failures.
Types of DBMS
There are several types of Database Management Systems (DBMS), each tailored to different data structures,
scalability requirements, and application needs. The most common types are as follows:
1. Relational Database Management System (RDBMS)
RDBMS organizes data into tables (relations) composed of rows and columns. It uses primary keys to
uniquely identify rows and foreign keys to establish relationships between tables. Queries are written in SQL
(Structured Query Language), which allows for efficient data manipulation and retrieval.
Examples: MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.
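The relational ideas above (tables, primary keys, foreign keys, SQL queries) can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database. The table names, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Each table has a primary key; student.dept_id is a foreign key to department.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (1, 'Physics')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", 1), (102, "Ravi", 1)],
)

# A SQL query that follows the foreign key to combine rows from both tables.
rows = conn.execute(
    "SELECT s.name, d.name FROM student s "
    "JOIN department d ON s.dept_id = d.dept_id "
    "ORDER BY s.roll_no"
).fetchall()
print(rows)
```

Other relational systems accept very similar SQL, although data types and foreign key behaviour differ in detail between products.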
2. NoSQL DBMS
NoSQL systems are designed to handle large-scale data and provide high performance for scenarios
where relational models might be restrictive. They store data in various non-relational formats, such as
key-value pairs, documents, graphs, or columns. These flexible data models enable rapid scaling and are
well-suited for unstructured or semi-structured data.
Examples: MongoDB, Cassandra, DynamoDB and Redis.
3. Object-Oriented DBMS (OODBMS)
OODBMS integrates object-oriented programming concepts into the database environment, allowing data to
be stored as objects. This approach supports complex data types and relationships, making it ideal for
applications requiring advanced data modeling and real-world simulations.
Examples: ObjectDB, db4o.
Database Languages
Database languages are specialized sets of commands and instructions used to define, manipulate, and
control data within a database. Each language type plays a distinct role in database management, ensuring
efficient storage, retrieval, and security of data. The primary database languages include:
1. Data Definition Language (DDL)
DDL (Data Definition Language) deals with database schemas and with descriptions of how the data should
reside in the database.
CREATE: creates a database and its objects (tables, indexes, views, stored procedures, functions, and
triggers)
ALTER: alters the structure of an existing database object
DROP: deletes objects from the database
TRUNCATE: removes all records from a table and releases the space allocated for those records
COMMENT: adds comments to the data dictionary
RENAME: renames an object
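As a rough illustration of the DDL commands listed above, the sketch below runs CREATE, ALTER, RENAME, and DROP against an in-memory SQLite database through Python's sqlite3 module. The table and column names are hypothetical. SQLite has no TRUNCATE or COMMENT statement, so those two commands are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    """Return the names of all tables in the schema, sorted alphabetically."""
    cur = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cur]


def column_names(connection, table):
    """Return the column names of a table (second field of PRAGMA table_info)."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


# CREATE: define a new table (hypothetical schema).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure of the existing table by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN class TEXT")
columns_after_alter = column_names(conn, "student")

# RENAME: in SQLite this is written as ALTER TABLE ... RENAME TO.
conn.execute("ALTER TABLE student RENAME TO learner")
tables_after_rename = table_names(conn)

# DROP: remove the table definition and its data.
conn.execute("DROP TABLE learner")
tables_after_drop = table_names(conn)

print(columns_after_alter, tables_after_rename, tables_after_drop)
```

Note that every statement here changes the schema, not the rows; that is the distinction between DDL and the data-level commands.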
2. Data Manipulation Language (DML)
DML focuses on manipulating the data stored in the database, enabling users to retrieve, add, update, and
delete data.
SELECT: retrieves data from a database
INSERT: inserts data into a table
UPDATE: updates existing data within a table
DELETE: deletes the rows of a table that match a condition (all rows if no condition is given)
MERGE: UPSERT operation (insert or update)
CALL: calls a stored subprogram (for example, a PL/SQL or Java subprogram in Oracle)
EXPLAIN PLAN: shows the data access path chosen for a statement
LOCK TABLE: controls concurrency
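The four core DML commands can be sketched as follows, again using Python's sqlite3 module with an in-memory database. The student table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, class TEXT)")

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "A"), (2, "Ravi", "A"), (3, "Meena", "B")],
)

# UPDATE: change existing data in the rows matching the WHERE clause.
conn.execute("UPDATE student SET class = 'B' WHERE roll_no = 2")

# DELETE: remove only the rows matching the WHERE clause.
conn.execute("DELETE FROM student WHERE roll_no = 3")

# SELECT: retrieve the remaining data.
remaining = conn.execute(
    "SELECT roll_no, name, class FROM student ORDER BY roll_no"
).fetchall()
print(remaining)
```

Leaving out the WHERE clause on UPDATE or DELETE would apply the change to every row in the table.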
3. Data Control Language (DCL)
DCL commands manage access permissions, ensuring data security by controlling who can perform certain
actions on the database.
GRANT: Provides specific privileges to a user (e.g., SELECT, INSERT).
REVOKE: Removes previously granted permissions from a user.
4. Transaction Control Language (TCL)
TCL commands oversee transactional data to maintain consistency, reliability, and atomicity.
ROLLBACK: Undoes changes made during a transaction.
COMMIT: Saves all changes made during a transaction.
SAVEPOINT: Sets a point within a transaction to which one can later roll back.
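The interaction of COMMIT, ROLLBACK, and SAVEPOINT can be sketched with SQLite through Python's sqlite3 module. The account table and the amounts are hypothetical. The connection is opened with isolation_level=None so that the transaction statements are sent exactly as written instead of being managed by the module.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the explicit
# BEGIN / SAVEPOINT / ROLLBACK / COMMIT statements below control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")


def balance(connection):
    """Return the current balance of the (hypothetical) account 1."""
    return connection.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# First transaction: undo only the second update, then commit the rest.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("SAVEPOINT before_second_update")
conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
conn.execute("ROLLBACK TO before_second_update")  # discards only the -50 update
conn.execute("COMMIT")                            # makes the -30 update permanent
balance_after_commit = balance(conn)

# Second transaction: ROLLBACK undoes everything since BEGIN.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")
balance_after_rollback = balance(conn)

print(balance_after_commit, balance_after_rollback)
```

Both values come out as 70: the first transaction keeps only the update made before the savepoint, and the second transaction leaves no trace at all.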
5. Data Query Language (DQL)
DQL is a subset of DML, specifically focused on data retrieval.
SELECT: The primary DQL command, used to query data from the database without altering its
structure or contents.
Database system applications: -
Database systems have a wide range of applications across various industries. They are used to efficiently store,
manage, and retrieve data, enabling better decision-making and improved operational efficiency. Key areas of
application include finance, healthcare, retail, and e-commerce, among others.
E-commerce:
Databases are essential for managing product catalogs, customer information, orders, and payment processing
on e-commerce platforms.
Healthcare:
They are used to store and manage patient records, medical history, appointments, and other critical
healthcare information.
Finance:
Banks, financial institutions, and other related businesses rely on databases to manage accounts, transactions,
customer data, and financial instruments.
Retail:
Retailers use databases to track inventory, manage sales data, and analyze customer behavior.
Education:
Educational institutions utilize databases for student information management, course enrollment, grades, and
other administrative tasks.
Transportation:
Airline and railway reservation systems rely on databases to manage bookings, schedules, and passenger
information.
Manufacturing:
Databases are crucial for managing supply chains, production processes, inventory, and quality control in
manufacturing.
Human Resources:
Companies use databases to manage employee information, payroll, benefits, and other HR-related data.
Government:
Databases play a vital role in government for managing public records, tax information, and other citizen
data.
Telecommunications:
Telecom companies use databases to track call details, network usage, and customer information.
Social Media:
Social media platforms rely on databases to store user profiles, posts, messages, and social connections.
Real Estate:
Real estate platforms use databases to manage property listings, sales records, and customer information.
Abstraction is one of the main features of database systems. Hiding irrelevant details from users and
providing an abstract view of the data makes user-database interaction easier and more efficient. In the
three-level DBMS architecture, the top level is the “view level”. The view level provides the “view of data”
to users and hides details such as data relationships, database schema, constraints, and security.
To fully understand the view of data, you need a basic knowledge of data abstraction and of instance and
schema:
1. Data abstraction: Database systems are made up of complex data structures. To ease user interaction
with the database, developers hide irrelevant internal details from users. This process of hiding irrelevant
details from the user is called data abstraction.
2. Instance and schema: The design of a database is called its schema. There are three types of schema:
physical schema, logical schema, and view schema. The data stored in the database at a particular moment
is called an instance of the database. The schema defines the variable declarations in the tables that
belong to a particular database; the values of these variables at a given moment form the instance of that
database.
Integrity Manager: It checks the integrity constraints when the database is modified.
Transaction Manager: It controls concurrent access by scheduling the operations of the transactions
it receives. Thus, it ensures that the database remains in a consistent state before
and after the execution of a transaction.
File Manager: It manages the file space and the data structure used to represent information in the
database.
Buffer Manager: It is responsible for cache memory and the transfer of data between the secondary
storage and main memory.
3. Disk Storage
It contains the following essential components:
Data Files: It stores the actual data in the database.
Data Dictionary: It contains the information about the structure of database objects such as tables,
constraints, and relationships. It is the repository of information that governs the metadata.
Indices: Provide faster data retrieval by allowing the DBMS to find records quickly, improving query
performance.
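The effect of an index on data retrieval can be observed with a small sketch. It asks SQLite (through Python's sqlite3 module) for its query plan before and after an index is created. The table, the index name idx_student_name, and the sample rows are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "A"), (2, "Ravi", "A"), (3, "Meena", "B")],
)

QUERY = "SELECT roll_no FROM student WHERE name = ?"


def query_plan(connection):
    """Return the plan SQLite reports for QUERY as one string.

    The human-readable description is the last column of each plan row.
    """
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("Asha",)).fetchall()
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # no index on name yet: the table is scanned

# Create an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_after = query_plan(conn)   # the plan now refers to the index

print(plan_before)
print(plan_after)
```

Without the index the DBMS has to examine every row; with it, the matching records are located through the index structure.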
Data Models in DBMS: -
Data models in DBMS (Database Management System) are frameworks that define how data is structured,
stored, and managed within a database. They provide a way to represent real-world entities and their
relationships, enabling efficient data storage, retrieval, and manipulation. Different data models exist, each with
its own strengths and weaknesses, catering to various needs and complexities of data management.
1. Hierarchical Model:
Data is organized in a tree-like structure with parent-child relationships, where each child has only one
parent.
2. Network Model:
Extends the hierarchical model by allowing a child to have multiple parents, creating a more flexible
structure.
3. Relational Model:
The most widely used model, it represents data in tables (relations) with rows (records) and columns
(attributes), using relationships between tables via primary and foreign keys.
4. Entity-Relationship (ER) Model:
A conceptual model that uses entities, attributes, and relationships to represent data and their connections,
often used for database design.
5. Object-Oriented Model:
Represents data as objects with attributes and methods, similar to object-oriented programming, allowing for
complex data structures and relationships.
6. NoSQL Models:
A category of models that deviate from the relational model, often used for large-scale data and unstructured
data, including key-value, document, graph, and column-family models.
7. Semi-structured Model:
A model that does not have a rigid structure like relational models, but still contains organizational properties
like tags to define data.
8. Flat Model:
A simple model where all data is stored in a single table or a two-dimensional array.
Entity in DBMS: -
An entity is a "thing" or "object" in the real world, described by its attributes.
Anything about which we store information is called an entity. Entities are recorded in the database and
must be distinguishable, i.e., each one can be told apart from the others in its group.
For example: a student, an employee, or a bank account are all entities.
Entity Set
An entity set is a collection of similar types of entities that share the same attributes.
For example: All students of a school form an entity set of Student entities.
Key Terminologies used in Entity Set:
Attributes: Attributes are the properties or traits of an entity. They describe the data that may be
associated with an entity.
Entity Type: A category or class of entities that share the same attributes is referred to as an entity type.
Entity Instance: An entity instance is a particular occurrence of an individual entity within an entity type.
Each entity instance has a unique identity, often known as the primary key.
Primary Key: A primary key is a unique identifier for every entity instance within an entity type.
An entity set can be classified into two types:
Strong Entity Set
Strong entity sets exist independently and each instance of a strong entity set has a unique primary
key.
Example attributes of a strong entity set include:
Car Registration Number
Model
Name etc.
Weak Entity Set
A weak entity cannot exist on its own; it is dependent on a strong entity to identify it. A weak entity does not
have a single primary key that uniquely identifies it; instead, it has a partial key.
Example attributes of a weak entity set include:
Laptop Color
RAM, etc.
Attributes in DBMS: -
Attributes are properties or characteristics of an entity; each attribute is a piece of data that gives
more information about the entity. Attributes distinguish one entity from another and help categorize
entities, so that an entity can be easily retrieved and manipulated. Attributes also give the database
its structure. An entity with no attributes is of no use in the database.
Example
Let's take the student as an entity. Students will have multiple attributes such as roll number, name, and
class.
These attributes are used to describe the student in more detail.
As shown in the figure, roll_no, name, and class are the attributes of the entity Student.
All three attributes give meaning to the entity.
Types Of Attribute
There are 8 types of attributes in DBMS.
Simple Attribute.
Composite Attribute.
Single Valued Attribute.
Multivalued Attribute.
Key Attribute.
Derived Attribute.
Stored Attribute.
Complex Attribute.
Simple Attribute
Simple attributes are those attributes that cannot be divided into further sub-attributes. They state
simple information about the entity such as name, roll_no, class, age, etc.
Simple attributes are widely used for storing information about the entity.
Example
Here in the below example, Student has roll_no, class, and name as attributes that cannot be divided into
more sub-attributes.
These types of attributes are called simple attributes.
Simple attributes are mainly used to create all other types of attributes.
Composite Attribute
When two or more simple attributes are combined to make an attribute, that attribute is called
a Composite attribute.
A composite attribute is made up of multiple component attributes and can be divided back into them.
Composite attributes are used where data is complex and needs to be stored in a structured form.
Example
In the example below, address is an attribute composed of 3 simple attributes, i.e.
City, State, and Street.
To get the value of the address attribute, we first have to know the city, state, and street attributes.
This type of attribute is known as a composite attribute.
Multivalued Attribute
An attribute which can have multiple values is known as a multivalued attribute. Multivalued attributes
have multiple values for the single instance of an entity.
A single key of the entity is associated with multiple values rather than only one. It is the opposite of the
single-valued attribute.
Example
Here the student has an attribute named phone_no. One student can have multiple phone_no, so we can
say that phone_no can have multiple values.
These types of attributes are known as multi-valued attributes.
Multi-valued attributes are used when more than one entry for an attribute needs to be stored in the
database.
Key Attribute
The attribute which has unique values for every row in the table is known as a Key Attribute. The key
attribute has a very crucial role in the database.
The key attribute is a distinct and unique characteristic of the entity that can be used to identify the entity
uniquely.
Example
For students, we can identify every student with roll_no because each student will have a unique roll_no.
This indicates that roll_no will be a Key attribute for the Student entity.
Operations that must target one specific entity (for example, updating a single student) rely on key
attributes to identify it.
Derived Attribute
An attribute that can be derived from other attributes and does not need to be stored in
the database is called a Derived Attribute.
Derived attributes are not stored in the database directly; they are calculated from the stored attributes in
the database.
Example
Here the student has multiple attributes including DOB and age. It is observed that age can be calculated
with the help of the DOB attribute.
So age is a derived attribute that is derived from an attribute named DOB.
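A minimal sketch of how a derived attribute might be computed on demand rather than stored is shown below. The function name age_on, the student record, and the dates are hypothetical and used only for illustration.

```python
from datetime import date


def age_on(dob: date, today: date) -> int:
    """Return the number of whole years between dob and today."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


# Only DOB is kept with the record; age is calculated when it is needed.
student = {"roll_no": 1, "name": "Asha", "dob": date(2005, 8, 20)}

age_day_before_birthday = age_on(student["dob"], date(2024, 8, 19))
age_on_birthday = age_on(student["dob"], date(2024, 8, 20))
print(age_day_before_birthday, age_on_birthday)
```

Because the age changes over time while the DOB does not, computing it from DOB avoids having to update a stored value every year.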
Stored Attribute
An attribute whose value is physically stored in the database, rather than computed from other attributes,
is called a Stored Attribute.
Stored attributes are the source from which derived attributes are calculated. Their values typically
describe lasting facts about the entity and change rarely, if at all, once they are
stored.
Example
The student has 3 attributes as shown above. The name and DOB are stored directly and will usually remain
the same throughout the student's education, whereas age is derived from DOB.
These attributes are known as stored attributes.
Complex Attribute
When multi-valued and composite attributes together form an attribute then it is called a Complex
attribute.
Complex attributes can have an unlimited number of sub-attributes.
Example
Here for the student, we created an attribute named contact_info, which is further decomposed into
phone_no + Address.
The address is a composite attribute which is further divided into simple attributes and phone_no is a
multivalued attribute.
This indicates that the contact_info attribute is made from the multi-valued and composite attribute.
This type of attribute is known as the Complex Attribute.
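One possible way to picture how simple, composite, multivalued, and complex attributes nest is with Python dataclasses. This is only a sketch: the class names, field names, and sample values are hypothetical, and an actual database would map these structures to tables or document fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    """Composite attribute: built from the simple attributes below."""
    street: str
    city: str
    state: str


@dataclass
class ContactInfo:
    """Complex attribute: combines a composite and a multivalued attribute."""
    address: Address
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute


@dataclass
class Student:
    roll_no: int               # key attribute
    name: str                  # simple, single-valued attribute
    contact_info: ContactInfo  # complex attribute


student = Student(
    roll_no=1,
    name="Asha",
    contact_info=ContactInfo(
        address=Address(street="12 Lake Road", city="Pune", state="Maharashtra"),
        phone_numbers=["555-0101", "555-0102"],
    ),
)

print(student.contact_info.address.city, len(student.contact_info.phone_numbers))
```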
Relationships in DBMS: -
In database management systems (DBMS), relationships define how tables (or entities) are logically connected
to each other, allowing for the efficient storage and retrieval of related data. These relationships are crucial for
modeling real-world scenarios and ensuring data integrity.
Types of Relationships:
One-to-one (1:1):
A single record in one table is associated with exactly one record in another table. For example, a person
might have only one passport, and each passport belongs to one person.
One-to-many (1:N):
A record in one table can be associated with multiple records in another table, but a record in the second table
is associated with only one record in the first. For example, one customer can place multiple orders, but each
order is placed by only one customer.
Many-to-many (M:N):
Multiple records in one table can be associated with multiple records in another table. For instance, many
students can enroll in many courses, and each course can be taken by many students. These relationships
often require a "join table" to implement.
Self-referencing (Recursive):
A relationship within a single table, where a record is related to another record within the same table. For
example, an employee table where an employee can be a manager, and the manager is also an employee in
the same table.
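The "join table" used to implement a many-to-many relationship can be sketched as follows, using Python's sqlite3 module with an in-memory database. The student, course, and enrollment tables and their rows are hypothetical. Each enrollment row pairs one student with one course, so the relationship becomes two one-to-many relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript(
    """
    CREATE TABLE student (stud_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT NOT NULL);

    -- Join table: one row per (student, course) pair.
    CREATE TABLE enrollment (
        stud_no   INTEGER NOT NULL REFERENCES student(stud_no),
        course_no TEXT    NOT NULL REFERENCES course(course_no),
        PRIMARY KEY (stud_no, course_no)
    );

    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course VALUES ('C001', 'Databases'), ('C002', 'Networks');
    INSERT INTO enrollment VALUES (1, 'C001'), (1, 'C002'), (2, 'C001');
    """
)

# One student, many courses.
courses_of_asha = [
    row[0]
    for row in conn.execute(
        "SELECT c.title FROM enrollment e "
        "JOIN course c ON c.course_no = e.course_no "
        "WHERE e.stud_no = 1 ORDER BY c.title"
    )
]

# One course, many students.
students_in_c001 = [
    row[0]
    for row in conn.execute(
        "SELECT s.name FROM enrollment e "
        "JOIN student s ON s.stud_no = e.stud_no "
        "WHERE e.course_no = 'C001' ORDER BY s.name"
    )
]

print(courses_of_asha, students_in_c001)
```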
Constraints in DBMS: -
- In a Database Management System (DBMS), constraints are rules or conditions that are applied to data
to ensure its accuracy, consistency, and integrity.
- They define limitations on the data that can be stored and the operations that can be performed on the
data.
- Constraints help maintain data quality, prevent invalid data entries, and ensure that relationships
between data in different tables are valid.
Types of Constraints:
Domain Constraints:
These constraints define the valid values that can be stored in a column. For example, a domain constraint
might specify that a column for age can only accept positive integers, or that an email column must follow a
specific format.
Entity Integrity Constraints:
These constraints ensure that each record (row) in a table is uniquely identifiable. This is typically achieved
through the use of a primary key, which cannot contain NULL values and must be unique.
Referential Integrity Constraints:
These constraints ensure the consistency of relationships between tables. They specify that foreign keys
(columns referencing primary keys in other tables) must either be NULL or refer to an existing primary key
value in the related table.
Key Constraints:
These constraints enforce uniqueness and identify key attributes in a table. Common key constraints include
Primary Key, Unique Key, and Foreign Key.
NOT NULL Constraints:
These constraints ensure that a specific column cannot contain NULL values.
CHECK Constraints:
These constraints define a condition that must be met for the data in a column. For example, a check
constraint might limit the value of a column to be within a specific range.
DEFAULT Constraints:
These constraints specify a default value for a column if no other value is provided during data insertion.
Mapping Constraints:
These constraints define how entities in a database are related to each other, particularly within an
Entity-Relationship (ER) model.
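Several of the constraint types above can be seen in action in the following sketch, which uses Python's sqlite3 module with an in-memory database. The department and employee tables, the helper is_rejected, and all sample values are hypothetical. Each invalid INSERT is expected to be refused by the DBMS with an integrity error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,                    -- entity integrity
        name    TEXT    NOT NULL,                       -- NOT NULL constraint
        age     INTEGER CHECK (age > 0),                -- CHECK constraint
        status  TEXT    NOT NULL DEFAULT 'active',      -- DEFAULT constraint
        dept_id INTEGER REFERENCES department(dept_id)  -- referential integrity
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee (emp_id, name, age, dept_id) VALUES (1, 'Asha', 30, 10)")


def is_rejected(sql):
    """Return True if the statement violates a constraint and is refused."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


rejected = {
    "not_null": is_rejected("INSERT INTO employee (emp_id, name, age) VALUES (2, NULL, 25)"),
    "check": is_rejected("INSERT INTO employee (emp_id, name, age) VALUES (3, 'Ravi', -5)"),
    "entity_integrity": is_rejected("INSERT INTO employee (emp_id, name, age) VALUES (1, 'Dup', 40)"),
    "referential": is_rejected(
        "INSERT INTO employee (emp_id, name, age, dept_id) VALUES (4, 'Meena', 28, 99)"
    ),
}

# The first employee was inserted without a status, so the default applies.
default_status = conn.execute("SELECT status FROM employee WHERE emp_id = 1").fetchone()[0]
print(rejected, default_status)
```

Because the rules are declared in the schema, the DBMS enforces them for every application that writes to the tables.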
Keys in DBMS: -
Keys are crucial in a Database Management System (DBMS) for several reasons:
Uniqueness: Keys ensure that each record in a table is unique and can be identified distinctly.
Data Integrity: Keys prevent data duplication and maintain the consistency of the data.
Efficient Data Retrieval: By defining relationships between tables, keys enable faster querying and
better data organization. Without keys, it would be extremely difficult to manage large datasets, and
queries would become inefficient and prone to errors.
Different Types of Database Keys
1. Super Key
2. Candidate Key
3. Primary Key
4. Alternate Key
5. Foreign Key
6. Composite Key
1. Super Key
A set of one or more attributes (columns) that can uniquely identify a tuple (record) is
known as a super key.
A super key may contain extra attributes that are not necessary for uniqueness. For example, if the
"STUD_NO" column can uniquely identify a student, adding "SNAME" to it still forms a valid super
key, though the extra column is unnecessary.
Unlike a primary key, a super key may include attributes that allow NULL values.
Example: Consider the STUDENT table
A super key could be the combination of STUD_NO and PHONE, as this combination uniquely identifies
a student.
2. Candidate Key
A minimal set of attributes that can uniquely identify a tuple is known as a candidate key. For example,
STUD_NO in the STUDENT relation.
A candidate key is a minimal super key: it uniquely identifies a record but contains no extra
attributes, so removing any attribute from it breaks uniqueness.
A candidate key must contain unique values, ensuring that no two rows have the same value in the
candidate key’s columns.
Every table must have at least one candidate key.
A table can have multiple candidate keys but only one primary key.
Example: In the STUDENT table, STUD_NO can be a candidate key, as it uniquely identifies each
record.
A composite candidate key example: {STUD_NO, COURSE_NO} can be a candidate key for
a STUDENT_COURSE table.
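The difference between a super key and a candidate key can be checked mechanically on sample data, as in the sketch below. The rows and the helper names is_super_key and is_candidate_key are hypothetical. Keep in mind that a real key is a rule about all data the table may ever hold, so a check on a few rows can only rule a key out, not prove it.

```python
from itertools import combinations

# Hypothetical sample rows of a STUDENT table.
rows = [
    {"STUD_NO": 1, "SNAME": "Asha", "PHONE": "555-0101"},
    {"STUD_NO": 2, "SNAME": "Ravi", "PHONE": "555-0102"},
    {"STUD_NO": 3, "SNAME": "Asha", "PHONE": "555-0103"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_super_key(rows, ("STUD_NO", "SNAME")))      # super key with an extra attribute
print(is_candidate_key(rows, ("STUD_NO", "SNAME")))  # not minimal
print(is_candidate_key(rows, ("STUD_NO",)))          # minimal
print(is_super_key(rows, ("SNAME",)))                # two students share a name
```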
3. Primary Key
There can be more than one candidate key in a relation, out of which one is chosen as the primary key. For
example, STUD_NO and STUD_PHONE are both candidate keys for the relation STUDENT, but STUD_NO
can be chosen as the primary key (only one out of many candidate keys).
A primary key uniquely identifies each record (tuple) in a table.
It must have unique values and cannot contain any duplicate values.
A primary key cannot be NULL, as it needs to provide a valid, unique identifier for every record.
A primary key does not have to consist of a single column. In some cases, a composite primary
key (made of multiple columns) is used to uniquely identify records in a table.
Many databases store or index rows in primary key order, which makes lookups by primary key
fast.
Example:
STUDENT table -> Student(STUD_NO, SNAME, ADDRESS, PHONE), where STUD_NO is the primary
key
4. Alternate Key
An alternate key is any candidate key in a table that is not chosen as the primary key. In other words, all
the candidate keys that are not selected as the primary key are considered alternate keys.
An alternate key is also referred to as a secondary key because it can uniquely identify records in a
table, just like the primary key.
An alternate key can consist of one or more columns (fields) that can uniquely identify a record, but it is
not the primary key.
Columns such as SNAME and ADDRESS are alternate keys only if their values are guaranteed to be unique.
Example: In the STUDENT table, both STUD_NO and PHONE are candidate keys. If STUD_NO is chosen
as the primary key, then PHONE would be considered an alternate key.
5. Foreign Key
A foreign key is an attribute in one table that refers to the primary key in another table. The table that
contains the foreign key is called the referencing table, and the table that is referenced is called
the referenced table.
It connects two or more tables, establishing a relationship between them. This is essential
for maintaining data integrity and preventing data redundancy.
Foreign keys act as a cross-reference between the tables.
For example, DNO is the primary key in the DEPT table and appears as a non-key attribute (a foreign key)
in the EMP table.
Example: Consider the STUDENT_COURSE table
1 005 C001
2 056 C005
Here, STUD_NO in the STUDENT_COURSE table is a foreign key that references the STUD_NO primary
key in the STUDENT table.
Explanation:
Unlike the primary key of a relation, a foreign key can be NULL and may contain
duplicate values, i.e. it need not follow the uniqueness constraint. For example, STUD_NO in the
STUDENT_COURSE relation is not unique.
The same STUD_NO can appear in more than one tuple (in the full table it is repeated for the first and
third tuples). However, STUD_NO in the STUDENT relation is a
primary key, so it must always be unique and cannot be NULL.
[Figure: Relation between Primary Key and Foreign Key]
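The foreign key behaviour described above (duplicates and NULL are allowed, dangling references are not) can be sketched with Python's sqlite3 module. The simplified two-table schema, the surrogate id column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE student (stud_no INTEGER PRIMARY KEY, sname TEXT)")
conn.execute(
    "CREATE TABLE student_course ("
    " id INTEGER PRIMARY KEY,"
    " stud_no INTEGER REFERENCES student(stud_no),"  # foreign key
    " course_no TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# The same foreign key value may appear in several rows.
conn.execute("INSERT INTO student_course (stud_no, course_no) VALUES (1, 'C001')")
conn.execute("INSERT INTO student_course (stud_no, course_no) VALUES (1, 'C002')")

# A foreign key may be NULL (no student assigned yet).
conn.execute("INSERT INTO student_course (stud_no, course_no) VALUES (NULL, 'C003')")

# A value that matches no primary key in the referenced table is refused.
try:
    conn.execute("INSERT INTO student_course (stud_no, course_no) VALUES (99, 'C004')")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM student_course").fetchone()[0]
print(dangling_accepted, row_count)
```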
6. Composite Key
Sometimes, a table might not have a single column/attribute that uniquely identifies all the records of a table.
To uniquely identify rows of a table, a combination of two or more columns/attributes can be used. A poorly
chosen combination can still produce duplicate values, so we need to find the minimal set of attributes that
uniquely identifies rows in a table.
A composite key can act as the primary key when no single column qualifies as one.
Two or more attributes are used together to make a composite key.
Different combinations of attributes may differ in how reliably they identify rows
uniquely.
Example: In the STUDENT_COURSE table, {STUD_NO, COURSE_NO} can form a composite key to
uniquely identify each record.
[Figure: Types of Keys]
Database design is the organization of data according to a database model. Properly designed databases are easy
to maintain and improve data consistency.
The database design process can be divided into six steps. The ER model (Entity-Relationship model) is most
relevant to the first three steps.
1. Requirement analysis
2. Conceptual database design
3. Logical database design
4. Schema refinement
5. Physical database design
6. Application and security design
1. Requirement analysis
It is necessary to understand what data needs to be stored in the database, what applications must be
built, and which operations are used most frequently by the system.
Requirement analysis is an informal process and requires proper communication with user
groups.
There are several methods for organizing and presenting the information gathered in this step. Some
automated tools can also be used for this purpose.
2. Conceptual database design
The information gathered is used to develop a high-level description of the data to be stored in the
database.
This is the step in which the E-R model, i.e. the Entity-Relationship model, is built.
The goal of this design is to create a simple description of the data that matches the requirements of
users.
3. Logical database design
This is the step in which the ER model is converted to a relational database schema, sometimes called the
logical schema in the relational data model.
4. Schema refinement
In this step, the relational database schema is analysed to identify potential problems and to refine it.
Schema refinement can be done by normalizing and restructuring the relations.
6. Application and security design
Design methodologies such as UML (Unified Modeling Language) can be used to address the complete
software design around the database.
The role of each entity in every process must be reflected in the application task.
For each role, there must be provision for granting access to some parts of the database and prohibiting
access to others.
Thus, access rules must be enforced on the application (which accesses the database) to
protect its security.
Introduction of ER Model: -
The Entity-Relationship Model (ER Model) is a conceptual model for designing databases. This
model represents the logical structure of a database, including entities, their attributes, and the
relationships between them.
Entity: An object that is stored as data, such as Student, Course, or Company.
Attribute: A property that describes an entity, such as StudentID, CourseName, or EmployeeEmail.
Relationship: A connection between entities, such as "a Student enrolls in a Course".
1. Identify Entities: The very first step is to identify all the Entities. Represent these entities in a Rectangle
and label them accordingly.
2. Identify Relationships: The next step is to identify the relationship between them and represent them
accordingly using the Diamond shape. Ensure that relationships are not directly connected to each other.
3. Add Attributes: Attach attributes to the entities by using ovals. Each entity can have multiple attributes
(such as name, age, etc.), which are connected to the respective entity.
4. Define Primary Keys: Assign primary keys to each entity. These are unique identifiers that help
distinguish each instance of the entity. Represent them with underlined attributes.
5. Remove Redundancies: Review the diagram and eliminate unnecessary or repetitive entities and
relationships.
6. Review for Clarity: Review the diagram to make sure it is clear and effectively conveys the relationships
between the entities.