Intro Dbms
Intro Dbms
A Database Management System (DBMS) is software that allows users to efficiently store,
retrieve, and manage data in databases. It serves as an interface between the database and its
users or application programs, ensuring that data is organized and easily accessible.
Purpose of DBMS
1. Data Organization: Organizes data in structured formats such as tables, records, and
fields.
2. Data Access: Provides efficient methods to query and manipulate data.
3. Data Security: Protects data from unauthorized access.
4. Data Integrity: Ensures accuracy and consistency of data over its lifecycle.
5. Concurrent Access: Allows multiple users to access the database simultaneously
without conflicts.
Types of Databases
1. Relational Databases: Store data in tables with rows and columns (e.g., MySQL,
PostgreSQL, Oracle).
2. NoSQL Databases: Handle unstructured or semi-structured data (e.g., MongoDB,
Cassandra).
3. Hierarchical Databases: Organize data in a tree-like structure.
4. Network Databases: Represent data as interconnected nodes.
5. Object-Oriented Databases: Store data as objects, similar to object-oriented
programming.
Advantages of DBMS
1. Entity and Attributes: Represents real-world objects (entities) and their properties
(attributes).
2. Primary Key: A unique identifier for a record in a table.
3. Foreign Key: Establishes a relationship between two tables.
4. Normalization: Process of organizing data to reduce redundancy.
5. SQL (Structured Query Language): A language used to interact with relational
databases.
Applications of DBMS
Chapter - 2
Introduction to DBMS and Relational Model
Advantages of the DBMS Approach
Data Independence
1. Logical Data Independence: Changes in the logical schema do not affect the external
schema.
2. Physical Data Independence: Changes in physical storage do not affect the logical
schema.
Database Languages
1. Data Definition Language (DDL): Defines the database structure (e.g., CREATE, ALTER,
DROP).
2. Data Manipulation Language (DML): Manipulates data (e.g., INSERT, UPDATE,
DELETE).
3. Data Control Language (DCL): Manages access control (e.g., GRANT, REVOKE).
Transaction Management
Data Dictionary
A metadata repository that stores information about database schema, tables, columns,
relationships, and constraints.
Helps in understanding and managing the database structure.
Relational Model
Domains
A domain defines the possible values a column can have (e.g., age domain: 0–120).
Relation
Kinds of Relations
Relational Databases
Organizes data into relations (tables) and enforces constraints for integrity and
consistency.
Modification of Database
The set of one or more attributes (columns) that can uniquely identify a tuple (record) is known
as Super Key. For Example, STUD_NO, (STUD_NO, STUD_NAME), etc.
A super key is a group of single or multiple keys that uniquely identifies rows in a table.
It supports NULL values in rows.
A super key can contain extra attributes that aren’t necessary for uniqueness. For
example, if the “STUD_NO” column can uniquely identify a student, adding “SNAME”
to it will still form a valid super key, though it’s unnecessary.
Example:
Table STUDENT
Candidate Key
The minimal set of attributes that can uniquely identify a tuple is known as a candidate key. For
Example, STUD_NO in STUDENT relation.
A candidate key is a minimal super key, meaning it can uniquely identify a record but
contains no extra attributes.
A candidate key must contain unique values, ensuring that no two rows have the same
value in the candidate key’s columns.
A table can have multiple candidate keys but only one primary key.
Example:
Table STUDENT
The candidate key can be simple (having only one attribute) or composite as well.
Example:
Table STUDENT_COURSE
Note: In SQL Server a unique constraint that has a nullable column, allows the value ‘ null ‘ in
that column only once . That’s why the STUD_PHONE attribute is a candidate here, but can not
be a ‘null’ value in the primary key attribute.
Primary Key
There can be more than one candidate key in relation out of which one can be chosen as the
primary key. For Example, STUD_NO, as well as STUD_PHONE, are candidate keys for
relation STUDENT but STUD_NO can be chosen as the primary key (only one out of many
candidate keys).
A primary key is a unique key, meaning it can uniquely identify each record (tuple) in a
table.
It must have unique values and cannot contain any duplicate values.
A primary key cannot be NULL, as it needs to provide a valid, unique identifier for every
record.
A primary key does not have to consist of a single column. In some cases, a composite
primary key (made of multiple columns) can be used to uniquely identify records in a table.
Databases typically store rows ordered in memory according to primary key for fast access of
records using primary key.
Example:
Table STUDENT
Alternate Key
An alternate key is any candidate key in a table that is not chosen as the primary key. In other
words, all the keys that are not selected as the primary key are considered alternate keys.
An alternate key is also referred to as a secondary key because it can uniquely identify
records in a table, just like the primary key.
An alternate key can consist of one or more columns (fields) that can uniquely identify a
record, but it is not the primary key
Eg:- SNAME, and ADDRESS is Alternate keys
Example:
Foreign Key
A foreign key is an attribute in one table that refers to the primary key in another table. The
table that contains the foreign key is called the referencing table, and the table that is referenced
is called the referenced table.
A foreign key in one table points to the primary key in another table, establishing a
relationship between them.
It helps connect two or more tables, enabling you to create relationships between them.
This is essential for maintaining data integrity and preventing data redundancy.
For example, DNO is a primary key in the DEPT table and a non-key in EMP
Example:
Table STUDENT_COURSE
It may be worth noting that, unlike the Primary Key of any given relation, Foreign Key can be
NULL as well as may contain duplicate tuples i.e. it need not follow uniqueness constraint. For
Example, STUD_NO in the STUDENT_COURSE relation is not unique. It has been repeated for
the first and third tuples. However, the STUD_NO in STUDENT relation is a primary key and it
needs to be always unique, and it cannot be null.
Composite Key
Sometimes, a table might not have a single column/attribute that uniquely identifies all the
records of a table. To uniquely identify rows of a table, a combination of two or more
columns/attributes can be used. It still can give duplicate values in rare cases. So, we need to
find the optimal set of attributes that can uniquely identify rows in a table.
Example:
Relational Algebra
It is a procedural Language. It consists of a set of operators that can be performed on relations.
Relational Algebra forms the basis for many other high-level data sub-languages like SQL and
QBE.
Relational algebra has mainly 9 types of operators.
UNION
INTERSECTION
MINUS
TIMES
SELECTION
PROJECTION
JOIN
DIVISION
RENAME
1. UNION (U): A and B are two relations. It displays total values (Attributes) in both relations.
It avoids duplicate values in both relations. U symbol can be used.
Syntax:
A UNION B (or) A U B
Example:
Syntax:
A INTERSECT B (or) A ∩ B
Example:
3. DIFFERENCE (─): A and B are two relations. It displays elements in relation A not
in relation B.
Syntax:
A MINUS B (OR) A ─ B
Example:
4. CARTESIAN PRODUCT(X): A and B are two relations. It has a new relation consisting of
all pair wises combinations of all elements in A and B. The relation A has “m” elements and
relation B has “n” elements, then the resultant relation will be “ m * n “.
Syntax:
A TIMES B (OR) A X B
Example:
5. SELECTION (σ): Selection operation chooses the subset of tuples from the relation that
satisfies the given condition.
(σ)θ(R)
θ: Selection condition
In general, the select condition is a Boolean condition (i.e. an expression using logical
connective) of terms that have the form attribute1 OP attribute2 where OP is the comparison
operators <,>,=,>= etc.
Syntax:
6. PROJECTION (π): It displays some specified columns in a relation. “π” operator can be
used to select some specified columns in a relation. It selects tuples that satisfy the given
predicate from a relation. It displays some specified columns by using some conditions.
Syntax:
Example:
7. JOIN( ): It combines two or more relations. It can be mainly divided into mainly 4 types.
These are mainly
Inner Join
Outer Join
8. DIVIDE (÷): It divides the tuple from one relation to another relation
Syntax:
A DIVIDE B (OR) A ÷ B
Example:
Syntax:
Example:
ρ(STUDENT, MARKS)
It changes the “student” relation to “Marks” relation.
It also renames the specified column.
It changes the old-column name to new-column name.
Tables/Relations: The basic building block of the relational model is the table or
relation, which represents a collection of related data. Each table consists of columns,
also known as attributes or fields, and rows, also known as tuples or records.
Primary Keys: In the relational model, each row in a table must have a unique identifier,
which is known as the primary key. This ensures that each row is unique and can be
accessed and manipulated easily.
Foreign Keys: Foreign keys are used to link tables together and enforce referential
integrity. They ensure that data in one table is consistent with data in another table.
Normalization: The process of organizing data into tables and eliminating redundancy is
known as normalization. Normalization is important in the relational model because it
helps to ensure that data is consistent and easy to maintain.
Codd’s Rules: Codd’s Rules are a set of 12 rules that define the characteristics of a true
relational DBMS. These rules ensure that the DBMS is consistent, reliable, and easy to
use.
Atomicity, Consistency, Isolation, Durability (ACID): The ACID properties are a set
of properties that ensure that transactions are processed reliably in the relational model.
Transactions are sets of operations that are executed as a single unit, ensuring that data is
consistent and accurate.
Extensibility: Relational Algebra provides a flexible and extensible framework that can
be extended with new operators and functions. This allows developers to customize and
extend the language to meet their specific needs.
Limited Expressiveness: Relational Algebra has a limited set of operators, which can
make it difficult to express certain types of queries. It may be necessary to use more
advanced techniques, such as subqueries or joins, to express complex queries.
Lack of Flexibility: Relational Algebra is designed for use with relational databases,
which means that it may not be well-suited for other types of data storage or management
systems. This can limit its flexibility and applicability in certain contexts.
Limited Data Types: Relational Algebra is designed for use with simple data types, such
as integers, strings, and dates. It may not be well-suited for more complex data types,
such as multimedia files or spatial data.
Relational Algebra is a powerful and useful tool for managing and manipulating data in
relational databases, it has some limitations and disadvantages that should be carefully
considered when using it.
They generally follow the 8-9 rules of Codd. E.F. Codd has proposed 13 rules which are
popularly known as Codd’s 12 rules. These rules are stated as follows:
Rule 0: Foundation Rule– For any system that is advertised as, or claimed to be, a
relational database management system, that system must be able to manage databases
entirely through its relational capabilities.
Rule 1: Information Rule– Data stored in the Relational model must be a value of some
cell of a table.
Rule 2: Guaranteed Access Rule– Every data element must be accessible by the table
name, its primary key, and the name of the attribute whose value is to be determined.
Rule 3: Systematic Treatment of NULL values– NULL value in the database must
only correspond to missing, unknown, or not applicable values.
Rule 4: Active Online Catalog– The structure of the database must be stored in an
online catalog that can be queried by authorized users.
Rule 6: View Updating Rule- Different views created for various purposes should be
automatically updatable by the system.
Rule 7: High-level insert, update and delete rule- Relational Model should support
insert, delete, update, etc. operations at each level of relations. Also, set operations like
Union, Intersection, and minus should be supported.
Rule 10: Integrity Independence- Integrity constraints modified at the database level
should not enforce modification at the application level.
Rule 11: Distribution Independence- Distribution of data over various locations should
not be visible to end-users.
Rule 12: Non-Subversion Rule- Low-level access to data should not be able to bypass
the integrity rule to change data.
Chapter -3
Introduction of ER Model
We typically follow the below steps for designing a database for an application.
Gather the requirements (functional and data) by asking questions to the database users.
Do a logical or conceptual design of the database. This is where ER model plays a role.
It is the most used graphical representation of the conceptual design of a database.
Physical Database Design (Like indexing) and external design (like views)
The Entity Relationship Model is a model for identifying entities (like student, car or company)
to be represented in the database and representation of how those entities are related. The ER
data model specifies enterprise schema that represents the overall logical structure of a database
graphically.
ER diagrams provide the purpose of real-world modeling of objects which makes them
intently useful.
What is Entity?
An Entity may be an object with a physical existence – a particular person, car, house, or
employee – or it may be an object with a conceptual existence – a company, a job, or a university
course.
Types of Entity –
There are two types of entity:
1. Strong Entity
A Strong Entity is a type of entity that has a key Attribute. Strong Entity does not depend on
other Entity in the Schema. It has a primary key, that helps in identifying it uniquely, and it is
represented by a rectangle. These are called Strong Entity Types.
2. Weak Entity
An Entity type has a key attribute that uniquely identifies each entity in the entity set. But some
entity type exists for which key attributes can’t be defined. These are called Weak Entity types .
For Example, A company may store the information of dependents (Parents, Children, Spouse)
of an Employee. But the dependents can’t exist without the employee. So Dependent will be a
Weak Entity Type and Employee will be Identifying Entity type for Dependent, which means it
is Strong Entity Type.
A weak entity type is represented by a Double Rectangle. The participation of weak entity types
is always total. The relationship between the weak entity type and its identifying strong entity
type is called identifying relationship and it is represented by a double diamond.
What is Attributes?
Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB,
Age, Address, and Mobile_No are the attributes that define entity type Student. In ER diagram,
the attribute is represented by an oval.
Types of Attributes
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key attribute.
For example, Roll_No will be unique for each student. In ER diagram, the key attribute is
represented by an oval with underlying lines.
2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example,
the Address attribute of the student Entity type consists of Street, City, State, and Country. In ER
diagram, the composite attribute is represented by an oval comprising of ovals.
3. Multivalued Attribute
An attribute consisting of more than one value for a given entity. For example, Phone_No (can
be more than one for a given student). In ER diagram, a multivalued attribute is represented by a
double oval.
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived
attribute. e.g.; Age (can be derived from DOB). In ER diagram, the derived attribute is
represented by a dashed oval.
The Complete Entity Type Student with its Attributes can be represented as:
Entity-Relationship Set
A set of relationships of the same type is known as a relationship set. The following relationship
set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as registered in C3.
Relationship Set
1. Unary Relationship: When there is only ONE entity set participating in a relation, the
relationship is called a unary relationship. For example, one person is married to only one
person.
Unary Relationship
2. Binary Relationship: When there are TWO entities set participating in a relationship, the
relationship is called a binary relationship. For example, a Student is enrolled in a Course.
Binary Relationship
3. Ternary Relationship: When there are three entity sets participating in a relationship, the
relationship is called a ternary relationship.
4. N-ary Relationship: When there are n entities set participating in a relationship, the
relationship is called an n-ary relationship.
What is Cardinality?
The number of times an entity of an entity set participates in a relationship set is known as
cardinality . Cardinality can be of different types:
1. One-to-One: When each entity in each entity set can take part only once in the relationship,
the cardinality is one-to-one. Let us assume that a male can marry one female and a female can
marry one male. So, the relationship will be one-to-one.
2. One-to-Many: In one-to-many mapping as well where each entity can be related to more than
one entity and the total number of tables that can be used in this is 2. Let us assume that one
surgeon department can accommodate many doctors. So, the Cardinality will be 1 to M. It means
one department has many Doctors.
3. Many-to-One: When entities in one entity set can take part only once in the relationship set
and entities in other entity sets can take part more than once in the relationship set, cardinality is
many to one. Let us assume that a student can take only one course, but one course can be taken
by many students. So the cardinality will be n to 1. It means that for one course there can be n
students but for one student, there will be only one course.
In this case, each student is taking only 1 course, but 1 course has been taken by many students.
4. Many-to-Many: When entities in all entity sets can take part more than once in the
relationship cardinality is many to many. Let us assume that a student can take more than one
course and one course can be taken by many students. So, the relationship will be many to many.
Participation Constraint
Participation Constraint is applied to the entity participating in the relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship. If each
student must enroll in a course, the participation of students will be total. Total participation is
shown by a double line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT participate in the
relationship. If some courses are not enrolled by any of the students, the participation in the
course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set having total
participation and Course Entity set having partial participation.
Every student in the Student Entity set participates in a relationship but there exists a course C4
that is not taking part in the relationship.
How to Draw ER Diagram?
The very first step is Identifying all the Entities, and place them in a Rectangle, and
labeling them accordingly.
The next step is to identify the relationship between them and place them accordingly
using the Diamond, and make sure that, Relationships are not connected to each other.
The Library Management System database keeps track of readers with the following
considerations –
The system keeps track of the staff with a single point authentication system comprising
login Id and password.
Staff maintains the book catalog with its ISBN, Book title, price(in INR), category(novel,
general, story), edition, author Number and details.
A publisher has publisher Id, Year when the book was published, and name of the book.
Readers are registered with their user_id, email, name (first name, last name), Phone no
(multiple entries allowed), communication address. The staff keeps track of readers.
Readers can return/reserve books that stamps with issue date and return date. If not
returned within the prescribed time period, it may have a due date too.
Staff also generate reports that has readers id, registration no of report, book no and
return/issue info.
Note:
This Library ER diagram illustrates key information about the Library, including entities such as
staff, readers, books, publishers, reports, and authentication system. It allows for understanding
the relationships between entities.
Book Entity : It has authno, isbn number, title, edition, category, price. ISBN is the
Primary Key for Book Entity.
Reader Entity : It has UserId, Email, address, phone no, name. Name is composite
attribute of firstname and lastname. Phone no is multi valued attribute. UserId is the
Primary Key for Readers entity.
Authentication System Entity : It has LoginId and password with LoginID as Primary
Key.
Reports Entity : It has UserId, Reg_no, Book_no, Issue/Return date. Reg_no is the
Primary Key of reports entity.
Staff Entity : It has name and staff_id with staff_id as Primary Key.
Reserve/Return Relationship Set : It has three attributes: Reserve date, Due date,
Return date.
A reader can reserve N books but one book can be reserved by only one reader. The
relationship 1:N.
A publisher can publish many books but a book is published by only one publisher. The
relationship 1:N.
Library card when the student has to show or submit his/her identity as proof.
The overall processing unit will contain the following output that a system will produce or
generate:
The book will be the output as the book demanded by the students will be given to them.
Information on the demanded book should be displayed by the library information system
that can be used by the student while selecting the book which makes it easier for the
student.
1. Level 0 DFD –
2. Level 1 DFD – At this level, the system has to show or exposed with more details of
processing. The processes that are important to be carried out are:
o Book delivery
o Search by topic
List of authors, List of Titles, List of Topics, the bookshelves from which books can be
located are some information that is required for these processes. Data store is used to
represent this type of information.
3. Level 2 DFD –
4. Out of scope:
Other activities like purchasing of new books, replacement of old books or charging a fine are
not considered in the above system.
Generalization
Generalization is the process of extracting common properties from a set of entities and creating
a generalized entity from it. It is a bottom-up approach in which two or more entities can be
generalized to a higher-level entity if they have some attributes in common. For Example,
STUDENT and FACULTY can be generalized to a higher-level entity called PERSON as shown
in Figure 1. In this case, common attributes like P_NAME, and P_ADD become part of a higher
entity (PERSON), and specialized attributes like S_FEE become part of a specialized entity
(STUDENT).
Generalization
Specialization
In specialization, an entity is divided into sub-entities based on its characteristics. It is a top-
down approach where the higher-level entity is specialized into two or more lower-level entities.
For Example, an EMPLOYEE entity in an Employee management system can be specialized into
DEVELOPER, TESTER, etc. as shown in Figure 2. In this case, common attributes like
E_NAME, E_SAL, etc. become part of a higher entity (EMPLOYEE), and specialized attributes
like TES_TYPE become part of a specialized entity (TESTER).
Attribute inheritance : It allows lower level entities to inherit the attributes of higher
level entities and vice versa. In diagram Car entity is an inheritance of Vehicle entity ,So
Car can acquire attributes of Vehicle. Example:car can acquire Model attribute of
Vehicle.
Example of Relation
Aggregation
An ER diagram is not capable of representing the relationship between an entity and a
relationship which may be required in some scenarios. In those cases, a relationship with its
corresponding entities is aggregated into a higher-level entity. Aggregation is an abstraction
through which we can represent relationships as higher-level entity sets.
For Example, an Employee working on a project may require some machinery. So, REQUIRE
relationship is needed between the relationship WORKS_FOR and entity MACHINERY. Using
aggregation, WORKS_FOR relationship with its entities EMPLOYEE and PROJECT is
aggregated into a single entity and relationship REQUIRE is created between the aggregated
entity and MACHINERY.
Difference Between Entity, Entity Set and Entity Type
Entity Entity Type Entity Set
Any particular row (a record) The name of a relation All rows of a relation
in a relation (table) is known (table) in RDBMS is an (table) in RDBMS is
as an entity. entity type entity set