0% found this document useful (0 votes)
12 views503 pages

Dbms Unit1 Merge

The document outlines a course on Database Management Systems, detailing its objectives, course outline, outcomes, and applications. Key topics include relational algebra, SQL querying, normalization principles, and modern data management systems like NoSQL databases. The course aims to equip students with practical skills in database design, query optimization, and data integrity management.

Uploaded by

udaygsneaker
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views503 pages

Dbms Unit1 Merge

The document outlines a course on Database Management Systems, detailing its objectives, course outline, outcomes, and applications. Key topics include relational algebra, SQL querying, normalization principles, and modern data management systems like NoSQL databases. The course aims to equip students with practical skills in database design, query optimization, and data integrity management.

Uploaded by

udaygsneaker
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Database Management Systems

UE23CS351A -Introduction
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgment: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems
Course Objectives:

Course Objectives:
● To introduce fundamental concepts of database systems, relational algebra, and basic
SQL querying.
● To develop the ability to use advanced SQL techniques and understand the strategies for
efficient query processing and optimization.
● To equip students with normalization principles, functional dependencies, and
transaction management to ensure data integrity.
● To understand the modern data management systems including NoSQL - graph and
vector databases.

[Link]
2
Database Management Systems
Course Outline

Unit 1: Introduction to Database Management


Purpose, View of data, Database design- Introduction to databases, Database application
architecture, E-R Model, reducing ER to a relational schema. Relational operations
(Algebra), Unary Operations - Unity, Binary, Aggregate Functions, Grouping, SQL
overview, Data definition, Structure of SQL queries, constraints and keys.
14 Hours
Unit 2: Relational Model and Database Design
Additional Basic Operations, Set Operations, Null Values, Aggregate Functions, Nested
Subqueries, Database Modification, Join expressions, Views, Triggers, Functions, and
Procedures, CTE, windows functions and full text search functions.
14 Hours
3
Database Management Systems
Course Outline

Unit 3: Advanced Design Concepts and Implementation


Introduction to strategies of Query processing and Query optimization, Functional Dependencies,
Inference Rules, Normal Forms Based on Primary Keys (1NF, 2NF, and 3NF), General Definitions of
Second and Third Normal Forms Boyce-Codd Normal Form, Higher Normal Forms.
14 Hours

Unit 4: Advanced Databases


Database transactions, Concurrency control, Locking, Introduction to NoSQL databases, Key-value
database, Graph databases (Neo4j), Introduction to Vector databases, Vector search with
Relational and NOSQL databases, Database Application Connectivity.
14 Hours

4
Database Management Systems
Course Outline

Lab/Hands-on sessions

• Draw an ER diagram for a given problem statement and Conversion of an ER diagram


into Relational schema
• DDL and DML
• SQL – Joins: inner, outer; Sub queries: correlated and uncorrelated
• SQL – Creating Functions and Procedures
• SQL – Creating Triggers, Transaction isolation
• NoSQL database queries
• Vector database
• Designing Database application
Tools/ Languages: MySQL Workbench, Python, Redis, Neo4J, ERwin or any draw tool for
ER modeling

5
Database Management Systems
Course Outcomes

At the end of this course, the student will be able to:

● Demonstrate an ability to explain the structure of a database system, ER


diagrams, relational algebra, and write basic SQL queries involving constraints
and keys
● Construct and optimize SQL queries including nested subqueries, joins, views,
triggers, and stored procedures
● Apply normalization techniques using functional dependencies and design
schemas, and ensure the implementation of transaction and concurrency control
mechanisms.
● Create applications of graph and vector databases.

6
Database Management Systems
Textbooks

Textbook 1 (TB1) Textbook 2 (TB2)

7
Database Management Systems
Textbooks

Text Book(s):
1. “Fundamentals of Database Systems”, Ramez Elmasri, Shamkant B Navathe, Pearson, 7th
Edition, 2017.
2. “Database System Concepts”, Silberschatz, H Korth and S Sudarshan, McGrawHill, 7th
Edition.

3. Essential Concepts of Vector Databases: Mastering Vector Databases for Advanced Data
Management by Paulo Dichone, 1st edition, Packt publisher
Reference Book(s):
1. Database Management Systems, R Ramakrishnan, J Gehrke, 3rd Edition, McGraw Hill,
2002.
2. Data on the Web: From Relations to Semistructured Data and XML, S Abiteboul, P
Buneman, D Suciu, Morgan Kauffman, 1999. 8
Database Management Systems
Course Evaluation Policy

9
Database Management Systems
Introduction

➔ Have you ever wondered on how a large e-commerce website like Amazon is able to retrieve
the products that you would want to purchase?
➔ Nowadays social media (eg: Instagram) one of the most active platforms where people post
their day to day activities etc. how and where is all these information stored? By typing a
person's name we can get to see their entire profile, posts that person has posted how is this
possible?
➔ In a bank, multiple transactions take place everyday. Even if 2 transactions are happening
simultaneously from account, still the money in the accounts is properly deducted and
credited how does this happen?

10
Database Management Systems
Introduction

➔ Suppose you visit your doctor for a general checkup. Now by typing in your name or
your ID, all your medical history details would be visible to the doctor. How does this
happen?
➔ When you search for a specific topic on Google, how does the search engine quickly
retrieve relevant web pages from millions of websites and present them to you?

11
Database Management Systems
Introduction

All the above questions have a single answer and that is DATABASE
Before going ahead and understanding what database exactly and how is designed and
used for all the other application, let us try answering a few more questions:
● What do you mean by Data?
● How is it different from Information?
In current world, there is a lot of data that is available, how are all these stored or
organized so that we can get the required information?

12
Database Management Systems
What Do you mean by Data & Information?

Data
• Data represents the raw material, like individual puzzle
pieces, that serves as the foundation for knowledge.
• Data is a raw and unorganized fact that is required to be
processed to make it meaningful.

Information

• When data is processed, organized, structured, or presented


in a given context to make it useful is called information.
• Data is a collection of facts, while information puts those facts
into context
• It is organized and is utilized by humans in some significant
way to make decisions and draw some conclusions Data doesn’t depend on
information, Information depends
on data
13
Database Management Systems
What is a Database?

Database
 A database is a collection of related data representing some aspect of the real world, also
called the mini-world or the universe of discourse (UoD).
 It is a logically coherent collection, meaning it is not just a random assortment of data but
is organized with inherent meaning and structure.
 Databases are designed, built, and populated for a specific purpose, catering to the needs
of applications or systems that interact with them.
 They can vary in size and complexity, ranging from small databases used by individual
applications to large-scale enterprise databases handling vast amounts of data.

Could you think of what do these individual terms mean w.r.t student information Database?
14
Database Management Systems
What is a Database

15
Database Management Systems
Database Management System

Database Management System


• A modern database system is a complex software system whose task is to
manage a large, complex collection of data.
• DBMS contains information about a particular enterprise
○ Collection of interrelated data
○ Set of programs to access the data
○ An environment that is both convenient and efficient to use
• Database systems are used to manage collections of data that are:
○ Highly valuable
○ Relatively large
○ Accessed by multiple users and applications, often at the same time.
Let us break these statements down further and understand them in the context of
the student information database

17
Database Management Systems
Database Management System

Database Management System

• A general-purpose software system that facilitates the processes of defining, constructing,


manipulating, and sharing databases among various users and applications
○ Defining a database involves specifying the data types, structures, and constraints of
the data to be stored in the database. By defining the structure and data types, the
DBMS ensures the consistency and integrity of the stored data.
➢ The "Students" table may include fields such as "Student ID" (numeric), "Name"
(text), "Email" (text), "Date of Birth" (date), and "Major" (text).
➢ The "Courses" table may have fields like "Course ID" (numeric), "Course Name"
(text), "Credits" (numeric), and "Instructor" (text).

18
Database Management Systems
Database Management System

Database Management System

• A general-purpose software system that facilitates the processes of defining,


constructing, manipulating, and sharing databases among various users and
applications
○ Constructing the database is the process of storing the data on some
storage medium that is controlled by the DBMS. It organizes the data in a
way that allows for efficient retrieval and manipulation. The DBMS ensures
data is securely stored and can be accessed by authorized users or
applications.
➢ The DBMS may store the Student Information Database on a server
with appropriate data storage capacity and performance capabilities.
19
Database Management Systems
Database Management System

Database Management System

• A general-purpose software system that facilitates the processes of defining,


constructing, manipulating, and sharing databases among various users and
applications
○ Manipulating a database includes functions such as querying the database to
retrieve specific data, updating the database to reflect changes in the mini
world, and generating reports from the data. It ensures that any changes made
to the database follow predefined constraints and maintain data consistency.
➢ A university administrator may use SQL queries to retrieve specific
information, such as "Retrieve all students enrolled in Computer Science
courses." The DBMS processes the query, retrieves the relevant data from
the database, and presents it to the user.
➢ Allows authorized users to update the database, such as adding new
students or updating grades. 20
Database Management Systems
Database Management System

Database Management System

• A general-purpose software system that facilitates the processes of defining,


constructing, manipulating, and sharing databases among various users and
applications
○ Sharing a database allows multiple users and programs to access the
database simultaneously, provided they have the necessary permissions
➢ Faculty members, administrative staff, and students can access relevant
information concurrently, as long as they have the necessary
permissions.
➢ For example, faculty members can access student details for their
courses, while administrators can update and maintain the overall
database.
21
Database Management Systems
Questions

Some questions to revise


1) How do you define data, and how does it differ from information in the context of
database management systems?
2) List out the data types you would take to define a database for an online bookstore.
3) Explain the role of a DBMS in managing highly valuable and relatively large
collections of data. Why is this important for organizations?
4) What are some of the common operations performed on a database? How do these
operations interact with the data stored in the database?

22
Database Management Systems
Why study databases

Databases help us to:


• Store large amounts of data (file structure, disk management)
• Understand the data (data models)
• Keeps data secure (security, recovery)
• Find required data and use/manipulate it (query languages, concurrency
control, and data analysis tools)
• Get accurate information (as databases have built-in constraints and checks help
in this)
• Maintain Data integrity (ensures data is accurate and consistent)

Data Integrity Example: Ensuring a student’s ID is unique (Entity Integrity), all enrollments refer to valid students (Referential Integrity),
ages are valid numbers (Domain Integrity), and business rules like discount prices are logically correct (User-Defined Integrity).
23
Database Management Systems
Database Applications

1) Enterprises
● Customer Relationship Management (CRM): Databases store customer interactions,
preferences, and sales data.
● Supply Chain Management: Databases track inventory levels, supplier information,
and logistics.
● Human Resources: Manage employee data, payroll, and benefits using databases.
● Financial Systems: Handle accounting, budgeting, and financial reporting through
database solutions.
● Example: Salesforce CRM is widely used by enterprises to manage customer
relationships and sales data within a database.

24
Database Management Systems
Database Applications

2) Manufacturing
● Production Tracking: Databases monitor production schedules, inventory levels, and
quality control.
● Supply Chain Management: Manage supplier information, procurement, and
logistics using databases.
● Product Lifecycle Management: Track product design, development, and updates in
a database.
● Example: Toyota uses databases to manage their Just-In-Time (JIT) manufacturing
processes, ensuring efficient production and inventory management.

25
Database Management Systems
Database Applications

3) Banking and Finance


● Account Management: Databases store customer accounts, transactions, and
balance information.
● Loan Processing: Manage loan applications, approvals, and repayments through a
database system.
● Fraud Detection: Monitor transactions for unusual activity and potential fraud using
database systems.
● Example: JPMorgan Chase uses advanced database systems to manage customer
accounts, process transactions, and detect fraud.

26
Database Management Systems
Database Applications

7) Web-based Services
● User Data Management: Databases store user profiles, preferences, and activity
logs.
● Content Management: Manage website content, including articles, images, and
videos, using a database system.
● Analytics: Track user interactions, traffic patterns, and conversion rates through
databases.
● Example: Netflix uses a sophisticated database system to manage user preferences,
content recommendations, and streaming data.

27
Database Management Systems
Database Applications

8) Document Databases
● Data Storage: Document databases store semi-structured data in formats like JSON, BSON, or
XML, allowing for flexible schema design and easy data retrieval.
● Content Management: Manage collections of documents such as articles, patents, research
papers, and other unstructured content.
● Scalability: Document databases are designed to handle large volumes of data and can scale
horizontally across multiple servers.
● Use Cases: Often used for content management systems, catalogues, and data lakes where data
doesn't fit neatly into relational models.
● Example: MongoDB is a popular document database used by companies like The New York Times
to manage and deliver vast amounts of news articles and media content.

28
Database Management Systems
Database Applications

9) Navigation Systems
● Location Data: Databases store geographic information including locations of places of
interest, routes, and geographic features.
● Real-Time Updates: Manage real-time traffic data, route optimization, and point-of-interest
information.
● User Interaction: Provide features like route planning, navigation guidance, and local
searches.
● Integration: Often integrated with other systems such as weather services, local businesses,
and emergency services to enhance user experience.
● Example: Google Maps uses a complex database system to manage and update location
data, provide navigation services, and offer real-time traffic updates to millions of users
worldwide.

29
Database Management Systems
How are databases used?

There are two modes in which databases are used today

• Online Transaction Processing (OLTP):


○ Used by a large number of users for small data retrieval and updates
○ common in most database applications like banking, universities, and airlines.

• Data Analytics/Online Analytical Processing(OLAP):


○ Involves processing data to draw conclusions and create predictive models for business
decisions.
○ Examples include loan approval, targeted advertisements, and manufacturing
decisions.
○ Data mining combines AI and statistical techniques for efficient analysis of large
databases.

30
Database Management Systems
View of Data

• A database system is a collection of interrelated data and a


set of programs that allow users to access and modify these
data.
• A major purpose of a database system is to provide users
with an abstract view of the data.
○ Data models
➢ A collection of conceptual tools for describing data,
data relationships, data semantics and constraints.
○ Data abstraction
➢ Hide the complexity of data structures to represent
data in the database from users through several
levels of data abstraction.
• The structure of a database is defined by its data model,
which includes tools to describe data, relationships,
semantics, and consistency constraints.
31
Database Management Systems
Questions

Some questions to be explored??


1) List and describe three real-world applications of databases in different industries. How do
these applications benefit from using databases?
2) What are the two modes in which databases are used today?
3) Discuss the various data models. How does the relational model differ from the semi-
structured and object-based models?
4) Explain the significance of document databases in storing and managing semi-structured data.
How does MongoDB exemplify the use of document databases in real-world applications?

32
Database Management Systems
How are databases used?

Some online resources -

● [Link]
● [Link]
● [Link]
● [Link]

33
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

34
Database Management Systems
Purpose of Database System
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgment: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
File-processing Systems

File-based System:
● One way to keep information on a computer is to store it in permanent files.
● A company system has a number of application programs; each of them is
designed to manipulate data files.
● These application programs have been written at the request of the users in the
organization.
● New applications are added to the system as the need arises.

3
Database Management Systems
Challenges of File-processing Systems

Let us take an example of a University Organization system. We need to store


information about the following entities:
• Instructors, Students, Departments, Courses offered.
• One of the ways to keep this information is to store it in operating file systems
• To allow the users to manipulate the information, the system would have a
number of application programs that manipulate the files including:
○ Addition of new students, instructors, and courses
○ Register students for different courses
○ Assign grades to students, compute GPA, and generate transcripts.

4
Database Management Systems
Challenges of File-processing Systems

Scenario: The university decides to add a new major.

What specific steps must be taken to integrate a new major into the university's existing system?

1. Create a New Department:


○ Set up a new department for the major.
○ Create new permanent files for department information.
2. Develop New Application Programs:
○ Write programs to manage rules for the new major.
○ Ensure programs handle course registration, grading, and transcripts for the new major.

This typical file-processing system is supported by a conventional operating system. The system stores
permanent records in various files, and it needs different application programs to extract records from
and add records to, the appropriate files.

5
Database Management Systems
Challenges of File-processing Systems

What is the need for a Database Management System when we can do the university
organization using a simple file-processing system?

Consider the following situations:


• Double Major Students :

What if there is a student who has enrolled for a double major (Physics and Computer
Science) how many times do we store all his details?
• Updating information :

What happens if a student changes their address or phone number? Is it guaranteed that
the update reflects in all relevant files across departments?

6
Database Management Systems
Challenges of File-processing Systems

● Efficient Data Retrieval:

The university transport department wants to find out all students who live within a
particular postal code area. Can he quickly retrieve the data or does he have to build
another application for it?
● Data Format Consistency:

What if files that store these data are written in different formats is it easy to access the
data in this form?
● Adding New Constraints:

Suppose a new constraint is to be added, can it be done efficiently without having to make
extensive changes to existing application programs?

7
Database Management Systems
Challenges of File-processing Systems and the Purpose of Database Systems

There are a lot of drawbacks to the file-processing system :


• Data Redundancy and Inconsistency

• Difficulty in accessing the data

• Data isolation

• Integrity problems

• Atomicity problems

• Concurrent access anomalies

• Security problems

8
Database Management Systems
Purpose of Database Systems

1. Data Redundancy & Inconsistency


Consider the question of the student enrolled for a
double major. What would happen in this situation in
case of file system?
Solution to this problem:
Details of the students would be duplicated for each - Store information in centralized
department - Data Redundancy location thus Minimizes
redundancy
This would lead to the following problems: - Changes can be made at a single
spot, thus ensuring consistency
• Takes up more storage space - This can be achieved through a
• May also lead to Data Inconsistency in case of database system
changes

9
Database Management Systems
Purpose of Database Systems

2. Difficulty in accessing the data


Consider that the university wants to find all the students
living in a specific postal code area. How will we execute The solution to this problem:
this in traditional file system? - Central information store for fast
• Either we need to get a list of all students and retrieval
- Use a powerful query language
manually sift through it
for answering new queries
• Or create another special program for this task - Efficient ways to search and filter
Both the above-mentioned are time-consuming and data for speeding up data
inconvenient retrieval
- This can be achieved through a
Thus we can see that Conventional file-processing database system
systems are not designed for quick and efficient data
retrieval, especially when new types of queries arise.

10
Database Management Systems
Purpose of Database Systems

3. Data Isolation

Consider a concurrency situation, The solution to this problem:


where identifying when and how changes made by one
operation become visible to other concurrent users. - Ability to allow multiple
transactions to access same
This is a challenge because New applications has to
retrieve the appropriate data, which might be stored in data without interfering with
various files. each other
- Consistent view of the data
A file-based system must manage, or prevent,
concurrency by the application programs.

When an application opens a file, it will lock the file. This


means that no one else has access to the file at the same
time. 11
Database Management Systems
Purpose of Database Systems

4. Integrity problems

Scenario: Each department has an account for research The solution to this problem:
funds, and the account balance must always be above 0. - Specify integrity constraints at the
How can we implement this? database level to specify rules directly in
the schema
Each application program is created for maintaining - System should enforce constraints
departmental accounts but adding new constraints
automatically
becomes challenging as it requires editing all the existing - To add a new constraint, updating the
application programs.
schema alone will be sufficient, and the
system will handle the rest
automatically.
- This would be easily solved by using
DBMS which provides all these facilities

12
Database Management Systems
Purpose of Database Systems

5. Atomicity problems:
In a computer system failure can happen unexpectedly
Consider a banking system, a sum of 5000 rupees is getting
transferred from account A to B.
What happens internally?
The solution to this problem:
● An amount of 5000 Rs. gets deducted from account A - Atomicity should be ensured
● The amount (5000 Rs.) gets credited to account B - Automatic rollback in case of
Now either both the steps should be performed or none failure
should be executed. - DBMS would ensure this and
In a file system, The program might successfully deduct 5000 makes it reliable and consistent
from account A but fail to credit it to account B. even if failure occurs
This results in an inconsistent database state, with money
missing from one account and not added to the other.

13
Database Management Systems
Purpose of Database Systems

6. Concurrent Access Anomalies


Consider a bank transaction example. There is a bank
account (Account A) with a balance of $10,000.

● The solution to this problem:


Two bank clerks attempt to deduct money from - Transaction and Locking to be
Account A at almost the same time - one deducts
implemented
$500 and the other $100. -
● Transactions ensure atomicity
Since they work concurrently, they both read the - Locking prevents conflicting
initial balance of $10,000, subtract their respective
changes.
amounts, and write back the results. -
● The DBMS would provide these
Depending on which one writes the new balance
and maintains data consistency
last, the account could end up with either $9,500 or
and integrity
$9,900 instead of the correct value of $9,400.

14
Database Management Systems
Purpose of Database Systems

7. Security problems
Imagine a university using a file-based system.
Over time, they've added various application The solution to this problem:
programs without a central security system. - Create a provision for administrator to
define access controls and permissions for
As a result, payroll personnel might each user or group.
accidentally see and even change academic - These permissions should enforced at the
records, compromising data privacy and database level, regardless of the
confidentiality because everyone has the same application programs used.
permissions. - DBMS supports various methods for
introducing security like User
Authentication, Data Encryption, Auditing
and Logging, Centralized security
management

15
Database Management Systems
Characteristics of Database Systems

• Self-describing nature of a database system


• The database system contains not only the database itself but also a complete
definition or description of the database structure and constraints (Metadata).
• Insulation between programs and data, and data abstraction
• The structure of data files is stored in the DBMS catalog separately from the
access programs (program-data independence)
• Support multiple views of the data
• DBMSs support different types of users, each of whom may require a different
perspective or view of the database. A view may be a subset of the database or it
may contain virtual data that is derived from the database files but is not explicitly
stored.
• Sharing of data and multi user transaction processing
• A multi user DBMS, allows multiple users to access the database at the same time.
It ensures that data changes are effected in a controlled manner so that the result
of the changes are correct 16
Database Management Systems
Advantages of Database Systems

• Controlling redundancy in data storage.


• Sharing of data among multiple users.
• Restricting unauthorized access to data
• Providing persistent storage for program Objects
○ E.g., Object-oriented DBMSs make program objects persistent
• Providing Storage Structures and search techniques for efficient Query Processing
• Providing backup and recovery
• Providing multiple interfaces to different classes of users.

17
Database Management Systems
Advantages of Database Systems

• Representing complex relationships among data.


• Enforcing integrity constraints on the database.
• Drawing inferences and actions from the stored data using deductive and active rules
and triggers
• Potential for enforcing standards
• Reduced application development time
• Flexibility to change data structures
• Availability of current information
• Economies of scale

18
Database Management Systems
Q&A

Identify the Problem: Case Study Applications


1. Scenario 1: Banking Transactions:

At a bank, two customers attempt to withdraw money from the same account
simultaneously. Due to a lack of concurrency control, the system fails to manage the
simultaneous transactions correctly, resulting in the account being overdrawn.

Problem Type: Concurrent Access Anomalies

19
Database Management Systems
Q&A

Scenario 2: Customer Records:

A company stores customer information in multiple spreadsheets, with each department


maintaining its own copy. Occasionally, customer addresses and contact details differ
between spreadsheets, leading to confusion and errors in communication.

Problem Type: Data Redundancy and Inconsistency

20
Database Management Systems
Q&A

Scenario 3: Inventory and Orders:

A small business uses separate systems to manage inventory and orders. The inventory
system is not updated in real-time with sales data, leading to situations where the business
oversells products, unaware that the stock has been depleted.

Problem Type: Data Isolation

21
Database Management Systems
Q&A

Scenario 4: Student Grades:

A university records student grades manually, and sometimes errors are made in entering
the data, such as typing grades incorrectly. There is no system in place to check for or
correct these errors automatically.

But, if grade is entered as ‘x’ which is not a valid grade then constraints can be added to
handle.

Problem Type: Integrity Problems

22
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]
23
Database Management Systems
Database Design
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Data Abstraction

● DBMS plays an important role in many areas. One of the main requirements is that it must retrieve the
data efficiently.
● This need for efficiency has led the database developers to use complex data structures to represent
data in a database
● Many database-system users are not computer trained and do not know how to write complex queries
for getting information they are looking for.
● Hence developer hide all these complexity from the users
through several layers of data abstraction, in order to simplify
the users’ interaction with the system
• The different levels of data abstraction:
○ Physical Level
○ Logical level
○ View level
Database Management Systems
Data Abstraction

Let us now have a look at what happens at each level:

Physical Level :
• It is the lowest level of data abstraction
• It describes how the data is actually stored in the computer

Logical Level :
• This describes what kind of data is stored in the database and how different pieces
of information are related to each other
• The logical level thus describes the entire database in terms of a small number of
relatively simple structures.
• This level provides a more organized and simpler view of the data

4
Database Management Systems
Data Abstraction

Let us now have a look at what happens at each level:

View Level :
• This is the highest level of abstraction that individual users interact with and that
only displays a part of the entire database

• Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database. Many users of the database
system do not need all this information; instead, they need to access only a part of
the database.
• At this level, different users might see different parts of the database, depending on
their needs and permissions

5
Database Management Systems
Data Abstraction

Now we can notice that:


• Although the implementation of simple structures at the logical level may involve
complex physical-level structures, the user at the logical level does not need to be
aware of this.
• Similarly the user who views the database for various purposes need not be aware
of the Logical level implementation, for them it would be like an interface and need
not worry about the logical view

6
Database Management Systems
Data Abstraction

PHYSICAL DATA INDEPENDENCE:


• It refers to the characteristic of being able to modify the physical
schema without any alterations to the conceptual or logical
schema, done for optimization purposes.
Example :
• the Conceptual structure of the database would not be affected by
any change in the storage size of the database system server.
Changing from sequential to random access files is one such
example.
• These alterations or modifications to the physical structure may
include:
○ Utilizing new storage devices.
○ Modifying data structures used for storage.
○ Altering indexes or using alternative file organization techniques
etc. 7
Database Management Systems
Data Abstraction

LOGICAL DATA INDEPENDENCE


• It refers characteristic of being able to modify the logical schema
without affecting the external schema or application program.
• The user view of the data would not be affected by any changes to
the conceptual view of the data.

Example:
• These changes may include the insertion or deletion of attributes,
• altering table structures entities or relationships to the logical
schema, etc.

8
Database Management Systems
Data Abstraction

Now let us understand this with the help of an example of a


University database
• Let the university have the given Record types/ tables as shown
Now for the given database,
• Physical level:
○ Specify how student records, course information, and
department data are stored on the hard drive or in memory.
• Logical Level:
○ It would define entities like "Department," "Student," and
"Course" and their attributes (e.g., department name,
student ID, course code).
○ It also outlines the relationships between these entities,
such as "Students belong to Departments" and "Courses are
taken by Students.“
9
Database Management Systems
Data Abstraction

• View Level:
○ A faculty member may have access to a view showing only the
courses they are teaching and the students enrolled in those
courses.
○ A student, on the other hand, may have a view displaying only
the courses they are registered for and their respective grades.
○ These views provide a simplified and customized perspective of
the database for different users.

10
Database Management Systems
Data Abstraction

Let us consider the situation where the University decides to migrate


its database from one storage system to another or optimize the
storage layout to improve performance. What would be the
consequence of this?

Without Physical data independence:


• Minor changes to the storage system or hardware upgrades could
potentially break existing queries, applications, and interfaces.
• As a result, any update to the storage system would require
extensive modifications to the application layer, leading to higher
maintenance costs and longer development cycles.

11
Database Management Systems
Data Abstraction

Let us consider the situation where the University decides to migrate


its database from one storage system to another or optimize the
storage layout to improve performance. What would be the
consequence of this?

With physical data independence:


• the database administrators can make these changes without
altering the logical schema, which means users' queries and
applications will continue to work seamlessly.
• It allows for easier maintenance, upgrades, and performance
enhancements without disrupting the application layer.

12
Database Management Systems
Data Abstraction

Let's say the University decides to reorganize its department


structure and add new attributes to the "Department" entity. Then
what would be the effect?

Without Logical data independence:


• Any alteration to the logical schema, such as adding or removing
attributes, or changing relationships between entities, would
result in changes propagating throughout the applications that
rely on the database.
• This would force developers to modify all affected application
code to accommodate the changes in the schema.
• Consequently, even minor changes to the data model could lead
to widespread disruption and downtime for the applications,
increasing the risk of introducing errors and inconsistencies.
13
Database Management Systems
Data Abstraction

Let's say the University decides to reorganize its department


structure and add new attributes to the "Department" entity. Then
what would be the effect?

With Logical data independence:


• The database designers can make these modifications without
impacting the existing application programs that access the
database.
• This flexibility allows for easier adaptability to changes in the
University's requirements and business rules without causing
application disruptions.

14
Database Management Systems
Instances and Schema

• Databases change over time as information is inserted and deleted or modified.

• The collection of information stored in the database at a particular moment is called an instance of
the database.
• The overall design of the database is called the database schema.

• Let us try to understand what that exactly means with an analogy to a program written in any
programming language
• Database Schema => variable declarations along with the associated type definitions

• Instances => Value of the variable at a particular point in time in the program

15
Database Management Systems
Three Schema Architecture

16
Database Management Systems
Database State

● The actual data in a database may change quite frequently.


● The data in the database at a particular moment in time is called a database state or
snapshot.
● It is also called the current set of occurrences or instances in the database.
● In a given database state, each schema construct has its own current set of instances.
● Many database states can be constructed to correspond to a particular database schema.

21
Database Management Systems
Database State V/S Schema

● The distinction between database schema and database state is very important.
● When we define a new database, we specify its database schema only to the DBMS.
● At this point, the corresponding database state is the empty state with no data.
● We get the initial state of the database when the database is first populated or loaded with the
initial data.
● From then on, every time an update operation is applied to the database, we get another
database state. At any point in time, the database has a current state.

22
Database Management Systems
Database State V/S Schema

A simplified COMPANY relational database schema

Sample database state for the relational database schema

23
Database Management Systems
Why Database Design ?

Scenario:

Imagine you are tasked with designing a database for a new online bookstore. The
bookstore needs to manage a large catalog of books, customer information, orders,
and reviews. Customers should be able to search for books, place orders, leave
reviews, and track their order history.

Question
What factors would you need to consider while designing the database system to ensure it
meets all the needs of the online bookstore?

(Hint :
●Types of data Eg: books, customers, orders, reviews?
●How will you ensure data integrity and security?)

24
Database Management Systems
Database Design

● Database design mainly involves the design of the database schema.


● The design of a complete database application environment that meets the needs of the
enterprise being modeled requires attention to a broader set of issues.
● A high-level data model gives designers a conceptual framework to specify user data
requirements and structure the database accordingly.

The various steps involved in the design process are :

25
Database Management Systems
Database Design

Let us now try and understand what happens at each step:

1. USER REQUIREMENT SPECIFICATION


• Interact with domain experts and users to fully understand the data needs and functional
requirements of the database.
• Identify the operations or transactions that users will perform on the data.
• Document and review the user requirements to ensure clarity and accuracy.
2. CONCEPTUAL DESIGN PHASE
• Choose a high-level data model (e.g., entity-relationship model) to conceptualize the data
requirements and relationships.
• Develop a conceptual schema that provides a detailed overview of the enterprise's data structure.
• Review the schema to ensure all data requirements are met and remove any redundant features.

26
Database Management Systems
Database Design

3. LOGICAL DESIGN PHASE


• Map the high-level conceptual schema to the implementation data model of the database system that will
be used (e.g., relational model).
• Use normalization algorithms to generate a set of tables that properly group attributes and reduce data
redundancy.
• Confirm that the logical design meets the functional requirements specified in the user requirements
phase.

4. PHYSICAL DESIGN PHASE


• Specify the physical features of the database, including file organization and internal storage structures.
• Consider performance optimization and choose appropriate indexing techniques to enhance data
retrieval speed.
• Review and refine the physical design to ensure it aligns with the requirements and constraints of the
database system and hardware.
27
Database Management Systems
Typical DBMS Component Modules

28
Database Management Systems
Typical DBMS Component Modules

• SDM - controls access to DBMS info that is stored on disk (dotted lines)
• DDL Compiler - processes schema definitions specified in the DDL and
stores metadata in catalog
• Query Compiler – handles high level queries - parses and compiles a query
and calls to runtime processor for execution
• Pre Compiler – extracts DML commands from application program
• Run time db processor – executes privileged commands/queries/canned
transactions.

29
Database Management Systems
Database System Utilities

• Loading utility – Load existing data files into a database. Includes data conversion tools
from old format to new db format
• Backup utility - Back up of the database periodically. (Incremental/Total)

• Database storage reorganization - Reorganizing database file structures and create new
access paths to improve performance.
• Performance monitoring utilities. – monitors db usage and provides statistics to DBA

30
Database Management Systems
Database Architecture

Scenario:
Imagine you are responsible for the database system of a popular online multiplayer game. The game
has thousands of players online simultaneously, from different parts of the world. Each player needs
real-time updates for their game stats, friends list, and in-game purchases.

Question:
How would you design the database system to ensure efficient, secure, and real-time data access for all
players?

31
Database Management Systems
Database Architecture

• Centralized databases (Centralized architecture)

Characteristics:
• Single Location: All data is stored and managed on a central
server or database.
• Shared Memory: Suitable for systems with multiple CPUs
that access a common shared memory space.
• Resource Allocation: Resources like CPU, memory, and
storage are centralized, making it easier to manage and
maintain.

Use Cases:
• Small to Medium-Sized Applications: Suitable for
applications with moderate data needs and user loads.
• Legacy Systems: Often used in older systems where scaling is
not as critical or feasible.
Centralized/Shared Memory DB

32
Database Management Systems
Database Architecture

Client Server Architecture:

• Client-server architecture is a model where a client (the


user or application) interacts with a server (the
database system) to access and manage data.

• The client is the user-facing component that interacts


with the server to request data or perform operations.
(web browser, mobile apps, desktop applications)

• The server is the backend component that manages the


database, processes requests from clients, and handles
data storage and retrieval. (MySQL, Oracle, Postgres)

33
Database Management Systems
Database Architecture

Parallel databases:

• Designed to run on a cluster of multiple machines


• Enables better scalability and higher processing capabilities.
• Many cores shared memory and Shared Disk

Parallel databases are ideal for high-performance applications


requiring extensive data processing, such as large-scale data
warehousing, real-time analytics, and high-throughput transactional
systems.

Example: How do you think NASA manages and processes the vast
amounts of satellite data it collects?
Answer: NASA uses parallel databases and high-performance
computing (HPC) systems. NASA can efficiently handle and analyze
massive datasets from satellite missions, enabling quick and
insightful analysis for complex tasks like climate modeling and NASA’s HPC Cluster
environmental monitoring.
34
Database Management Systems
Application Architectures

Distributed Database:
A distributed database is a database system in which data is stored
across multiple physical locations. These locations can be spread
across various machines within a single data center or across
different geographic locations.
Advantages:
• Scalability: Can handle large amounts of data and high traffic by
distributing the load across multiple nodes.
• Fault Tolerance: Provides high availability and fault tolerance
through data replication and distribution.
• Performance: Can improve performance by distributing queries
and data access across multiple servers.

35
Database Management Systems
Two-Tier Client Server Architectures

• Client: This is the user interface where users


interact with the application. It can be a
desktop application or a web-based interface.
• Server: This is where the database is stored
and managed. The server handles data
storage, retrieval, and management tasks.
• Client programs send query and transaction
requests via ODBC or JDBC API, the server
In a two-tier database architecture, there are two processes these requests and sends the
main components: the client (user interface) and results back to the client, which then
the server (database). processes and displays the results.

36
Database Management Systems
Two-Tier Client Server Architectures

Clients
• Provide appropriate interfaces through a client software module to access and utilize the various server resources.
• Clients may be diskless machines or PCs or Workstations with disks with only the client software installed.
• Connected to the servers via some form of a network.
• (LAN: local area network, wireless network, etc.)
Servers
• Provides database query and transaction services to the clients
• Relational DBMS servers are often called SQL servers, query servers, or transaction servers
• Applications running on clients utilize an Application Program Interface (API) to access server databases via
standard interface such as:
• ODBC: Open Database Connectivity standard
• JDBC: for Java programming access
• Client and server must install appropriate client module and server module software for ODBC or JDBC

37
Database Management Systems
Three-Tier Client Server Architectures

● Common for Web applications


● Intermediate Layer called Application Server or Web Server:
○ Stores the web connectivity software and the business
logic part of the application used to access the
corresponding data from the database server
○ Acts like a conduit for sending partially processed data
between the database server and the client.
● Three-tier Architecture Can Enhance Security:
○ Database server only accessible via middle tier
○ Clients cannot directly access database server

38
Database Management Systems
Questions

Scenarios for Architecture Selection:


1. Weather Forecasting System: A weather forecasting system collects data from
various sensors and satellites, processes the data to generate forecasts, and
distributes the information to users through a web interface and mobile application.
2. E-Learning Platform: An e-learning platform offers online courses to students
around the world. The platform needs a database to manage course content, track
student progress, and support interactive features such as quizzes and discussion
forums.
3. Social Media Platform: A social media platform with millions of users worldwide
needs a database system that can handle high read and write loads, store large
amounts of user-generated content, and provide fast access to data.

39
Database Management Systems
Questions

Scenarios for Architecture Selection: Solutions -


• Weather Forecasting System: Parallel database architecture.
This architecture can handle large volumes of data from sensors and satellites, process it
efficiently, and distribute forecasts quickly through web and mobile interfaces.
• E-Learning Platform: 3-tier architecture.
This architecture supports the management of course content, tracking of student
progress, and interactive features like quizzes and discussion forums, ensuring scalability
and efficient data access.
• Social Media Platform: Distributed database architecture.
This architecture supports high read and write loads and large amounts of user-
generated content, providing fast access to data for users worldwide.

40
Database Management Systems
Database Users

People who work with a database can be categorized as:


1. Database Users
2. Database Administrators.

Database Users
There are four different types of database-system users, differentiated by the way they
expect to interact with the system. Different types of user interfaces have been designed
for different types of users.
1. Naïve Users
2. Application Programmers
3. Sophisticated Users
4. Database Administrators

41
Database Management Systems
Database Users

Naïve Users:
• Naive users also known as Parametric End users, don't have any knowledge of databases
but still frequently use the database applications to get the desired results. With the help
of the interface provided by the DBMS applications.

• For Example: Have you ever tried to book a ticket for your favorite movie? It's
straightforward and easy.
• But, do you have any idea about the complex database operations happening behind the
scenes to make this process smooth and efficient?
• In database terminology, you are a naive user or a parametric end user. This means you
interact with the database through a predefined interface without needing to know
anything about how the database works.

42
Database Management Systems
Database Users

Application programmers :
• Application Programmers also known as Back-End Developers, are computer
professional users who are responsible for developing the application programs.
• Example: For an e-commerce site, application programmers connect the app to the
database, write queries to display products, manage user registrations and orders,
handle database errors, and secure user data.

Sophisticated Users:
• Sophisticated users are individuals who possess a deep understanding of database
management systems (DBMS) and are proficient in using Data Definition Language
(DDL) and Data Manipulation Language (DML) commands.
• Example: Data engineers and developers often access databases directly using SQL
queries to fetch, delete, update, or insert data, making them sophisticated users.

43
Database Management Systems
Database Administrator

One of the main reasons for using DBMSs is to have central control of both the data and the
programs that access those data.
Database Administrator is an individual or a team of users who defines the database schema and
takes charge of controlling various levels of the database within the organization.
Functions of a Database Administrator (DBA):
1. Schema definition: DBA creates the original database schema using data definition statements
(DDL).
2. Storage structure and access-method definition: DBA specifies parameters for data physical
organization and index creation.
3. Schema and physical-organization modification: DBA makes changes to reflect organization's
needs or enhance performance.
4. Granting authorization for data access: DBA regulates user access by granting different types of
authorization.
5. Routine maintenance: DBA performs periodic backups, ensures disk space availability, and
monitors database performance.
44
Database Management Systems
Q&A

Questions:

1. "Application programmers are responsible for defining storage structures and access
methods in the database.“ – True/False

2. "A __________ user interacts with a system using predefined interfaces and does
not need to write code.“

3. "The role of a __________ includes creating the original database schema and
managing user access.“

4. "What type of user would most likely use a query language to retrieve specific data
from a database?“

45
Database Management Systems
Q&A

Questions:

1. "Application programmers are responsible for defining storage structures and access
methods in the database.“ – False (Database Administrators handle this)

2. "A Naïve user interacts with a system using predefined interfaces and does not need
to write code.“

3. "The role of Database Administrator a includes creating the original database


schema and managing user access.“

4. "What type of user would most likely use a query language to retrieve specific data
from a database?“ - Sophisticated User

46
Database Management Systems
Q&A

Some more questions for revision:

1. What are the four main phases of database design? Briefly explain what happens in
each phase.
2. What is the role of the storage manager in a DBMS? List any three components it
manages.
3. What are the three main steps in query processing, and why is query optimization
important?
4. Define a transaction in DBMS. What are the key properties a transaction must
maintain?

47
Database Management Systems
Q&A

5. Define the three levels of data abstraction in a DBMS and explain with an example
from a university database.
6. What is the difference between physical and logical data independence? Give one
example for each.
7. Differentiate between database schema and database instance with an analogy from
programming.
8. What is a view schema? How does it help with security and user customization?

48
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

49
Database Management Systems
Database Design
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Data Abstraction

● DBMS plays an important role in many areas. One of the main requirements is that it
must retrieve the data efficiently.
● This need for efficiency has led the database developers to use complex data
structures to represent data in a database
● Many database-system users are not computer trained and do not know how to
write complex queries for getting information they are looking for.
● Hence developer hide all these complexity from the users
through several layers of data abstraction, in order to simplify
the users’ interaction with the system
• The different levels of data abstraction:
○ Physical Level
○ Logical level
○ View level
Database Management Systems
Data Abstraction

Let us now have a look at what happens at each level:

Physical Level :
• It is the lowest level of data abstraction
• It describes how the data is actually stored in the computer

Logical Level :
• This describes what kind of data is stored in the database and how different pieces of
information are related to each other
• The logical level thus describes the entire database in terms of a small number of
relatively simple structures.
• This level provides a more organized and simpler view of the data

4
Database Management Systems
Data Abstraction

Let us now have a look at what happens at each level:

View Level :
• This is the highest level of abstraction that individual users interact with and that only
displays a part of the entire database
• Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database. Many users of the database
system do not need all this information; instead, they need to access only a part of
the database.
• At this level, different users might see different parts of the database, depending on
their needs and permissions

5
Database Management Systems
Data Abstraction

Now we can notice that:


• Although the implementation of simple structures at the logical level may involve
complex physical-level structures, the user at the logical level does not need to be
aware of this.
• Similarly the user who views the database for various purposes need not be aware of
the Logical level implementation, for them it would be like an interface and need not
worry about the logical view

6
Database Management Systems
Data Abstraction

PHYSICAL DATA INDEPENDENCE:


• It refers to the characteristic of being able to modify the physical
schema without any alterations to the conceptual or logical schema,
done for optimization purposes.
Example :
• the Conceptual structure of the database would not be affected by
any change in the storage size of the database system server.
Changing from sequential to random access files is one such
example.
• These alterations or modifications to the physical structure may
include:
○ Utilizing new storage devices.
○ Modifying data structures used for storage.
○ Altering indexes or using alternative file organization techniques
etc. 7
Database Management Systems
Data Abstraction

LOGICAL DATA INDEPENDENCE


• It refers characteristic of being able to modify the logical schema
without affecting the external schema or application program. The
user view of the data would not be affected by any changes to the
conceptual view of the data.

Example:
• These changes may include the insertion or deletion of attributes,
• altering table structures entities or relationships to the logical
schema, etc.

8
Database Management Systems
Data Abstraction

Now let us understand this with the help of an example of a


University database
• Let the university have the given Record types/ tables as shown
Now for the given database,
• Physical level:
○ Specify how student records, course information, and
department data are stored on the hard drive or in memory.
• Logical Level:
○ It would define entities like "Department," "Student," and
"Course" and their attributes (e.g., department name,
student ID, course code).
○ It also outlines the relationships between these entities, such
as "Students belong to Departments" and "Courses are taken
by Students.“
9
Database Management Systems
Data Abstraction

Now let us understand this with the help of an example of a University


database
• Let the university have the given Record types/ tables as shown
Now for the given database,
• View Level:
○ A faculty member may have access to a view showing only the
courses they are teaching and the students enrolled in those
courses.
○ A student, on the other hand, may have a view displaying only
the courses they are registered for and their respective grades.
○ These views provide a simplified and customized perspective of
the database for different users.
Similarly , how could physical and logical data independence be of an
advantage in the case of the university database?
10
Database Management Systems
Data Abstraction

Let us consider the situation where the University decides to migrate


its database from one storage system to another or optimize the
storage layout to improve performance. What would be the
consequence of this?

Without Physical data independence:


• Minor changes to the storage system or hardware upgrades could
potentially break existing queries, applications, and interfaces.
• As a result, any update to the storage system would require
extensive modifications to the application layer, leading to higher
maintenance costs and longer development cycles.

11
Database Management Systems
Data Abstraction

Let us consider the situation where the University decides to migrate


its database from one storage system to another or optimize the
storage layout to improve performance. What would be the
consequence of this?

With physical data independence:


• the database administrators can make these changes without
altering the logical schema, which means users' queries and
applications will continue to work seamlessly.
• It allows for easier maintenance, upgrades, and performance
enhancements without disrupting the application layer.

12
Database Management Systems
Data Abstraction

Let's say the University decides to reorganize its department


structure and add new attributes to the "Department" entity. Then
what would be the effect?

Without Logical data independence:


• Any alteration to the logical schema, such as adding or removing
attributes, or changing relationships between entities, would
result in changes propagating throughout the applications that
rely on the database.
• This would force developers to modify all affected application
code to accommodate the changes in the schema.
• Consequently, even minor changes to the data model could lead
to widespread disruption and downtime for the applications,
increasing the risk of introducing errors and inconsistencies.
13
Database Management Systems
Data Abstraction

Let's say the University decides to reorganize its department


structure and add new attributes to the "Department" entity. Then
what would be the effect?

With Logical data independence:


• The database designers can make these modifications without
impacting the existing application programs that access the
database.
• This flexibility allows for easier adaptability to changes in the
University's requirements and business rules without causing
application disruptions.

14
Database Management Systems
Instances and Schema

• Databases change over time as information is inserted and deleted or modified.

• The collection of information stored in the database at a particular moment is called


an instance of the database.
• The overall design of the database is called the database schema.

• Let us try to understand what that exactly means with an analogy to a program
written in any programming language
• Database Schema => variable declarations along with the associated type
definitions
• Instances => Value of the variable at a particular point in time in the program

15
Database Management Systems
Three Schema Architecture

16
Database Management Systems
Database State

● The actual data in a database may change quite frequently.


● The data in the database at a particular moment in time is called a database state or
snapshot.
● It is also called the current set of occurrences or instances in the database.
● In a given database state, each schema construct has its own current set of
instances.
● Many database states can be constructed to correspond to a particular database
schema.

21
Database Management Systems
Database State V/S Schema

● The distinction between database schema and database state is very important.
● When we define a new database, we specify its database schema only to the DBMS.
● At this point, the corresponding database state is the empty state with no data.
● We get the initial state of the database when the database is first populated or
loaded with the initial data.
● From then on, every time an update operation is applied to the database, we get
another database state. At any point in time, the database has a current state.

22
Database Management Systems
Database State V/S Schema

A simplified COMPANY relational database schema

Sample database state for the relational database schema

23
Database Management Systems
Why Database Design ?

Scenario:

Imagine you are tasked with designing a database for a new online bookstore. The
bookstore needs to manage a large catalog of books, customer information, orders,
and reviews. Customers should be able to search for books, place orders, leave
reviews, and track their order history.

Question
What factors would you need to consider while designing the database system to ensure it
meets all the needs of the online bookstore?

(Hint :
●Types of data Eg: books, customers, orders, reviews?
●How will you ensure data integrity and security?)

24
Database Management Systems
Database Design

● Database design mainly involves the design of the database schema.


● The design of a complete database application environment that meets the needs of
the enterprise being modeled requires attention to a broader set of issues.
● A high-level data model gives designers a conceptual framework to specify user data
requirements and structure the database accordingly.

The various steps involved in the design process are :

25
Database Management Systems
Database Design

Let us now try and understand what happens at each step:

1. USER REQUIREMENT SPECIFICATION


• Interact with domain experts and users to fully understand the data needs and
functional requirements of the database.
• Identify the operations or transactions that users will perform on the data.
• Document and review the user requirements to ensure clarity and accuracy.

2. CONCEPTUAL DESIGN PHASE


• Choose a high-level data model (e.g., entity-relationship model) to conceptualize
the data requirements and relationships.
• Develop a conceptual schema that provides a detailed overview of the enterprise's
data structure.
• Review the schema to ensure all data requirements are met and remove any 26
redundant features.
Database Management Systems
Database Design

3. LOGICAL DESIGN PHASE


• Map the high-level conceptual schema to the implementation data model of the
database system that will be used (e.g., relational model).
• Use normalization algorithms to generate a set of tables that properly group
attributes and reduce data redundancy.
• Confirm that the logical design meets the functional requirements specified in the
user requirements phase.

4. PHYSICAL DESIGN PHASE


• Specify the physical features of the database, including file organization and
internal storage structures.
• Consider performance optimization and choose appropriate indexing techniques
to enhance data retrieval speed.
• Review and refine the physical design to ensure it aligns with the requirements
and constraints of the database system and hardware. 27
Database Management Systems
Typical DBMS Component Modules

28
Database Management Systems
Typical DBMS Component Modules

• SDM - controls access to DBMS info that is stored on disk (dotted


lines)
• DDL Compiler - processes schema definitions specified in the DDL and
stores metadata in catalog
• Query Compiler – handles high level queries - parses and compiles a
query and calls to runtime processor for execution
• Pre Compiler – extracts DML commands from application program
• Run time db processor – executes privileged
commands/queries/canned transactions.

29
Database Management Systems
Database System Utilities

• Loading utility – Load existing data files into a database. Includes data
conversion tools from old format to new db format
• Backup utility - Back up of the database periodically.
(Incremental/Total)
• Database storage reorganization - Reorganizing database file
structures and create new access paths to improve performance.
• Performance monitoring utilities. – monitors db usage and provides
statistics to DBA

30
Database Management Systems
Database Architecture

Scenario:
Imagine you are responsible for the database system of a popular online multiplayer game. The game
has thousands of players online simultaneously, from different parts of the world. Each player needs
real-time updates for their game stats, friends list, and in-game purchases.

Question:
How would you design the database system to ensure efficient, secure, and real-time data access for all
players?

31
Database Management Systems
Database Architecture

• Centralized databases (Centralized architecture)

Characteristics:
• Single Location: All data is stored and managed on a central
server or database.
• Shared Memory: Suitable for systems with multiple CPUs
that access a common shared memory space.
• Resource Allocation: Resources like CPU, memory, and
storage are centralized, making it easier to manage and
maintain.

Use Cases:
• Small to Medium-Sized Applications: Suitable for
applications with moderate data needs and user loads.
• Legacy Systems: Often used in older systems where scaling is
not as critical or feasible.
Centralized/Shared Memory DB

32
Database Management Systems
Database Architecture

Client Server Architecture:

• Client-server architecture is a model where a client (the


user or application) interacts with a server (the
database system) to access and manage data.

• The client is the user-facing component that interacts


with the server to request data or perform operations.
(web browser, mobile apps, desktop applications)

• The server is the backend component that manages the


database, processes requests from clients, and handles
data storage and retrieval. (MySQL, Oracle, Postgres)

33
Database Management Systems
Database Architecture

Parallel databases:

• Designed to run on a cluster of multiple machines


• Enables better scalability and higher processing capabilities.
• Many cores shared memory and Shared Disk

Parallel databases are ideal for high-performance applications


requiring extensive data processing, such as large-scale data
warehousing, real-time analytics, and high-throughput transactional
systems.

Example: How do you think NASA manages and processes the vast
amounts of satellite data it collects?
Answer: NASA uses parallel databases and high-performance
computing (HPC) systems. NASA can efficiently handle and analyze
massive datasets from satellite missions, enabling quick and
insightful analysis for complex tasks like climate modeling and NASA’s HPC Cluster
environmental monitoring.
34
Database Management Systems
Application Architectures

Distributed Database:
A distributed database is a database system in which data is stored
across multiple physical locations. These locations can be spread
across various machines within a single data center or across
different geographic locations.
Advantages:
• Scalability: Can handle large amounts of data and high traffic by
distributing the load across multiple nodes.
• Fault Tolerance: Provides high availability and fault tolerance
through data replication and distribution.
• Performance: Can improve performance by distributing queries
and data access across multiple servers.

35
Database Management Systems
Two-Tier Client Server Architectures

• Client: This is the user interface where users


interact with the application. It can be a
desktop application or a web-based interface.
• Server: This is where the database is stored
and managed. The server handles data
storage, retrieval, and management tasks.
• Client programs send query and transaction
requests via ODBC or JDBC API, the server
In a two-tier database architecture, there are two processes these requests and sends the
main components: the client (user interface) and results back to the client, which then
the server (database). processes and displays the results.

36
Database Management Systems
Two-Tier Client Server Architectures

Clients
• Provide appropriate interfaces through a client software module to access and utilize the
various server resources.
• Clients may be diskless machines or PCs or Workstations with disks with only the client
software installed.
• Connected to the servers via some form of a network.
• (LAN: local area network, wireless network, etc.)
Servers
• Provides database query and transaction services to the clients
• Relational DBMS servers are often called SQL servers, query servers, or transaction servers
• Applications running on clients utilize an Application Program Interface (API) to access server
databases via standard interface such as:
• ODBC: Open Database Connectivity standard
• JDBC: for Java programming access
• Client and server must install appropriate client module and server module software for ODBC
or JDBC
37
Database Management Systems
Three-Tier Client Server Architectures

● Common for Web applications


● Intermediate Layer called Application Server or Web Server:
○ Stores the web connectivity software and the business
logic part of the application used to access the
corresponding data from the database server
○ Acts like a conduit for sending partially processed data
between the database server and the client.
● Three-tier Architecture Can Enhance Security:
○ Database server only accessible via middle tier
○ Clients cannot directly access database server

38
Database Management Systems
Questions

Scenarios for Architecture Selection:


1. Weather Forecasting System: A weather forecasting system collects data from
various sensors and satellites, processes the data to generate forecasts, and
distributes the information to users through a web interface and mobile application.
2. E-Learning Platform: An e-learning platform offers online courses to students
around the world. The platform needs a database to manage course content, track
student progress, and support interactive features such as quizzes and discussion
forums.
3. Social Media Platform: A social media platform with millions of users worldwide
needs a database system that can handle high read and write loads, store large
amounts of user-generated content, and provide fast access to data.

39
Database Management Systems
Questions

Scenarios for Architecture Selection: Solutions -


• Weather Forecasting System: Parallel database architecture.
This architecture can handle large volumes of data from sensors and satellites, process it
efficiently, and distribute forecasts quickly through web and mobile interfaces.
• E-Learning Platform: 3-tier architecture.
This architecture supports the management of course content, tracking of student
progress, and interactive features like quizzes and discussion forums, ensuring scalability
and efficient data access.
• Social Media Platform: Distributed database architecture.
This architecture supports high read and write loads and large amounts of user-
generated content, providing fast access to data for users worldwide.

40
Database Management Systems
Database Users

People who work with a database can be categorized as:


1. Database Users
2. Database Administrators.

Database Users
There are four different types of database-system users, differentiated by the way they
expect to interact with the system. Different types of user interfaces have been designed
for different types of users.
1. Naïve Users
2. Application Programmers
3. Sophisticated Users
4. Database Administrators

41
Database Management Systems
Database Users

Naïve Users:
• Naive users also known as Parametric End users, don't have any knowledge of databases
but still frequently use the database applications to get the desired results. With the help
of the interface provided by the DBMS applications.

• For Example: Have you ever tried to book a ticket for your favorite movie? It's
straightforward and easy.
• But, do you have any idea about the complex database operations happening behind the
scenes to make this process smooth and efficient?
• In database terminology, you are a naive user or a parametric end user. This means you
interact with the database through a predefined interface without needing to know
anything about how the database works.

42
Database Management Systems
Database Users

Application programmers :
• Application Programmers also known as Back-End Developers, are computer
professional users who are responsible for developing the application programs.
• Example: For an e-commerce site, application programmers connect the app to the
database, write queries to display products, manage user registrations and orders,
handle database errors, and secure user data.

Sophisticated Users:
• Sophisticated users are individuals who possess a deep understanding of database
management systems (DBMS) and are proficient in using Data Definition Language
(DDL) and Data Manipulation Language (DML) commands.
• Example: Data engineers and developers often access databases directly using SQL
queries to fetch, delete, update, or insert data, making them sophisticated users.

43
Database Management Systems
Database Administrator

One of the main reasons for using DBMSs is to have central control of both the data and the
programs that access those data.
Database Administrator is an individual or a team of users who defines the database schema
and takes charge of controlling various levels of the database within the organization.
Functions of a Database Administrator (DBA):
1. Schema definition: DBA creates the original database schema using data definition
statements (DDL).
2. Storage structure and access-method definition: DBA specifies parameters for data physical
organization and index creation.
3. Schema and physical-organization modification: DBA makes changes to reflect organization's
needs or enhance performance.
4. Granting authorization for data access: DBA regulates user access by granting different types
of authorization.
5. Routine maintenance: DBA performs periodic backups, ensures disk space availability, and
monitors database performance.
44
Database Management Systems
Q&A

Questions:

1. "Application programmers are responsible for defining storage structures and access
methods in the database.“ – True/False

2. "A __________ user interacts with a system using predefined interfaces and does
not need to write code.“

3. "The role of a __________ includes creating the original database schema and
managing user access.“

4. "What type of user would most likely use a query language to retrieve specific data
from a database?“

45
Database Management Systems
Q&A

Questions:

1. "Application programmers are responsible for defining storage structures and access
methods in the database.“ – False (Database Administrators handle this)

2. "A Naïve user interacts with a system using predefined interfaces and does not need
to write code.“

3. "The role of Database Administrator a includes creating the original database


schema and managing user access.“

4. "What type of user would most likely use a query language to retrieve specific data
from a database?“ - Sophisticated User

46
Database Management Systems
Q&A

Some more questions for revision:

1. What are the four main phases of database design? Briefly explain what happens in
each phase.
2. What is the role of the storage manager in a DBMS? List any three components it
manages.
3. What are the three main steps in query processing, and why is query optimization
important?
4. Define a transaction in DBMS. What are the key properties a transaction must
maintain?

47
Database Management Systems
Q&A

5. Define the three levels of data abstraction in a DBMS and explain with an example
from a university database.
6. What is the difference between physical and logical data independence? Give one
example for each.
7. Differentiate between database schema and database instance with an analogy from
programming.
8. What is a view schema? How does it help with security and user customization?

48
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

49
Database Management Systems
Entity-Relationship Model
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
1
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Recap – Design Phases

• As we have seen before the design phases


would involve the following steps:

○ Characterizing User Data Needs


○ Choosing a Data Model and Translating
Requirements
○ Review and Refinement of Schema
○ Functional Requirements Specification
○ Logical-Design Phase
○ Physical-Design Phase

3
Database Management Systems
E-R model

The Entity-Relationship (E-R) model is a data modeling technique designed to assist in


database design.
It enables the creation of an enterprise schema that captures the logical structure of a
database.
Purpose and Function: The E-R model aids in representing real-world entities, their
meanings, and interactions within a conceptual schema for database design.

4
Database Management Systems
E-R model

• E-R diagrams provide a graphical representation of the database's logical structure.


They use symbols to depict entities, relationships, and attributes, making complex
relationships easier to understand.
• E-R diagrams are valued for their simplicity and clarity, making them a widely used
tool in database design.
• By mapping out the entities, attributes, and relationships before implementation, E-R
Diagrams help in identifying potential issues, redundancies, or inconsistencies in the
database design. This leads to more efficient and optimized database structures,
reducing the need for costly modifications later.
• E-R Diagrams serve as a comprehensive documentation tool that records the structure
of the database.

5
Database Management Systems
Entity

• What is an Entity?
○ An entity is a distinct "thing" or "object" in the real world.
○ It's different from other objects and can be a person, course, or more.

• Properties of an Entity:
○ An entity has specific attributes or properties.
○ Some properties must hold unique values to tell entities apart.
• Example - Student Entity:
○ In a university, each student is an entity.
○ A unique “Studnt_ID" property distinguishes individuals.

6
Database Management Systems
Entity

• Unique Identification of entity:


○ Unique property values identify each entity.
○ For instance, a student’s “student_id" value like PES001 makes them distinct.
○ Courses can also be entities in a university setting.
○ A "course_id" attribute uniquely identifies each course.
• Concrete and Abstract Entities:
○ Entities can be concrete (like people or books) or abstract (like courses or
reservations).
○ Both types serve as distinct objects in the system.

7
Database Management Systems
Entity Sets

• Entity Set ( or Entity Collection ):


○ An entity set is a group of entities of the same kind sharing common attributes or
properties.
○ For instance, "instructor" represents all university instructors, and "student"
represents all university students.
E1 is an entity having Entity Type Student and the set of all students is called Entity Set.
• Extension vs. Entity Set:
○ The real collection of entities is the "extension" of an entity set.
○ Similar to the difference between a relation and a relation instance.
• Non-Disjoint Entity Sets:
○ Entity sets can overlap these sorts of entity sets are called non-disjoint sets.
○ Example: "person" entity set includes instructors, students, both, or neither.

8
Database Management Systems
Entity Sets

9
Database Management Systems
Attributes

• An entity's attributes provide distinct values for identification.


○ For instructors, the ID attribute stands out as a unique identifier.
• It prevents confusion when multiple instructors share the same name.
• How do we decide on the unique identifier for the case of Instructors?
○ Government-issued IDs were historically used, but they raise security and
privacy concerns.
○ Enterprises opt to establish their own identifiers to ensure security and
privacy.

11
Database Management Systems
Attributes

• Entities are represented by specific attributes.


• Attributes are qualities or characteristics of each entity.
• They signify the type of information stored for each entity.
• Every entity possesses its own set of values for attributes.
• Example: Instructors have unique values for attributes like ID, name, dept name, and
salary.

12
Database Management Systems
Value Set (Domain)

● For each attribute there are a set of permitted values that can be taken which is called DOMAIN
or VALUE SET of that attribute.

○ If the range of ages allowed for employees is between 16 and 70, we can specify the value
set of the Age attribute of EMPLOYEE to be the set of integer numbers between 16 and 70.
○ We can specify the value set for the Name attribute to be the set of strings of alphabetic
characters separated by blank characters, and so on.

● Mathematically, an attribute A of entity set E whose value set is V can be defined as a function
from E to the power set6 P(V) of V:
A : E → P(V)

13
Database Management Systems
Types of Attributes

• An attribute used in E-R model, can be characterized as:


○ Simple / Composite attribute
○ Single-valued / Multivalued attributes
○ Derived attributes

14
Database Management Systems
Simple and Composite Attribute

• Formally saying the attributes that cannot be further subdivided into components is a
simple attribute
○ Ex: The ID of the student or Instructor, Roll number of the student, phone
number, email id, etc.

• On the contrary those attributes that can be split into components are called a
composite attribute
○ Ex: The address can be further split into house number, street number, city,
state, country, and pin code,
○ The name can also be split into first name middle name, and last name.

15
Database Management Systems
Composite Attributes

• Let us consider the university database. Like any other organization, the management
would keep a record of all the addresses of the student, instructors and the staff

• Now address would contain the entire street, city, state, and postal_code information.
• Let us say the address of the student is :
○ "123 Main Street, Yeshwanthpur, Bangalore, 560063"

• Now this is how all the information is stored in one attribute.


○ Let us say we would want to group students through the postal code for allotting
the busses. How do we do it?
○ Currently the postal code is not an attribute alone or stored specifically as an
attribute so we cannot get the postal code alone as it is a part of the address

• Thus just having the address stored as one attribute is not always appropriate,
especially when we would want to retrieve students on a part of the address.

16
Database Management Systems
Composite Attributes

• For solving this issue, we try to store these types of attributes by dividing the attribute
into subparts.

• These attributes which can be divided into subparts are called ‘Composite attributes’.

• Using composite attributes in a design schema is a good choice if a user will wish to
refer to an entire attribute on some occasions, and to only a component of the
attribute on other occasions.

17
Database Management Systems
Representation of the Composite attribute

• Let us consider the examples discussed above, and how are they represented it in the
E-R diagram

○ Name could be subdivided into First Name, Middle Name, Last Name

○ The address can be further split into street, city, state, country, and pin code

18
Database Management Systems
Representation of the Composite attribute

• Composite attributes may appear as a hierarchy.


• In the composite attribute address, its component attribute street can be further
divided into street number, street name, and apartment number.

19
Database Management Systems
Multivalued Attributes

• Suppose the university has decided to add the phone number to the instructor
relation.
• An instructor may have Zero, One, or Several phone numbers, and different instructors
may have different number of phone numbers.
• Same is the case with the instructor’s mail_id.
• Till now all the attributes we have studied were having only a single value and not
multiple values. So how do we deal with these type of situation ?

• These types of attributes which can have multiple values are called as multivalued
Attributes

• Examples for the same are :


○ Instructor Phone numbers
○ Instructors mail_id

20
Database Management Systems
Single-valued and Multivalued Attributes

• The attribute which takes up only a single value for each entity instance is a single-
valued attribute.
○ Example: The name of the instructor, age of the student, etc.

• The attribute which takes up more than a single value for each entity instance is a
multi-valued attribute.
○ Example: Phone number of the instructor, Mail id, etc.

21
Database Management Systems
Representation of Multivalued Attributes

• The multivalued attribute is represented in the E-R diagram as shown below:


• The multivalued attribute is represented by a double ellipse.

• Considering the instructor’s Phone_no as a multivalued attribute and ID as an single-


valued attribute:

22
Database Management Systems
Complex Attributes

• Those attributes, which can be formed by the nesting of composite and multi-valued
attributes, are called “Complex Attributes“.
• These attributes are rarely used in DBMS.

• Let us take an example of the same.


○ Let us consider that the student relation contains an attribute called PreviousDegrees
that stores all the previous education qualifications of the students
○ PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by
{PreviousDegrees (College, Year, Degree, Field)}
○ Multiple PreviousDegrees values can exist. Each has four subcomponent attributes:
College, Year, Degree, Field
○ Therefore it is a complex attribute

23
Database Management Systems
Question

Given ,an attribute Address_EmPhone consisting of email, phone and Address.


Can you identify which are the different attribute types that make up this
attribute.

24
Database Management Systems
Derived Attributes

• Consider the following schema for the Instructor relation


○ Instructor (ID, Name, Department, Date_of_Birth, Age)

• Can you identify any problem in the above schema?


• If you have noticed, we know that the age can be calculated from the Date of birth, i.e.
we can take the system’s current date and the Date of birth of the instructor and we can
find the age.

• So there is no need to store this information in the actual database and can be
calculated.

25
Database Management Systems
Stored and Derived Attributes

• So as shown above, the value of the age attribute can be derived from the date of
Birth.

• Therefore the Age attribute is called a Derived Attribute


• And the Date_of_Birth attribute is called as a stored attribute.

• The stored attribute are those attributes that are actually stored in the database.
○ Example: Date_of_Birth is a stored attribute

• An attribute that can be derived from other attributes is derived attributes.


○ Example: the Age attribute of the instructor is a derived attribute

26
Database Management Systems
Representation of Derived Attribute

• Considering the above example of the instructor relation, Date_of_Birth is a stored


attribute and Age is an derived attribute
• In the below E-R Diagram, Age (derived attribute) is represented by a dashed oval.
• The E-R representation would be:

27
Database Management Systems
Null Values

• Let us consider the diagram representation shown:

• What if the student doesn’t have a middle name?

• In such situations the value of the middle name would be set to NULL

28
Database Management Systems
Null Values

• An attribute takes a null value when an entity does not have a value for it.
• So what does it mean if an attribute is NULL?

• The Null value may indicate “not applicable”—that is, the value does not exist for the entity.
• Null can also designate that an attribute value is unknown. An unknown value may be
either:
○ missing (the value does exist, but we do not have that information)
○ not known (we do not know whether or not the value actually exists).

• A null value for the apartment number attribute could mean:


○ Address does not include an apartment number (not applicable),
○ That an apartment number exists but we do not know what it is (missing),
○ We do not know whether or not an apartment number is part of the instructor’s
address (unknown).

29
Database Management Systems
Descriptive Attributes

• Let us now consider the scenario that every student enrolls for certain courses.
• Now as we know along with what each student enrolls for which course, the university must also
keep track of the grades that the student would score for that particular course.
• Now where do we place this grade attribute? In the Student entity or the Courses entity?
• Actually if we observe this attribute is associated with the relation ‘enroll’ i.e. only if the student
enrolls for the particular course, he can get a grade. Such type of attributes are called Descriptive
attributes
• So it wouldn’t be correct to place it in any of the entities
• Then where do we place them?

30
Database Management Systems
Descriptive Attributes

• A relationship may also have attributes called descriptive attributes.


• An attribute of a relationship set is represented in an E-R diagram by an oval
connected to the relationship’s diamond
• The below diagram shows the discussed example in the form of an ER diagram

31
Database Management Systems
Redundant attributes

• In the university organization, consider the following entity sets:


• Student
• Instructor
• Attributes are selected to capture values in the database;
• Instructors: (ID, name, dept_name, salary)
• Students: (ID, name, dept_name, tot cred)
• Additional attributes like phone number, office number are optional based on the
designer's understanding.

32
Database Management Systems
Redundant attributes

• Consider the entity department in the university database:


• Department (dept_name, building, budget)
• Relationship sets connect entities together, like instructor and department.
• Redundancy can occur due to relationship sets, leading to attribute removal for
clarity.
• The dept_name attribute, redundant in both instructor and department, is removed.
• Instead we represent this using a relationship set (inst_dept) between instructor and
department.

33
Database Management Systems
Redundant attributes

• Redundant dept name attribute in instructor is removed since it's a primary key in the department table.
• In a relational schema, dept_name may be reintroduced in the instructor's relation if each instructor has
one department.
• If instructors have multiple departments, a separate relation (inst_dept) records these relationships.
• Treating instructor-department connection as a relationship prevents assumptions of single department
association.

34
Database Management Systems
Redundant attributes

• Similarly, student entity set is related to the department entity set through a
relationship set (student dept).
• No need for dept name attribute in student entity set.

35
Database Management Systems
Redundant attributes

• Consider Course offerings (sections) and associated time slots.


• Each time slot has a unique time slot id and weekly meetings (day, start time, end time). Weekly meeting
times are modeled as a multivalued composite attribute.
• Entity sets: section and time slot.
• Section: (course id, sec id, semester, year, building, room number, time slot id)
• Time slot: (time slot id, {(day, start time, end time)}).
• Entities related through relationship set sec time slot. Multivalued composite
attribute
• Issue of redundancy with time slot id in both entity sets.
• Time slot id is the primary key for the time slot entity.
• Therefore, Remove the time_slot_id attribute from the section entity.
• A good entity relationship model would not contain any redundant attributes
• Redundancy removal enhances data clarity and avoids duplication
36
Database Management Systems
Relationship and Relationship sets

• What is a Relationship:
○ A relationship signifies connections between entities.
○ Example: "advisor" links instructor Katz to student Shankar.
• Relationship Set:
○ A relationship set contains relationships of the same type.
○ Example: "advisor" relationship set links students and their advisors.

• Entity Sets and Relationships:


○ Consider "instructor" and "student" entity sets.
○ "advisor" relationship set signifies student-advisor connections.

37
Database Management Systems
Relationship and Relationship sets

• Relationship Instance:
○ A relationship instance in an E-R schema represents an association between the named
entities in the real-world enterprise that is being modeled.
○ As an illustration, the individual instructor entity Katz, who has instructor ID 45565, and the
student entity Shankar, who has student ID 12345, participate in a relationship instance of
advisor.
○ This relationship instance represents that in the university, the instructor Katz is advising
student Shankar

38
Database Management Systems
Relationship and Relationship sets

• Representation in E-R Diagram:


○ In an E-R diagram, relationships are depicted by diamonds.
○ Lines connect the diamond to relevant entity sets (rectangles).
• For example let us try and represent the advisor relation

39
Database Management Systems
Relationship sets

• A relationship type R among n entity types E1, E2, . . . , En defines a set of associations—or a
relationship set—among entities from these entity types.
• Similar to the case of entity types and entity sets, a relationship type and its corresponding
relationship set are customarily referred to by the same name, R.
• Mathematically, the relationship set R is a set of relationship instances ri, where each ri associates n
individual entities (e1, e2, . . . , en), and each entity ej in ri is a member of entity set Ej , 1 ≤ j ≤ n.
• Hence, a relationship set is a mathematical relation on E1, E2, . . . , En; alternatively, it can be
defined as a subset of the Cartesian product of the entity sets E1 × E2 × . . . × En.
• Each of the entity types E1, E2, . . . , En is said to participate in the relationship type R; similarly, each
of the individual entities e1, e2, . . . , en is said to participate in the relationship instance ri = (e1, e2, .
. . , en).

40
Database Management Systems
Recursive Relationships

• The function that an entity plays in a relationship is called that entity’s role.
• Since entity sets participating in a relationship set are generally distinct, roles are implicit and are not
usually specified.
• However, they are useful when the meaning of a relationship needs clarification. Such is the case
when the entity sets of a relationship set are not distinct; that is, the same entity set participates in a
relationship set more than once, in different roles.
• In this type of relationship set, sometimes called a recursive relationship set, explicit role names are
necessary to specify how an entity participates in a relationship instance.

41
Database Management Systems
Recursive Relationships

• Let us try and understand it with the help of an example


• consider the entity-set course that records information about all the courses offered in the
university. Now some courses that are offered by the university have some prerequisite
courses which are needed to be completed
• To depict the situation where one course (C2) is a prerequisite for another course (C1) we
have a relationship set prereq that is modeled by ordered pairs of course entities.

• So how do we indicate this in the ER diagram?

42
Database Management Systems
Recursive Relationships

• We indicate roles in E-R diagrams by labeling the lines that connect diamonds to
rectangles.
• The diagram below shows the role indicators course id and prereq id between the
course entity set and the prereq relationship set.

43
Database Management Systems
Recursive Relationship - Employee - Supervisor Example

Can you think of another example of a recursive-relationship?

44
Database Management Systems
Relationship Set and it’s degree

• The relationship sets advisor is an example of a binary relationship set—that is, one that
involves two entity sets.
• Most of the relationship sets in a database system are binary.
• Occasionally, however, relationship sets could even involve more than two entity sets. The
number of entity sets that participate in a relationship set is the degree of the relationship set.
○ A binary relationship set is of degree 2;
○ A ternary relationship set is of degree 3.

45
Database Management Systems
Relationship Set Example

46
Database Management Systems
Ternary Relationships

• Let us consider an example, suppose that we have an entity set project that represents all the
research projects carried out in the university.
• Consider the entity sets instructor, student, and project.
• Each project can have multiple associated students and multiple associated instructors.
Furthermore, each student working on a project must have an associated instructor who guides
the student on the project.
• Suppose we were to represent the information of which instructor is guiding which student for
which project
• How do we represent this or model this in an ER diagram?

47
Database Management Systems
Ternary Relationship Example
• To represent the above situation, we would have to relate the three entity sets through a ternary
relationship set proj_guide, This ternary relation relates entity sets instructor, student, and project.
• An instance of projguide indicates that a particular student is guided by a particular instructor on a
particular project.
• Note that a student could have different instructors as guides for different projects, which cannot be
captured by a binary relationship between students and instructors.

48
Database Management Systems
Ternary Relationship Example

49
Database Management Systems
Redundant relationship - Question

Assume that an employee may work in up to two departments or may not be assigned to
any department. Assume that each department must have one and may have up to
three phone numbers. Under what conditions would the relationship HAS_PHONE be
redundant in this example?

50
Database Management Systems
Redundant relationship - Question

Assume that an employee may work in up to two departments or may not be assigned to any department.
Assume that each department must have one and may have up to three phone numbers. Under what
conditions would the relationship HAS_PHONE be redundant in this example?
● The HAS_PHONE relationship is redundant if phone numbers are exclusively associated with
departments, not individual employees.
● If employees only use department phone numbers, their association with these numbers can be
inferred through the WORKS_IN relationship, eliminating the need for a separate HAS_PHONE
relationship.

51
Database Management Systems
Example - 1

• Let us consider the Instructor relation of the university database.


• Every instructor would have a unique ID, along with this the instructor would also have the name, address,
phone number, Date of Birth, age, and department details to be there in the database
• Given instructor could be a part of only one department. However, each instructor could have multiple
phone numbers.
• The address is subdivided into street, city, state, and zip. The street would be again divided into street_no,
street_name, and apt_number.
• The name attribute would be subdivided into first name, middle name, and last name.

• Could you draw an E-R diagram for containing the entity Instructor along with all the attributes?

52
Database Management Systems
E-R Diagram

53
Database Management Systems
Example - 2

You are designing a database for a Smart City Infra-Management System. One of
the key entities is the Environmental Sensor, which is deployed across the city to
collect various environmental data like air quality, noise levels, etc.

Model the Environmental Sensor entity in an ER diagram with the following details:

● Each sensor must be uniquely IDentified.


● Each sensor has a Type (e.g., temperature, humidity, pollution).
● Each sensor is installed at a specific Location.
● It collects multiple Data readings,containing:
○ Date-Time of reading
○ Value of the reading
● Each sensor also has a Last Calibration Date.

54
Database Management Systems
Example - 2: ER Diagram

Solution:

55
Database Management Systems
Q&A

1) Which of the following is an example of a recursive relationship?


A) Course prerequisites
B) Instructor-student relationship
C) Student-course enrollment
D) Student-advisor relationship

2. What does a binary relationship set involve?


A) One entity set
B) Two entity sets
C) Three entity sets
D) More than three entity sets

56
Database Management Systems
Q&A

3) In an E-R diagram, how is a relationship between two entity sets depicted?


A) By a diamond
B) By a rectangle
C) By a line
D) By an oval

4) Which of the following is true about the relationship set ‘prereq’ in a university database?
A) It connects students to courses
B) It connects instructors to students
C) It connects courses to other courses
D) It connects projects to instructors

57
Database Management Systems
Q&A

5) What does a ternary relationship set involve?


A) Two entity sets
B) Three entity sets
C) One entity set
D) No entity sets

6) In an E-R diagram, how is a primary key represented?


A) Underlined attribute in the oval
B) Underlined attribute in the rectangle
C) Underlined attribute in the diamond
D) Underlined relationship set

58
Database Management Systems
Q&A

7) What does an extension of an entity set refer to?


A) The actual collection of entities
B) The unique identifier for each entity
C) The general set of attributes
D) The relationships between entity sets

8) What does an E-R diagram’s diamond represent?


A) Entity Set
B) Attribute
C) Relationship Set
D) Primary Key

59
Database Management Systems
Q&A

ANSWERS:
1. A
2. B
3. A
4. C
5. B
6. B
7. A
8. C

60
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

61
Database Management Systems
Entity-Relationship Model
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Cardinality Ratio

• Mapping cardinalities, or cardinality ratios, express the number of entities to which


another entity can be associated via a relationship set.
• Mapping cardinalities are most useful in describing binary relationship sets, although
they can contribute to the description of relationship sets that involve more than two
entity sets.

• A binary relationship set R between entity sets A and B, the possible mapping
cardinality are:
○ One-to-one
○ One-to-many
○ Many-to-one
○ Many-to-many

3
Database Management Systems
Cardinality Ratio

• One-to-one
○ An entity in A is associated with at most one entity in
B, and an entity in B is associated with at most one
entity in A.
• One-to-many
○ An entity in A is associated with any number (zero or
more) of entities in B. An entity in B, however, can be
associated with at most one entity in A.
• Many-to-one
○ An entity in A is associated with at most one entity in
B. An entity in B, however, can be associated with any
number (zero or more) of entities in A.
• Many-to-many
○ An entity in A is associated with any number (zero or
more) of entities in B, and an entity in B is associated
with any number (zero or more) of entities in A.

4
Database Management Systems
Cardinality Ratio

• Let us now consider the advisor relationship set.

• A student can be advised by several instructors - this relationship set can be classified
as many-to-many.

• In contrast, if a particular university imposes a constraint that a student can be advised


by only one instructor, an instructor can advise several students, then the relationship
set from instructor to student must be one-to-many.

one-to-many many-to-many

5
Database Management Systems
Cardinality Ratio

In the E-R diagram notation, we indicate cardinality constraints on a relationship by


drawing:
• An undirected line (—)
between the relationship set and the entity set in question.

Let us try and understand how to use the above mapping technique with the help of
university database.

6
Database Management Systems
Cardinality Ratio

One-to-one (1:1)

• We draw a line from the relationship set to both entity sets.


• For example, the lines to instructor and student indicate that an instructor may
advise at most one student, and a student may have at most one advisor.

7
Database Management Systems
Cardinality Ratio

One-to-one (1:1)

• An example of a 1:1 binary relationship is


MANAGES as shown in the figure,which relates a
department entity to the employee who manages
that department.

• This represents that at any point in time an


employee can manage at most one department
and a department can have at most one manager.

8
Database Management Systems
Cardinality Ratio

One-to-many (1:N)
• This indicates that an instructor may advise many students, but a student may have
at most one advisor.

9
Database Management Systems
Cardinality Ratio

One-to-many (1:N)
• For example, In the WORKS_FOR binary
relationship type, DEPARTMENT:EMPLOYEE
is of cardinality ratio 1:N.
• Meaning that each department can be
related to any number of employees (N) but
an employee can be related to (work for) at
most one department.
• This means that for this particular
relationship type WORKS_FOR, a particular
department entity can be related to any
number of employees.

10
Database Management Systems
Cardinality Ratio

Many-to-one (N:1)

• The converse of the before type


• This indicates that an instructor may advise at most one student, but a student may
have many advisors.

11
Database Management Systems
Cardinality Ratio

Many-to-many (M:N)

• We draw an undirected line from the relationship set to both entity sets.
• Thus, there are undirected lines from the relationship set advisor to both entity sets
instructor and student.
• This indicates that an instructor may advise many students and a student may have
many advisors.

12
Database Management Systems
Cardinality Ratio

Many-to-many (M:N)

• The relationship type WORKS_ON is of cardinality ratio


M:N
• The rule is that an employee can work on several
projects and a project can have several employees which
is represented by the diagram on the right.

13
Database Management Systems
Participation Constraints and Existence Dependencies

• There are two types of participation :


○ Total participation
○ Partial participation

• The participation of an entity set E in a relationship set R is said to be total if every entity in E
must participate in at least one relationship in R.
○ Considering the university database, a university may require every student to have at least
one advisor Therefore, the participation of student in the relationship set advisor is total.

• If it is possible that some entities in E do not participate in relationships in R, the participation of


entity set E in relationship R is said to be partial.
○ An instructor need not advise any students therefore, The participation of instructor in the
advisor relationship set is therefore partial

14
Database Management Systems
Participation Constraints and Existence Dependencies

• We indicate the total participation of an entity in a relationship set using double lines.
• The example of the advisor relationship set described in the previous slide would look as shown
• where the double line indicates that a student must have an advisor.

15
Database Management Systems
Cardinality ratio

• Let us consider the entities:


○ Student
➢ Attributes: Student_ID,
Name, Major
○ Course
➢ Attributes: Course_ID,
Course_Name, Department

• The institution has made a rule that


given every student has to be
enrolled for a minimum of 2 courses
and a maximum of 5 courses

• How would we represent this in the


ER diagram?

16
Database Management Systems
Min-Max notation

• E-R diagrams also provide a way to indicate more complex constraints on the number of times each
entity participates in relationships in a relationship set.
• A line may have an associated minimum and maximum cardinality, shown in the form (l,h)
○ l is the minimum cardinality
○ h is the maximum cardinality.

• A minimum value = 1 total participation of the entity set in the relationship set;
• A maximum value = 1  that the entity participates in at most one relationship,
• A maximum value = N indicates no limit.

17
Database Management Systems
Min-Max notation and Chen notation

18
Database Management Systems
Example of Chen Notation

19
Database Management Systems
(Min,Max) Notation

20
Database Management Systems
Primary Key In Entity Sets

• Individual entities are conceptually distinct.


• From a database perspective, differences among entities are expressed through
attributes.
• Attribute values must uniquely identify each entity.
• No two entities in an entity set can share the same values for all attributes.
• Key for an entity is a set of attributes that distinguish entities.
• Concepts of superkey, candidate key, and primary key apply to entity sets.

21
Database Management Systems
Primary Key In Relationship Sets

Keys also help uniquely identify relationships. Thus differentiates relationships from each
other.
• So we will now look at how can we identify/distinguish relationships
• Let R be a relationship set involving entity sets E1, E2, …, En.
• Let primary-key(Ei) denote the set of attributes that forms the primary key for entity
set Ei .
• Assume for now that the attribute names of all primary keys are unique.
• The composition of the primary key for a relationship set depends on the set of
attributes associated with the relationship set R.
○ Case 1: Relationship R has no attributes
○ Case 2: If relation R has {a1, a2, …, am} attributes associated with it

22
Database Management Systems
Primary Key In Relationship Sets

Case 1:
• If the relationship set R has no attributes associated with it,
then the set of attributes:
primary-key(E1) ∪ primary-key(E2) ∪ ⋯ ∪ primary-key(En)
describes an individual relationship in set R.
• For example consider the relationship of advisor between Instructor and Student as
shown
• The set of attributes that would describe the relationship would be
○ {primary-key(Instructor) ∪ primary-key(Student)}
○ {Instructor_ID, Student_ID}

23
Database Management Systems
Primary Key In Relationship Sets

Case 2:
• If the relationship set R has attributes {a1, a2, …, am} associated with it,
then the set of attributes
primary-key(E1) ∪ primary-key(E2) ∪ ⋯ ∪ primary-key(En) ∪ {a1, a2,
…, am}
describes an individual relationship in set R.
• For example let us consider the student enrolls for a course and for each course the
student would have an associated grade
• The set of attributes that would describe the relationship would be
○ {primary-key(Student) ∪ primary-key(Course) ∪ {Grade}}
○ {Student_ID, Course_ID, Grade}

24
Database Management Systems
Primary Key In Relationship Sets

• If the attribute names of primary keys are not unique across entity sets, the attributes
are renamed to distinguish them;
○ Then the name of the entity set combined with the name of the attribute would
form a unique name.

• Considering the below diagram, the attribute name of primary key of both entities are
the same (ID)
• So in this case the set of attributes that would describe the relationship would be
○ {(Entity name + primary-key(Instructor)) ∪ (Entity name + primary-key(Student))}
○ {Instructor_ID, Student_ID}

25
Database Management Systems
Primary Key In Relationship Sets

• If an entity set participates more than once in a relationship set (as in the prereq
relationship in the university database), the role name is used instead of the name of
the entity set, to form a unique attribute name.

• For example, in the prereq relationship, the Course entity participates in the relation
more than once.
• Therefore we use the role name as shown below
• {Course_ID, Prereq_ID}

26
Database Management Systems
Primary Key In Relationship Sets

• Recall that a relationship set is a set of relationship instances, and each instance is
uniquely identified by the entities that participate in it.

• Thus, in both of the cases, {Case 1 or Case 2}


the set of attributes
primary-key(E1) ∪ primary-key(E2) ∪ ⋯ ∪ primary-key(En)
forms a superkey for the relationship set.

27
Database Management Systems
Primary Key In Relationship Sets

• The choice of the primary key for a binary relationship set depends on the mapping
cardinality of the relationship set.

• As we have seen there are 4 different types:


○ One-to-one
○ One-to-many
○ Many-to-one
○ Many-to-many

• Let us find out how do we choose the primary key for these binary relationships

28
Database Management Systems
Primary Key In Relationship Sets

One-to-One
• For this type of relation, the primary key of either one of the participating entity sets
forms a minimal superkey, and either one can be chosen as the primary key of the
relationship set.
• Let us consider the situation:
• If an instructor can advise only one student, and each student can be advised by only
one instructor, that is, if the advisor relationship is one-to-one
○ then the primary key of either student or instructor can be chosen as the primary
key for the advisor.

1 1

29
Database Management Systems
Primary Key In Relationship Sets

One-to-One
● Let us consider that at any point in time , an employee can manage at most one
department and a department can have at most one manager.
● Hence the primary key of either employee or department can be chosen as the
primary key for the manages relationship.

1 1

30
Database Management Systems
Primary Key In Relationship Sets

Many-to-Many
• For these relationships, the preceding union of the primary keys is a minimal superkey
and is chosen as the primary key.
• As an example we can consider the entity sets instructor and student, and the
relationship set advisor. the relationship set is many-to-many.
• The primary key for the advisor relation :
○ {primary-key(Instructor) ∪ primary-key(Student)}
○ {Instructor_ID, Student_ID}

M N

31
Database Management Systems
Primary Key In Relationship Sets

Many-to-Many
• For these relationships, the preceding union of the primary keys is a minimal superkey
and is chosen as the primary key.
• As another example we can consider the entity sets student and courses, and the
relationship set enrolls. the relationship set is many-to-many.
• The primary key for the advisor relation :
○ {primary-key(Student) ∪ primary-key(Course)}
○ {Student_ID, Course_ID}
○ The advisory relation : {Student_ID, Course_ID, Grade}
○ Note that grade is not a part of the primary key

M N
32
Database Management Systems
Primary Key In Relationship Sets

Many-to-Many
• For these relationships, the preceding union of the primary keys is a minimal
superkey and is chosen as the primary key.
• As an example we can consider the entity sets Employee and Project, and the
relationship works_on.
• The primary key for the works_on relation :
○ {primary-key(Employee) ∪ primary-key(Project)}

33
Database Management Systems
Primary Key In Relationship Sets

One-to-Many or Many-to-One
• the primary key of the “many” side is a minimal superkey and is used as the primary
key.
• For example, if the advisor relationship is many-to-one from student to instructor, that
is, each student can have at most one advisor
○ then the primary key of advisor is simply the primary key of student.

1 N

34
Database Management Systems
Primary Key In Relationship Sets

One-to-Many or Many-to-One
• the primary key of the “many” side is a minimal superkey and is used as the primary
key.
• On the contrary, if an instructor can advise only one student, that is, if the advisor
relationship is many-to-one from instructor to student
○ then the primary key of advisor is simply the primary key of instructor.

N 1

35
Database Management Systems
Primary Key In Relationship Sets

One-to-Many or Many-to-One
• the primary key of the “many” side is a minimal superkey and is used as the primary
key.
• Let us consider the entity sets Employee and Department and the relationship
works_for.
• In the figure, the employees e1, e3, and e6 work for department d1;
the employees e2 and e4 work for department d2; and the
employees e5 and e7 work for department d3.
○ then the primary key of works_for is simply the
primary key of employee.

36
Database Management Systems
Weak Entity Sets

• A weak entity set is one whose existence is dependent on another entity set, called it’s
identifying entity set;
• Instead of associating a primary key with a weak entity, we use the primary key of the
identifying entity, along with extra attributes, called discriminator attributes to
uniquely identify a weak entity.
• An entity set that is not a weak entity set is termed a strong entity set.
• Every weak entity must be associated with an identifying entity; that is, the weak
entity set is said to be existence dependent on the identifying entity set.
• The identifying entity set is said to own the weak entity set that it identifies. The
relationship associating the weak entity set with the identifying entity set is called the
identifying relationship.

37
Database Management Systems
Weak Entity Sets

• In the university database, let us consider the following schemas for the entities
course and section:
• Course (Course_ID, title, dept_name, credits)
○ Primary key {Course_ID}
• Section (Course_ID, sec_ID, semester, year, building, room_number, time_slot_id)
○ Primary key {Course_ID, sec_ID, semester, year}

• Now let us consider that every course should have a section where the course would
be taking place.
• How do we relate both these entities for showing the above relationship?

38
Database Management Systems
Weak Entity Sets

• For the above question, we would connect sections to courses using a relationship
called "sec_course."
• But we should note that the "sec course" information is kind of repetitive because
each section already has a "course id" telling us which course it belongs to.

• To solve this problem there are 2 possibilities:


○ Possibility 1: Remove the "sec_course" relationship
➢ Then the connection between sections and courses isn't as clear. That is it
becomes implicit in an attribute, which is not desirable
○ Possibility 2: Remove the "course_id" from the section and keep only the other
details like section number, year, and semester.
➢ If we do this, sections might share the same details (section number, year,
and semester) for different courses, making them hard to tell apart.

• To deal with this problem, we treat the relationship sec_course as a special


relationship that provides extra information, in this case, the course id, required to
identify section entities uniquely
39
Database Management Systems
Weak Entity Sets

• The identifying relationship is many-to-one from the weak entity set to the identifying
entity set, and the participation of the weak entity set in the relationship is total.

• The identifying relationship set should not have any descriptive attributes, since any
such attributes can instead be associated with the weak entity set.
• The ER representation of the weak entity is as shown

40
Database Management Systems
Primary Key for Weak Entity Sets

• For the situation considered for sec_course


○ Weak entity: Section
○ Identifying entity: Course,
○ Identifying relationship: Sec_course.
○ The primary key of the section is formed by the primary key of the identifying
entity set (that is, course), plus the discriminator of the weak entity set (that is,
section).
○ Primary key is {course id, sec id, year, semester}.

Primary-key(Identifying entity: Discriminator(weak entity:


Course) Section)

41
Database Management Systems
Weak Entity Sets

• Note that the ER diagram for the same would be as follows:


○ a weak entity set is depicted via a double rectangle
○ discriminator is underlined with a dashed line.
○ The relationship set connecting the weak entity set to the identifying strong entity
set is depicted by a double diamond.

42
Database Management Systems
Weak Entity Sets

• Note that we could have chosen to make sec_id globally unique across all courses
offered in the university, in which case the section entity set would have had a primary
key.
• However, conceptually, a section is still dependent on a course for its existence, which
is made explicit by making it a weak entity set.

• In general, a weak entity set must have total participation in its identifying
relationship set, and the relationship is many-to-one toward the identifying entity
set.

43
Database Management Systems
Weak Entity Sets

• A weak entity set can participate in relationships other than the identifying
relationship.
○ For instance, the section entity could participate in a relationship with the time
slot entity set, identifying the time when a particular class section meets.

• It is also possible to have a weak entity set with more than one identifying entity set. A
particular weak entity would then be identified by a combination of entities, one from
each identifying entity set.
• The primary key of the weak entity set would consist of the union of the primary keys
of the identifying entity sets, plus the discriminator of the weak entity set.

44
Database Management Systems
Weak Entity Sets in a Ternary Relationship

45
Database Management Systems
Weak Entity Sets in a Ternary Relationship

SUPPLY represented as a weak entity type


46
Database Management Systems
Example

In the previous slide set, you have already modeled the ENVIRONMENTAL SENSOR entity.

Now, you are provided with additional data related to urban infrastructure components. Each
Infrastructure entity has the following attributes: Infrastructure_ID,Type,
Location,Installed_Date and Operational_Status

Based on this:

1. Draw an ER diagram that shows the relationship between Environmental Sensor and
Infrastructure.
○ Specify the cardinality and participation (total/partial) on both sides.
○ Consider that each sensor is installed on exactly one infrastructure element,
but each infrastructure may host zero or more sensors.
2. Identify and model any recursive relationship (self-relationship) that may exist within
the Infrastructure entity.(Think of how one road may lead to another road).

47
Database Management Systems
Example

48
Database Management Systems
Example

In the same smart urban system, residents often report issues they observe in public infrastructure—such as broken
streetlights, potholes, or malfunctioning traffic signals—via a digital platform. Each report contains details about the
affected infrastructure, the citizen who raised it, and specifics like the description, date, and status of the issue.

Note that a single citizen may report multiple issues across different infrastructures, and an infrastructure element
may receive requests from various citizens. However, a maintenance request cannot exist independently—it must
always be associated with both a citizen and an infrastructure element.

Task:

● Using this context, extend your existing ER model to include this maintenance request process.
● Be precise in capturing how requests are tracked and related to both citizens and infrastructure.
● Reflect on how the uniqueness of each request might be determined.

49
Database Management Systems
Example

In this model, the Maintenance Request entity is


correctly represented as a weak entity, as it depends
on both the Citizen and the Infrastructure entities for
its identification and existence.

The attributes associated with the entities are


intentionally kept generic to provide flexibility and can
be further refined or expanded based on the specific
requirements of the implementation context.

With this foundation, you are now well-prepared to


design and interpret Entity-Relationship (ER)
models for real-world systems, ensuring clarity,
scalability, and logical consistency in your data
modeling practices.

50
Database Management Systems
Q&A

1)Which of the following statements is true about weak entity sets?

A. They can exist without being related to a strong entity


B. They always have a multivalued attribute
C. They have a total participation in the identifying relationship
D. They have their own primary key

2)In an ER diagram, a weak entity set is represented by:

A. A rectangle with a double border


B. A diamond with a dashed border
C. An ellipse with a dashed border
D. A rectangle with a dashed border

51
Database Management Systems
Q&A

3) In an ER diagram, the identifying relationship of a weak entity set is represented


by:

A. A dashed line diamond


B. A double diamond
C. A triangle
D. A rectangle with a double border

4) Which of the following must a weak entity set have?

A. A discriminator
B. A multivalued attribute
C. A recursive relationship
D. A derived attribute

52
Database Management Systems
Q&A

5)Role names are used instead of entity set names in primary key formation when:
A. The relationship has attributes associated with it
B. An entity set participates more than once in the same relationship set
C. The relationship is of many-to-many cardinality
D. The primary key attributes have identical names

6)In a 1:1 binary relationship set, which statement about primary key selection is
INCORRECT?
A. The primary key of either participating entity set can be chosen
B. Both entity primary keys together form a minimal superkey
C. Either entity's primary key alone forms a minimal superkey
D. The choice between entity primary keys is arbitrary from a uniqueness perspective

53
Database Management Systems
Primary Key In Relationship Sets

7)If a relationship set R has attributes {a1, a2, ..., am}, these attributes:
A. Must be included in the primary key of R
B. Are never included in the primary key of R
C. Are included in the super key but not necessarily the primary key of R
D. Replace the entity primary keys in forming the relationship primary key

8)Which of the following best describes why a weak entity set cannot exist
independently?

A. It lacks a relationship with any other entity


B. It depends on the foreign key of another entity set
C. It does not have a primary key of its own
D. It always participates in recursive relationships
54
Database Management Systems
Q&A

9) In the ER model, a double line connecting an entity to a relationship signifies:

A. The relationship is one-to-one


B. Partial participation
C. Total participation of the entity in that relationship
D. Recursive relationship

10)A university policy mandates every student must enroll in at least one course, but
instructors may or may not teach any course. What participation constraints apply?

A. Total for both Student and Instructor


B. Partial for both
C. Total for Student, Partial for Instructor
D. Partial for Student, Total for Instructor 55
Database Management Systems
Q&A

11)In min-max notation, if the line from Student to Enrolls has (2, 5), what does it
represent?

A. A student can enroll in at most 2 and at least 5 courses


B. Each student must enroll in exactly 5 courses
C. A student can enroll in minimum 2 and maximum 5 courses
D. The enrollment is optional

56
Database Management Systems
Q&A

Answers:
1. C
2. A
3. B
4. A
5. B
6. B
7. C
8. C
9. C
10. C
11. C

57
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

58
Database Management Systems
Converting E-R Diagram to Relational Schema

Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Reducing ER diagrams to Relational Schema

• Both the E-R model and the relational database model are abstract and logical
representations of real-world organizations.
• Due to shared design principles, an E-R design can be transformed into a relational
design.
• For each entity set and for each relationship set in the database design, there is a
unique relation schema to which we assign the name of the corresponding entity set
or relationship set.
Database Management Systems
Reducing ER diagrams to Relational Schema

• Steps to convert ER diagram into a relational schema


• Mapping of strong entity sets
• Mapping of weak entity sets
• Mapping of strong entity sets with complex attributes
• Compound attributes
• Derived attributes
• Multivalued attributes
• Mapping of relational sets
• 1:1 or one-to-one
• 1:N or one-to-many or many-to-one
• M:N or many-to-many
• N-ary relations
Database Management Systems
Representation of Strong Entity Sets

• Let E be a strong entity set with only simple descriptive attributes a1, a2, …, an.
• We represent this entity with a schema called E with n distinct attributes.
• Each tuple in a relation on this schema corresponds to one entity of the entity set E.
• For schemas derived from strong entity sets, the primary key of the entity set serves
as the primary key of the resulting schema.
• From the university DB:
• Consider the entity set “student”.
• Attributes: ID, name, tot_cred.
• We represent this entity set by a schema called student with three attributes:
• Student (ID, name, tot cred)
Database Management Systems
Representation of Strong Entity Sets

• Similarly converting all the strong entities with simple attributes in the university DB,
we get the following schemas:
• Classroom(building, room number, capacity)
• Department(dept name, building, budget)
• Course(course id, title, credits)
• Instructor(ID, name, salary)
• Student(ID, name, tot cred)
Database Management Systems
Representation of Weak Entity Sets

• Let A be a weak entity set with attributes {a1, a2, …, am}.


• Let B be the strong entity set on which A depends. Let the primary key of B consist of
attributes {b1, b2, …, bn}.
• We represent the entity set A by a relation schema called A with one attribute for each
member of the set:
{a1, a2, …, am} ∪ {b1, b2, …, bn}
• The primary key consists of the primary key of Identifying/strong entity set union with
the discriminator of weak entity
• Foreign key constraint is also added to the relational schema of the weak entity (to the
primary-key of the identifying entity set)
Database Management Systems
Representation of Weak Entity Sets

• Consider the section entity of the university database as shown

• Relational schema:
• Section(course_id, sec_id, semester, year, building, room_no)
Database Management Systems
Representation of Weak Entity Sets

• Consider the section entity of the university database as shown

• Relational schema with foreign-key constraint:


Database Management Systems
Representation of Weak Entity Sets

• For schemas derived from a weak entity set, the combination of the primary key of the
strong entity set and the discriminator of the weak entity set serves as the primary key
of the schema.
• In addition to creating a primary key, we also create a foreign-key constraint on the
relation A, specifying that the attributes b1, b2, …, bn reference the primary key of the
relation B.
• The foreign-key constraint ensures that for each tuple representing a weak entity,
there is a corresponding tuple representing the corresponding strong entity
Database Management Systems
Representation of Weak Entity Sets

• In general, for the given ER diagram, the relational schema is as given:

ER Diagram Relational Schema


Database Management Systems
Representation of Weak Entity Sets

• After incorporating the weak entities the university database would look like:
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

COMPOSITE ATTRIBUTES
• Composite attributes are handled by creating a
separate attribute for each of the component
attributes (Do not create a separate attribute for the
composite attribute itself).
• Consider the version of the Instructor entity set as
shown.
• For the composite attribute “name”, the schema
generated for the instructor contains the attribute’s
first name, middle initial, and last name;
• The schema:
Instructor(ID, First_name, Middle_name, Last_name, Salary)
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

DERIVED ATTRIBUTES
• Derived attributes are not explicitly represented in
the relational data model.
• They can be represented as stored procedures,
functions, or methods in other data models
• Consider the version of the Instructor entity set as
shown.
• The derived attribute Age would be eliminated in the
relational schema
• The schema:
Instructor(ID, name, Date_of_Birth , Salary)
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

MULTIVALUED ATTRIBUTES
• For a multivalued attribute M, create a relation
schema R with an attribute A that corresponds to M
and attributes corresponding to the primary key of
the entity set or relationship set of which M is an
attribute.
• Consider the Instructor entity set as shown where:
• Primary key: ID
• Multivalued attribute: Phone_no
• primary key of the new relation schema R consists of
all attributes of the schema.
• the relational schema:
Instructor(ID, name, Salary)
R : Instructor_phone(ID, Phone_no)
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

MULTIVALUED ATTRIBUTES
• A foreign-key constraint is to be added to the
relation schema created from the multivalued
attribute.
• In that newly created schema, the attribute
generated from the primary key of the entity set
must reference the relation generated from the
entity set.
• In the above example, the foreign-key constraint on
the instructor phone relation would be that attribute
ID references the instructor relation
• Schema:
Instructor(ID, name, Salary)
R : Instructor_phone(ID, Phone_no)
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

MULTIVALUED ATTRIBUTES
• In the case: entity set consists of only two attributes
• A single primary-key attribute B
• A single multivalued attribute M
• The relation schema for the entity set would contain
only one attribute, namely, the primary-key attribute
B.
• In such cases we drop this relation, while retaining
the relation schema with the attribute B and
attribute A that corresponds to M.
• In the time_slot entity set shown,
Schema:
• B: Time_slot_id Time_slot(Time_slot_id, Day, Start_time, End_time)
• M: Time_stamp
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

MULTIVALUED ATTRIBUTES
• In the schema shown, the End_time is not included
as a part of the primary key
• This is because there can not be two meetings of a
class that start at the same time on the same day of
the week but end at different times;
• Due to this constraint, the end_time has been
omitted from the primary key of the time slot
schema.

Schema:
Time_slot(Time_slot_id, Day, Start_time, End_time)
Database Management Systems
Representation of Strong Entity Sets with Complex Attributes

• After incorporating the complex attributes the university


database would look like
Database Management Systems
Representation of Relationship Sets

Mapping of Binary 1:1 Relation Types


• For each binary 1:1 relationship type R in the ER schema, identify the relations S and T
that correspond to the entity types participating in R.
• There are three possible approaches:
• Foreign Key (2 relations) approach:
• Merged relation (1 relation) option:
• Cross-reference or relationship relation (3 relations) option:
Database Management Systems
Representation of Relationship Sets

• There are three possible approaches:


• Foreign Key (2 relations) approach:
• Choose one of the relations-say S and include a foreign key in S the primary
key of T.
• It is better to choose an entity type with total participation in R in the role of
S. And add all the descriptive attributes of the relation
• Merged relation (1 relation) option:
• This alternative, we merge the two entity types and the relationship into a
single relation.
• This may be appropriate when both participations are total. And add all the
descriptive attributes of the relation
• Cross-reference or relationship relation (3 relations) option:
• The third alternative is to set up a third relation R for the purpose of cross-
referencing the primary keys of the two relations S and T representing the
entity types. And add all the descriptive attributes of the relation
Database Management Systems
Representation of Relationship Sets

• Consider for this situation, the advisor relation to be one-to-one or 1:1 (Foreign key
approach)

T S : always towards total


participation
Database Management Systems
Representation of Relationship Sets

Mapping of Binary 1:N Relation Types


• For each regular binary 1:N relationship type R, identify the relation S that represent
the participating entity type at the N-side of the relationship type.
• Include as the foreign key in S the primary key of the relation T that represents the
other entity type participating in R.
• Include any simple attributes of the 1:N relation type as attributes of S.
Database Management Systems
Representation of Relationship Sets

• Consider for this situation, the advisor relation to be one-to-many or 1:N

T S : always toward the side that is


N
Database Management Systems
Representation of Relationship Sets

• After incorporating the 1:1 and 1:N


relations in the university database
would look as shown
Database Management Systems
Representation of Relationship Sets

Mapping of Binary M:N Relation Types


• For each regular binary M:N relationship type R, create a new relation S to represent
R. This is a relationship relation.
• Include as foreign key attributes in S the primary keys of the relations that represent
the participating entity types; their combination will form the primary key of S.
• Also include any simple attributes of the M:N relationship type (or simple components
of composite attributes) as attributes of S.
Database Management Systems
Representation of Relationship Sets

Consider for this situation, the advisor relation to be many-to-many or M:N


Database Management Systems
Representation of Relationship Sets

• After
incorporating the
M:N relations in
the university
database would
look as shown
Database Management Systems
Representation of Relationship Sets

Mapping n-ary relationship sets


• For each n-ary relationship type R, where n>2, create a new relationship S to represent
R.
• Include as foreign key attributes in S the primary keys of the relations that represent
the participating entity types.
• Also include any simple attributes of the n-ary relationship type (or simple
components of composite attributes) as attributes of S.
• Considering the project_guide ternary relationship discussed earlier, the ER and the
corresponding Schema is given in the following slide
Database Management Systems
Representation of Relationship Sets

Mapping n-ary relationship sets


• For each n-ary relationship type R, where n>2, create a new relationship S to represent
R.
• Include as foreign key attributes in S the primary keys of the relations that represent
the participating entity types.
• Also include any simple attributes of the n-ary relationship type (or simple
components of composite attributes) as attributes of S.
Database Management Systems
Question

Consider the ER diagram shown in the figure for part of a


BANK database.
Each bank can have multiple branches, and each branch
can have multiple accounts and loans.
● List the strong entity types.
● Is there a weak entity type? If so, give its name,partial
key, and identifying relationship.
● What constraints do the partial key and the
identifying relationship of the weak entity type
specify in this diagram?
.
Database Management Systems
Answers

● List the strong entity types.


LOAN, BANK, ACCOUNT and CUSTOMER

● Is there a weak entity type? If so, give its name,partial


key, and identifying relationship.
Yes, there is a weak entity BANK_BRANCH and its
identifying relationship is BRANCHES.
Database Management Systems
Answers

● What constraints do the partial key and the


identifying relationship of the weak entity type
specify in this diagram?
The constraints are:
1. No two branches have the same number
2. A bank can have any number of branches but a
branch only belongs to one bank.
Database Management Systems
Q&A

1. Which of the following is the first step in converting an ER diagram into a relational
schema?
A) Mapping weak entity sets
B) Mapping strong entity sets
C) Mapping relationships
D) Mapping composite attributes

2. What is the primary key in the relational schema for a weak entity set?
A) The discriminator of the weak entity set
B) The primary key of the identifying entity set
C) The combination of the weak entity’s discriminator and the identifying entity’s
primary key
D) A randomly generated primary key
Database Management Systems
Q&A

3. In the relational schema for a multivalued attribute, which constraint is added?


A) Primary key constraint
B) Foreign key constraint
C) Unique constraint
D) No constraint is added

4. How is a derived attribute handled in a relational schema?


A) It is stored as a regular attribute
B) It is ignored in the schema
C) It is represented using a stored procedure or function
D) It is merged with other attributes
Database Management Systems
Q&A

Answers:
1. B
2. C
3. B
4. C
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]
Database Management Systems
Relational Operations
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
1
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Relational Algebra

A procedural language that consists of a set of operations, where each operation takes
one or two relations as input and produces a new relation as the result.

• The result of an operation is a new relation, which is formed from one or more input
relations.
• This property makes the algebra "closed" (all objects in relational algebra are relations)
• These relations can be further manipulated using operations of the same algebra.
• A sequence of relational algebra operations forms a relational algebra expression.
• The result of a relational algebra expression  relation that represents the result of a
database query.

3
Database Management Systems
Relational Algebra

Operators:
Unary Operators:
Select: σ
Project: ∏
Rename: ρ
Binary Operators:
Union: U
Intersection :∩
Set difference: –
Cartesian product: X
Join: ⋈
4
Database Management Systems
Relational Algebra

Use this table for query examples:


University Database sample table/relation

5
Database Management Systems
Example-2 Company Database State

6
Database Management Systems
Unary Operator: Select (σ)

● The SELECT operation is used to choose a subset of the tuples from a relation that satisfies a selection
condition.
● We can consider the SELECT operation to be a filter that keeps only those tuples that satisfy a qualifying
condition.
Notation: σ p (r)
● p is called the selection predicate
● The SELECT operation can also be visualized as a horizontal partition of the relation into two sets of
tuples—those tuples that satisfy the condition and are selected, and those tuples that do not satisfy
the condition and are filtered out.

7
Database Management Systems
Unary Operator: Select (σ)

● Example 1: Select those tuples of the instructor relation where the instructor is in the “Physics”
department.
● Query: σ(Dept_Name = “Physics”)(Instructor)

● Example 2: Select the tuples for all employees who either work in department 4 and make over $25,000
per year, or work in department 5 and make over $30,000.
● Query: σ(Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000)(EMPLOYEE)

8
Database Management Systems
Unary Operator: Select (σ)

SELECT Operation Properties


• The SELECT operation σ (<condition> ) produces a relation S that has the same schema
(same attributes) as R
• SELECT is commutative:

• Because of commutative property, A cascade (sequence) of SELECT operations may be applied in any order:
or

9
Database Management Systems
Unit 1: Relational Model and Database Design

• We allow comparisons to be used in the selection predicate.


=, ≠, >, ≥. <. ≤

• We can combine several predicates into a larger predicate by using the connectives:
∧ (and), ∨ (or), ¬ (not)

Example: Find the instructors in Physics with a salary greater $90,000, we write:

σ dept_name=“Physics” ∧ salary > 90,000 (instructor)

• The select predicate may include comparisons between two attributes.

Example: Fetch all the details of the ‘Research’ department.


σ dept_name=“Research” (department)

NOTE: The number of tuples in the result of a SELECT is less than (or equal to) the number of
tuples in the input relation R.
10
Database Management Systems
Unary Operator: PROJECT(∏)

PROJECT(∏)
● A unary operation that returns its argument relation, with certain attributes left out.

Notation: ∏ A1,A2,A3 ….Ak (r)

where A1, A2, …, Ak are attribute names and r is a relation name.


● The PROJECT operation, selects certain columns from the table and discards the other columns.
● Duplicate rows removed from result, since relations are sets
● This operation keeps certain columns (attributes) from a relation and discards other columns.
● The PROJECT operation can be visualized as a vertical partition of the relation
● The list of specified columns (attributes) is kept in each tuple. The other attributes in each tuple are
discarded.
11
Database Management Systems
Unary Operator: PROJECT(∏)

● Example 1:
Eliminate the dept_name attribute of instructor

Query:
∏ID, name, salary (instructor)

Result:

12
Database Management Systems
Unary Operator: PROJECT(∏)

● Example 2: Fetch the Lname, Fname, Salary information about the Employees.
Query:
∏Lname, Fname, Salary (EMPLOYEE)

● Example 3: Fetch the Sex and Salary information about the Employees.

Query:
∏Sex, Salary (EMPLOYEE)

output for example 2

output for example 3


13
Database Management Systems
Unary Operator: PROJECT(∏)

Properties of PROJECT operator:


● The project operation removes any duplicate tuples.
● This is because the result of the project operation must be a set of tuples.
● Mathematical sets do not allow duplicate elements.
● The number of tuples in the result of projection ∏(R) is always less or equal to the number of tuples in R.
● If the list of attributes includes a key of R, then the number of tuples in the result of PROJECT is equal to
the number of tuples in R.

14
Database Management Systems
Composition of Relational Operations

● The result of a relational-algebra operation is a relation and therefore of relational-algebra


operations can be composed together into a relational-algebra expression.
Example: Find the names of all instructors in the Physics department.
∏name(σ dept_name =“Physics” (instructor) )
Instead of giving the name of a relation as the argument of the PROJECT operation, we give an
expression that evaluates to a relation.

• We may want to apply several relational algebra operations one after the other.
• Either we can write the operations as a single relational algebra expression by nesting the
operations, or we can apply one operation at a time and create intermediate result relations.

15
Database Management Systems
Relational Algebra Expressions

● To retrieve the first name, last name and salary of all employees who work in the department 5,
we must apply a select and a project operation.
● We can write a single relational algebra expression as follows:
∏Fname,Lname,Salary(σ DNo = 5​(Employee))

● Or we can explicitly show the sequence of operations, giving a name to each immediate relation:

DEP5_EMPS ← σ DNO=5 (Employee)

RESULT ← ∏ Fname,Lname,Salary (DEP5_EMPS)

16
Database Management Systems
Unary Operator: RENAME (ρ)

• The general RENAME operation r can be expressed by any of the following forms:
rS (B1, B2, …, Bn )(R) changes both:
○ the relation name to S, and
○ the column (attribute) names to B1, B2, …..Bn
rS(R) changes:
○ the relation name only to S
r(B1, B2, …, Bn )(R) changes:
○ the column (attribute) names only to B1, B2, …..Bn

17
Database Management Systems
Unary Operator: RENAME (ρ)

• The results of relational-algebra expressions do not have a name that we can use to refer to
them. The rename operator, ρ , is provided for that purpose.
• The RENAME operator is denoted by ρ(rho)
• The expression:

ρx (E)
returns the result of expression E under the name x
• Another form of the rename operation:
S(B1, B2, ... , Bn)(ρ) or S(ρ) or (B1, B2, ... , Bn)(ρ)
ρx(A1,A2, .. An) (E)
• In some cases, we may want to rename the attributes of a relation
or the relation's name or both.
• Useful when a query requires multiple operations
18
Database Management Systems
Unary Operator: RENAME (ρ)

• This is a unary operator which changes attribute names for a relation without changing any values.
• Renaming removes the limitations associated with set operators.

• Notation: ρOldName->NewName (r)


• For Example, ρFather->Parent (Paternity). For the given operation, the attribute “Father” will be renamed as

“Parent” for the relation Paternity.


• If there are two or more attributes involved in renaming operation, then ordering is more meaningful.
ρBrach,Salary->Location,Pay (Employee)

19
Database Management Systems
Unary Operator: RENAME (ρ)

● RESULT will have the same attribute names as DEP5_EMPS (same attributes as EMPLOYEE

● The 10 attributes of DEP5_EMPS are renamed to F, M, L, S, B, A, SX, SAL, SU, DNO, respectively
Note: the symbol is an assignment operator

20
Database Management Systems
The Assignment Operator (←)

• It is convenient at times to write a relational-algebra expression by assigning parts of it to


temporary relation variables.
• The assignment operation is denoted by ← and works like assignment in a programming
language.
• Example: Fetch the FirstName, LastName and Salary of all employees who work in
department 5.

• With the assignment operation, a query can be written as a sequential program consisting of a
series of assignments followed by an expression whose value is displayed as the result of the
query.
21
Database Management Systems
The Assignment Operator (←)

Example:

22
Database Management Systems
Questions

● Scenario: You have a table Employees with columns EmployeeID, Name, Department,
Salary, and JoiningYear (eg:2016).
Question: How would you create a query to find the names and departments of employees
who joined before 2020, but without showing the EmployeeID and Salary columns?

● Scenario: A database contains a Customers table with fields CustomerID, FirstName,


LastName, City, and PurchaseAmount.
Question: If you need to rename the City column to Location, write the relational query for this
scenario.

● Scenario: You are working with a table Products with columns ProductID, ProductName,
Category, Price, and StockQuantity.
Question: How would you select the ProductName and Price of all products that are in the
Electronics category and have more than 10 items in stock?
23
Database Management Systems
Questions

● Scenario: You have a table Employees with columns EmployeeID, Name, Department,
Salary, and JoiningYear (eg:2016).
Question: How would you create a query to find the names and departments of employees
who joined before 2020, but without showing the EmployeeID and Salary columns?

Answer: ∏Name,Department​(σ JoiningYear<2020​(Employees))

● Scenario: A database contains a Customers table with fields CustomerID, FirstName,


LastName, City, and PurchaseAmount.
Question: If you need to rename the City column to Location, write the relational query for this
scenario.
Answer: ρCity->Location (Customers)

24
Database Management Systems
Questions

● Scenario: You are working with a table Products with columns ProductID, ProductName,
Category, Price, and StockQuantity.
Question: How would you select the ProductName and Price of all products that are in the
Electronics category and have more than 10 items in stock?

Answer: ∏ProductName,Price​(σ Category=’Electronics’∧ StockQuantity>10​(Products))

25
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

26
Database Management Systems
Set Operations,Equivalent Queries

Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Set Operations : Set Operations

What are set operations ?

• Union Operation U

• Set-Intersection Operation ∩

• Set Difference Operation ─

3
Database Management Systems
Set Operations : UNION Operation

• Set Operations are Binary.


• These operations can be applied on the relations which are union compatible.
• Two relations are said to be union compatible if
1. Both the relations have the same arity (number of attributes) and
2. The corresponding attribute domains must be compatible (example: 2nd column of r
should have the same type of values as does the 2nd column of s).
Union operation:
• The result of R U S, is a relation that includes all tuples that are either in R or in S or
in both R and S. (R and S should be union compatible)
• Duplicate tuples are eliminated.
• The union operation allows us to combine two relations. Notation: r ∪ s 4
Database Management Systems
University Database
Instructor Table: Course Table: Prerequisites Table:

Teaches Table:

Department Table: Section Table:

5
Database Management Systems
Set Operations : UNION Operation

Schema:

Example: Find all courses (course_id) taught in the Fall 2017 semester, or in the Spring 2018
semester, or in both

6
Database Management Systems
Set Operations : UNION Operation

Query:
∏course_id (σ semester=“Fall” Λ year=2017 (section)) ∪ ∏course_id (σ semester=“Spring” Λ year=2018 (section))

Table: Output:

7
Database Management Systems
Set Operations : UNION Operation

What will be the first Column Name after Union of Paternity and Maternity Relation?
Is It Meaningful?

Table: Output:

8
Database Management Systems
Set Operations : UNION Operation

• To retrieve the social security numbers of all employees who either work in department 5
(RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2
below).

• We can use the UNION operation as follows:

• The union operation produces the tuples that are in either RESULT1 or RESULT2 or both.

9
Database Management Systems
Set Operations : UNION Operation

10
Database Management Systems
Set Operations : Intersection Operation

• The set-intersection operation allows us to find tuples that are in both the input relations.
Notation: r ∩ s
• r, s have the same arity attributes of r and s are
compatible
Example: Find the set of all courses taught in both the
Fall 2017 and the Spring 2018 semesters.

Query: ∏course_id (σ semester=“Fall” Λ year=2017 (section)) ∩


∏ course_id (σ semester=“Spring” Λ year=2018 (section))

Output:

11
Database Management Systems
Set Operations : Intersection Operation

Example : Graduates ∩ Managers

Output:

12
Database Management Systems
Set Operations : SET DIFFERENCE

The set-difference operation allows us to find tuples that are in one relation but are not in
another.

Notation: r – s
Set differences must be applied on compatible relations ie r and s must have the same arity
and attribute domains of r and s must be compatible.

Example: To find all courses taught in the Fall 2017 semester, but not in the Spring 2018
semester.
Query: ∏course_id (σ semester=“Fall” Λ year=2017 (section)) − ∏course_id (σ semester=“Spring” Λ year=2018 (section))

Output:

13
Database Management Systems
Set Operations : Intersection Operation

Example : Graduates - Managers

Output:

14
Database Management Systems
Set Operations :The set operations UNION, INTERSECTION, and MINUS

Example : STUDENT ∪ INSTRUCTOR

Output:

15
Database Management Systems
Set Operations :The set operations UNION, INTERSECTION, and MINUS

Example : STUDENT ∩ INSTRUCTOR.

Output:

16
Database Management Systems
Set Operations :The set operations UNION, INTERSECTION, and MINUS

Example : STUDENT – INSTRUCTOR.

Output:

17
Database Management Systems
Set Operations :The set operations UNION, INTERSECTION, and MINUS

Example : INSTRUCTOR – STUDENT.

Output:

18
Database Management Systems
Some properties of UNION, INTERSECT and DIFFERENCE

Notice that both union and intersection are commutative operations; that is:
● R ∪ S = S ∪ R and R ∩ S = S ∩ R

Both union and intersection can be treated as n-ary operations applicable to any number
of relations as both are associative operations; that is:
● R ∪ (S ∪ T) = (R ∪ S) ∪ T
● (R ∩ S) ∩ T = R ∩ (S ∩ T)

The minus operation is not commutative; that is, in general:


● R−S ≠ S−R

19
Database Management Systems
Set Operations : Division Operation

● The Division operation is also a binary operation.

● R(Z) ÷ S(X), where X is a subset of Z.


● Y = Z - X that is let Y is the set of attributes of R that are not attributes of S in the
resultant relation (T).

● The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear
in R with tR [Y] = t, and with tR [X] = ts for every tuple ts in S.

● For a tuple t to appear in the result T of the DIVISION, the values in t must appear
in R in combination with every tuple in S.

20
Database Management Systems
Set Operations : Division Operation

Example:
(a) Dividing SSN_PNOS by SMITH_PNOS. (b) T ← R ÷ S

21
Database Management Systems
Equivalent Queries

There is more than one way to write a query in relational algebra.


Example:
Find information about courses taught by instructors in the Physics department with salary greater
than 90,000.
Query 1: σ dept_name=“Physics” ∧ salary > 90,000 (instructor)

Query 2 : σ dept_name=“Physics” (σ salary > 90.000 (instructor))

The two queries are not identical; they are however equivalent -- they give the same result on the
given database.

22
Database Management Systems
Binary Operator : The Cartesian-Product Operation

• In Relational Algebra, the Cartesian Product (×) is a binary operation that combines every
tuple of one relation with every tuple of another relation.
• Allows to combine information from two relations.
• Denoted as R x S
• n - number of attributes in R and m - number of attributes in S
• Result is a relation Q with degree (arity) n + m attributes and Q(A1, A2, . . ., An, B1,
B2, . . ., Bm) order of attributes.
• The resulting relation state has one tuple for each combination of tuples—one from R
and one from S.
• Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS tuples, then R x S will
have nR * nS tuples. 23
Database Management Systems
Binary Operator : The Cartesian-Product Operation

Sample Employee Table

Sample Dependent Table

24
Database Management Systems
Binary Operator : The Cartesian-Product Operation

Sample Cartesian Product : Employee X Dependent

The resultant relation will have 56 rows, 15 columns.

25
Database Management Systems
Binary Operator : The Cartesian-Product Operation

Generally, CROSS PRODUCT is not a meaningful operation, can become meaningful when
followed by other operations

Example (not meaningful):

EMP_DEPENDENTS ← EMPLOYEE x DEPENDENT

EMP_DEPENDENTS will contain every combination of EMPNAMES and DEPENDENT

whether or not they are actually related

26
Database Management Systems
Binary Operator : The Cartesian-Product Operation

•To keep only combinations where the DEPENDENT is related to the EMPLOYEE, we
add a SELECT operation as follows:

Example (meaningful):
EMP_DEPENDENTS ← EMPLOYEE x DEPENDENT
ACTUAL_DEPS ← σ SSN=ESSN(EMP_DEPENDENTS)
RESULT (ACTUAL_DEPS) will now contains employees and their dependents
information

27
Database Management Systems
Binary Operator : The Cartesian-Product Operation

The issue with Cartesian Product is that it returns all possible combinations of rows from
both tables.

This results in redundant and meaningless combinations unless you filter them using a
condition.(as shown in previous slide: σ SSN=ESSN)

A natural or join, on the other hand, performs a Cartesian Product internally and
automatically filters rows by matching columns with the same name and compatible
data types, thus ensuring that there are no redundancies in the results.

28
Database Management Systems
Binary Operator : The Join Operation (Natural Join)

29
Database Management Systems
Binary Operator : The Join Operation (Natural Join)

Example: Write an relational algebraic query to retrieve the details of Managers(Fname,


Name) and their managing department name.

30
Database Management Systems
Binary Operator : The Join Operation

Q:

31
Database Management Systems
Aggregate Functions and Grouping

Aggregate Functions:
Aggregate functions are functions that take a collection
(a set or multiset) of values as input and return a
single value.
There are five aggregate functions they are:

1. AVERAGE
2. MINIMUM
3. MAXIMUM
4. SUM
5. COUNT
32
Database Management Systems
Types of aggregate functions

33
Database Management Systems
Aggregate Functions and Grouping

•A type of request that cannot be expressed in basic relational algebra is to specify


mathematical aggregate functions on collections of values from the database.
•Examples of such functions include retrieving the average or total salary of all employees or the total number of
employee tuples.

•These functions are used in simple statistical queries that summarize information from the database tuples.
•Common functions applied to collections of numeric values include SUM, AVERAGE, MAXIMUM, and
MINIMUM.

•The COUNT function is used for counting tuples or values.

Note: count just counts the number of rows, without removing duplicates

34
Database Management Systems
Aggregate Functions and Grouping

Use of the Aggregate Functional operation ℱ

• ℱMAX Salary (EMPLOYEE) retrieves the maximum salary value from the EMPLOYEE relation.

• ℱMIN Salary (EMPLOYEE) retrieves the minimum salary value from the EMPLOYEE relation.

• ℱSUM Salary (EMPLOYEE) retrieves the sum of salary values from the EMPLOYEE relation.

• ℱCOUNT SSN, AVERAGE Salary (EMPLOYEE) computes the count (number) of employees & their average

salary.

35
Database Management Systems
Aggregate Functions and Grouping

Grouping can be combined with Aggregate Functions

Example: For each department, retrieve the DNO, COUNT SSN, and AVERAGE SALARY
•A variation of aggregate operation ℱ allows this:

• Grouping attribute placed to left of symbol

• Aggregate functions to right of symbol

• DNO ℱCOUNT SSN, AVERAGE Salary (EMPLOYEE)

•Above operation groups employees by DNO (department number) and computes the
count of employees and average salary per department

36
Database Management Systems
Count Function

Count:
Count function is used to count the number of tuples in a relation.
It can work on both numeric and non-numeric data types.
Ex:

Output: g count(*) (E)=4

4 is the number of tuples in the above table


37
Database Management Systems
Sum Function

Sum-
● Sum function is used to calculate the sum of all selected columns.
● It works on numeric fields only.
Ex:

38
Database Management Systems
Average Function

Average-

● The function is used to calculate the average value of the numeric type.
● average function returns the average of all non-Null values.
Ex:

Output: g average(C) (R)=27/4=6.75


39
Database Management Systems
Minimum Function and Maximum Function

Minimum-
● MIN function is used to find the minimum value of a certain column.
● This function determines the smallest value of all selected values of a column.
Ex-
g min(IC) (R)=3

Maximum-
● MAX function is used to find the maximum value of a certain column.
● This function determines the largest value of all selected values of a column.
Ex-
g max(C) (R)=10
40
Database Management Systems
Aggregate Functions and Grouping

Consider the table:

41
Database Management Systems
Aggregate Functions and Grouping

Example:
1) Find the sum of the salaries of all employees, the maximum salary, the minimum salary,
and the average salary.
Query:
ρR(Total_Sal,Highest_Sal, Lowest_sal,Average_Sal) ℑ (SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary)) (EMPLOYEE)

Output:

42
Database Management Systems
Grouping

• In many cases aggregate functions can be applied to subgroups of data.


Example: Calculate average salary for each department.
• In these cases data is divided into non-overlapping groups based on specified attributes.
• Each group (partition) will consist of the tuples that have the same value of some
attribute(s), called the grouping attribute(s).
• We can then apply the function to each such group independently to produce summary
information about each group. SQL has a GROUP BY clause for this purpose.
• The GROUP BY clause specifies the grouping attributes, which should also appear in the
SELECT clause, so that the value resulting from applying each aggregate function to a group of
tuples appears along with the value of the grouping attribute(s).

43
Database Management Systems
Aggregate Functions and Grouping

Example: [Link] the average salary of employees in each department


b. Find the number of employees who work on each project.
Query:

ρR(Dno,count(*), Avg(Salary))( Dno ℑ (Dno,COUNT(*), AVG(Salary))


(EMPLOYEE)

Output:

44
Database Management Systems
Aggregate Functions and Grouping

Exercise:

Write Relational Algebra Statements to:


a) Retrieve each department number, the number of employees in the department, and their
average salary, while renaming the resulting attributes.
b) Show the department number, the total number of employees, and the average salary of
employees in that department, for each department.
c) FInd the total number of employees and the average salary across the entire company.

45
Database Management Systems
Aggregate Functions and Grouping

Answers:
a. ρR(Dno, No_of_employees, Average_sal)(Dno ℑ COUNT Ssn, AVERAGE Salary (EMPLOYEE)).

b. Dno ℑ Count ssn,Average_salary(EMPLOYEE).

c. ℑ COUNT ssn, AVERAGE Salary(EMPLOYEE).


Output:

46
Database Management Systems
Questions and Answers

Fill in the Blanks:


1) The union operation allows us to combine two relations (r and s). The notation used for
union is ______.
Ans: r ∪ s
2) For the union operation r ∪ s to be valid, r and s must have the same ______.
Ans: arity (same number of attributes)
3) In the union operation, duplicate tuples are ______.
Ans: eliminated
4) The set-intersection operation allows us to find tuples that are in both input relations.
The notation used for intersection is ______.
Ans: r ∩ s
47
Database Management Systems
Questions and Answers

Fill in the Blanks:


5) The set-difference operation allows us to find tuples that are in one relation but not in
another. The notation used for set-difference is ______.
Ans: r−s
6) The number of attributes in a relation is referred to as its ______.
Ans: arity
7) The two queries that give the same result on any database but are written differently in
relational algebra are called ______queries.
Ans: equivalent
8) In the UNION operation, the two operand relations must be "type ______“.
Ans: compatible
48
Database Management Systems
Questions and Answers

Can you write the intersection operation using union and set difference
operations ?

INTERSECTION can be expressed in terms of union and set difference as follows:

R ∩ S = ((R ∪ S) − (R − S)) − (S − R)

49
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

50
Database Management Systems
Relational Algebra - Hands on
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra: Procedural Form

■ Q1: Retrieve the name and address of all employees who work for the
‘Research’ department.

■ Q2: Retrieve the names of employees who have no dependents.

3
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra: Procedural Form

■ Q1: Retrieve the name and address of all employees who work for the
‘Research’ department.
RESEARCH_DEPT ← σ DNAME=’Research’ (DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER= DNOEMPLOYEE EMPLOYEE)
RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)

■ Q2: Retrieve the names of employees who have no dependents.


ALL_EMPS ← π SSN(EMPLOYEE)
EMPS_WITH_DEPS(SSN) ← π ESSN(DEPENDENT)
EMPS_WITHOUT_DEPS ← (ALL_EMPS - EMPS_WITH_DEPS)
RESULT ← π LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)

4
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra: Single Expression

As a single expression, these queries become:


■ Q1: Retrieve the name and address of all employees who work for
the ‘Research’ department.
π Fname, Lname, Address (σ Dname= ‘Research’
(DEPARTMENT⋈Dnumber=Dno(EMPLOYEE))

■ Q2: Retrieve the names of employees who have no dependents.


π Lname, Fname((π Ssn (EMPLOYEE) − ρ Ssn (π Essn (DEPENDENT)))⋈
EMPLOYEE)

5
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

Consider a database with the following schema:

Write relational algebra expressions for the following queries

6
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

7
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

1: Retrieve rows corresponding to expert (rating more than 8) sailors

2: Retrieve the rating of each sailor

3: Retrieve the age of sailors

4: Retrieve name and ratings of highly rated sailor

8
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 9
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 10
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 11
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

INCORRECT

9/22/2021 12
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 14
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 15
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra for practice

Consider a database with the following schema:

Person ( name, age, name is a key


gender )
Frequents ( name, pizzeria ) (name, pizzeria) is a key

Eats ( name, pizza ) (name, pizza) is a key

Serves ( pizzeria, pizza, (pizzeria, pizza) is a key


price )
Write relational algebra expressions for the following queries

9/22/2021 16
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

1. Find all pizzerias frequented by at least one person under the age of 18.
2. Find the names of all females who eat either mushroom or pepperoni
pizza (or both).
3. Find the names of all females who eat both mushroom and pepperoni
pizza.
4. Find all pizzerias that serve at least one pizza that Amy eats for less than
$10.00.
5. Find all pizzerias that are frequented by only females or only males.
6. For each person, find all pizzas the person eats that are not served by any
pizzeria the person frequents. Return all such person (name) / pizza pairs.
7. Find the names of all people who frequent only pizzerias serving at least
one pizza they eat.
8. Find the names of all people who frequent every pizzeria serving at least
one pizza they eat.
9. Find the pizzeria serving the cheapest pepperoni pizza. In the case of ties,
9/22/2021 return all of the cheapest-pepperoni pizzerias. 17
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

In general there are many correct expressions for each query.


Sample solutions

9/22/2021 18
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 19
DATABASE MANAGEMENT SYSTEM
Example Queries in Relational Algebra

9/22/2021 20
DATABASE MANAGEMENT SYSTEM
Summary

Relational Algebra
• Unary Relational Operations

• Relational Algebra Operations From Set Theory

• Binary Relational Operations

• Additional Relational Operations

• Examples of Queries in Relational Algebra

9/22/2021 21
DATABASE MANAGEMENT SYSTEM
Q&A

1. The relational algebra expression for finding the names of instructors in the
Physics department is:
a) Instructor - Physics
b) σ Name (∏ Dept_Name = "Physics" (Instructor))
c) ∏ Name (Instructor ⋈ Physics)
d) ∏ Name (σ Dept_Name = "Physics" (Instructor))

2. The expression ρ City -> Location (Customers) is used to:


a) Rename the attribute "City" to "Location" in the "Customers" relation
b) Select only the "City" column from "Customers"
c) Find the intersection of "City" and "Location" columns
d) Perform a join operation on "Customers"

9/22/2021 22
DATABASE MANAGEMENT SYSTEM
Q&A

3. The result of the operation ∏ Name, Department (σ JoiningYear < 2020


(Employees)) will return:
a) EmployeeIDs of employees who joined before 2020
b) Names and departments of employees who joined before 2020
c) Employee salaries who joined before 2020
d) All attributes of employees who joined before 2020

4. The SIDs of students who are not enrolled in any course is given by:
A. ∏ SID(Student) - ∏ SID(Enrolled)
B. σ SID NOT IN (∏ SID(Enrolled)) (Student)
C. ∏ SID(Student ⋈ Enrolled)
D. ∏ SID(Enrolled) - ∏ SID(Student)

9/22/2021 23
DATABASE MANAGEMENT SYSTEM
Q&A

Answers

1. D
2. A
3. B
4. A

9/22/2021 24
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

25
Database Management Systems
SQL overview, Data definition,
Structure of SQL queries
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Introduction to SQL Commands

SQL language
● Considered as one of the major reasons for the commercial success of relational
databases.

SQL

● SQL actually comes from the word “SEQUEL” which was the original term used in the
paper: “SEQUEL TO SQUARE” by Chamberlin and Boyce. IBM could not copyright that
term, so they abbreviated to SQL and copyrighted the term SQL.
● Now popularly known as “Structured Query language”.
● The language has features for Data definition, Data Manipulation, Transaction control ,
Indexing ,Security specification (Grant and Revoke), Active databases, Multi-media ,
Distributed databases etc.
Database Management Systems
Types of database languages

● Data Definition Language (DDL) statements are used to define


the database structure or schema.
● Data Manipulation Language (DML) statements are used for
managing data within schema objects
● Data Control Language (DCL) statements are used to control
access to the database.
● Transaction Control Language (TCL) is used to run the changes
made by the DML statement.
Database Management Systems
Schema Concepts in SQL

SQL schema
● Identified by a schema name
● Includes an authorization identifier and descriptors for each element
● Each statement in SQL ends with a semicolon
● Schema elements include tables, constraints, views, domains and other constraints
Database Management Systems
Schema Concepts in SQL

CREATE SCHEMA statement:

MySQL Syntax:
CREATE SCHEMA Library;
This is same as CREATE DATABASE Library; ie schema and database are same in MySql

Equivalent PostgreSQL/Oracle syntax: (for reference)

CREATE SCHEMA Library AUTHORIZATION ‘Jsmith’;


Database Management Systems
Attribute Data Types in SQL

Basic data types:

Numeric data types


● Integer numbers: INTEGER, INT, and SMALLINT
● Floating-point (real) numbers: FLOAT or REAL, and DOUBLE PRECISION

Character-string data types


● Fixed length: CHAR(n), CHARACTER(n)
● Varying length: VARCHAR(n), CHAR VARYING(n), CHARACTER VARYING(n)
Database Management Systems
Attribute Data Types in SQL

Bit-string data types

● Fixed length: BIT(n)


● Varying length: BIT VARYING(n)

Boolean data type

● Values of TRUE or FALSE or NULL

DATE data type

● Ten positions
● Components are YEAR, MONTH, and DAY in the form YYYY-MM-DD
● Multiple mapping functions available in RDBMSs to change date formats

Note: Learn more about MYSQL inbuilt Date Functions here


Database Management Systems
Attribute Data Types in SQL

Timestamp data type

● Includes the DATE and TIME fields


● Plus a minimum of six positions for decimal fractions of seconds
● Optional WITH TIME ZONE qualifier
INTERVAL data type

● Specifies a relative value that can be used to increment or decrement an absolute value of a date,
time, or timestamp

DATE, TIME, Timestamp, INTERVAL data types can be cast or converted to string formats for
comparison.
Database Management Systems
Attribute Data Types in SQL Explained

The SQL standard supports a variety of built-in types, including:

• char(n): A fixed-length character string with user-specified length n. The full form,

character, can be used instead.

• varchar(n): A variable-length character string with user-specified maximum length

n. The full form, character varying is equivalent.

• int: An integer (a finite subset of the integers that is machine dependent). The full

form, integer, is equivalent.

• smallint: A small integer (a machine-dependent subset of the integer type).


Database Management Systems
Attribute Data Types in SQL Explained

• Numeric(p, d): A fixed-point number with user-specified precision. The number consists of p digits
(plus a sign), and d of the p digits are to the right of the decimal point. Thus, numeric(3,1) allows
44.5 to be stored exactly, but neither 444.5 nor 0.32 can be stored exactly in a field of this type.
• Real, double precision: Floating-point and double-precision floating-point numbers with machine-
dependent precision.
• Float(n): A floating-point number with a precision of at least n digits.
• Null: Each type may include a special value called the null value. A null value indicates an absent
value that may exist but be unknown or that may not exist at all.
• ENUM: Data type in MySQL is a string object. It allows us to limit the value chosen from a list of
permitted values in the column specification at the time of table creation. It is short
for enumeration, which means that each column may have one of the specified possible values.
Database Management Systems
Char vs Varchar
Database Management Systems
Char vs Varchar

The char data type stores fixed-length strings.

Consider, for example, an attribute A of type char(10).

• If we stored a string “Avi” in this attribute, seven spaces are appended to the string to
make it 10 characters long.

• In contrast, if attribute B were of type varchar(10), and we stored “Avi” in attribute B, no


spaces would be added.
Database Management Systems
Advanced Data Types like CLOB, BLOB

Large-Object Types

● Many current-generation database applications need to store attributes that can


be large (of the order of many kilobytes)
● These data types include as a photograph, or very large (of the order of many megabytes or
even gigabytes), such as a high-resolution medical image or video clips.
● SQL therefore provides large-object data types for character data (clob) and binary data (blob).
The letters “lob” in these data types stand for “Large OBject.”
Database Management Systems
Advanced Data Types like CLOB, BLOB

Large-Object Types

For example, we may declare attributes in oracle as:


book review clob(10KB)
image blob(10MB)
movie blob(2GB)
• For result tuples containing large objects (multiple megabytes to gigabytes) , it is inefficient or
impractical to retrieve an entire large object into memory.
• LOBs in the database are stored in a way that optimizes the space and provides efficient
access within the database tablespaces.
• Internal LOBs (BLOBs, CLOBs) also provide transactional support (Commit, Rollback, and so
on) of the database server.
Database Management Systems
Advanced Data Types like CLOB, BLOB

•BLOBs (Binary LOBs) used to store unstructured binary (also called “raw”) data, such as
video clips.
•CLOBs (Character LOBs) used to store large blocks of character data from the database
character set.
•Blob and Clob together are known as LOB(Large Object Type). The following are the major
differences between Blob and Clob data types.

Binary Large Object Stores any kind of data in binary format such as images,
(BLOB) audio, and video.

Stores string data in the database having character set


Character Large
format. Used for large set of characters/strings or
Object (CLOB)
documents that use the database character.
Database Management Systems
Advanced Data Types like CLOB, BLOB

Blob Clob
● The full form of Blob is a Binary Large ● The full form of Clob is Character Large
Object. Object.
● This is used to store large binary data. ● This is used to store large textual data.
● This stores values in the form of binary ● This stores values in the form of character
streams. streams.
● Using this you can stores files like videos, ● Using this you can store files like text files,
images, gifs, and audio files. PDF documents, word documents etc.
● MySQL supports this with the following ● MySQL supports this with the following
datatypes:TINYBLOB datatypes:TINYTEXT
● BLOB ● TEXT
● MEDIUMBLOB ● MEDIUMTEXT
● LONGBLOB ● LONGTEXT
Database Management Systems
Single Example covers all data types

CREATE TABLE example (


id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
char_column CHAR(50) NOT NULL,
varchar_column VARCHAR(255) NOT NULL,
tinytext_column TINYTEXT NOT NULL,
text_column TEXT NOT NULL,
mediumtext_column MEDIUMTEXT NOT NULL,
longtext_column LONGTEXT NOT NULL,
binary_column BINARY(20) NOT NULL,
varbinary_column VARBINARY(20) NOT NULL,
tinyblob_column TINYBLOB NOT NULL,
blob_column BLOB NOT NULL,
mediumblob_column MEDIUMBLOB NOT NULL,
Database Management Systems
Single Example covers all data types

longblob_column LONGBLOB NOT NULL,


enum_column ENUM('value1', 'value2', 'value3') NOT NULL,
set_column SET('value1', 'value2', 'value3') NOT NULL,
bool_column BOOLEAN NOT NULL DEFAULT 0,
date_column DATE NOT NULL,
datetime_column DATETIME NOT NULL,
timestamp_column TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON
UPDATE CURRENT_TIMESTAMP,
time_column TIME NOT NULL,
year_column YEAR(4) NOT NULL,
float_column FLOAT(10,6) NOT NULL DEFAULT '0.000000',
double_column DOUBLE(10,6) NOT NULL DEFAULT '0.000000'
);
Database Management Systems
SQL Data Definition, Data Types, Standards

Terminology:
● Table, row, and column used for relational model terms relation, tuple, and
attribute respectively.

CREATE statement
● Main SQL command for data definition

● The CREATE TABLE command is used to create a new relation by giving it a


name and specifying its attributes and initial constraints.
Database Management Systems
Data Definition Language(DDL)

• The SQL DDL provides commands for defining table schemas, deleting tables,
and modifying table schemas.
• DDL includes commands such as CREATE, ALTER, and DROP.
• DDL is used to CREATE, ALTER, OR DROP the database objects (Table, Views, Users).
• CREATE - to create objects in the database
• ALTER - alters the structure of the database
• DROP - delete objects from the database
• TRUNCATE - remove all records from a table, including all spaces allocated for the records
are removed
• COMMENT - add comments to the data dictionary
• RENAME - rename an object
Database Management Systems
Types of Database Languages: DDL

CREATE TABLE

Syntax: Create table from existing table:


CREATE TABLE CREATE TABLE new_table_name AS SELECT *
table_name(
Col_name1 datatype(), FROM existing_table;
Col_name2 datatype(),… CREATE TABLE new_table_name LIKE
Col_namen datatype(),
existing_table;
);
Database Management Systems
Types of Database Languages: DDL

Difference between CREATE TABLE new_table_name AS SELECT * FROM existing_table;


and CREATE TABLE new_table_name LIKE existing_table;
Database Management Systems
Types of Database Languages: DDL

ALTER TABLE
3) RENAME
1) ADD
Syntax:
Syntax:
ALTER TABLE table_name
ALTER TABLE table_name
RENAME COLUMN (Old_fieldname TO
ADD Col_name datatype()...;
New_fieldname...);
2) MODIFY
4) DROP
Syntax:
Syntax:
ALTER TABLE table_name
ALTER TABLE table_name DROP
MODIFY (fieldname datatype()...);
column_name;
Learn more examples
Database Management Systems
Types of Database Languages: DDL

E) RENAME
C) DESCRIBE TABLE
Syntax:
Syntax:
RENAME TABLE table_name to new
DESCRIBE table_name;
table_name
//lists basic description.
SHOW CREATE TABLE table_name; F) TRUNCATE
// lists complete information of table including Syntax:
constraints,index etc.. TRUNCATE TABLE table_name; // delete
complete data from an existing table. Table
D) DROP TABLE Structure remains
Syntax: DROP TABLE table_name;
// Complete table structure will be dropped
Database Management Systems
Types of Database Languages: DDL
Feature DROP TRUNCATE

Removes data Yes Yes

Removes table Yes (table is deleted) No (structure remains)


structure

Auto_Increment reset N/A (table removed) Resets to 1 (in MySQL)

Constraints & Deleted along with table Preserved (only rows deleted)
Indexes

Rollback (MySQL) Cannot be rolled back Cannot be rolled back (rollback dealt in further slides)

Use case When table is no longer needed When you want to empty the table but keep its structure
Database Management Systems
COMPANY relational database schema
Database Management Systems
SQL CREATE TABLE data definition statements for defining the COMPANY schema

• CREATE TABLE `department` ( `Dname` varchar(15) NOT NULL, `Dnumber` int(11)


NOT NULL, `Mgr_ssn` char(9) NOT NULL, `Mgr_start_date` date DEFAULT NULL);

• CREATE TABLE `dependent` ( `Essn` char(9) NOT NULL, `Dependent_name`


varchar(15) NOT NULL, `Gender` char(1) DEFAULT NULL, `Bdate` date DEFAULT
NULL, `Relationship` varchar(8) DEFAULT NULL);

• CREATE TABLE `dept_locations` ( `Dnumber` int(11) NOT NULL, `Dlocation`


varchar(15) NOT NULL);
Database Management Systems
SQL CREATE TABLE data definition statements for defining the COMPANY schema

• CREATE TABLE `employee` ( `Fname` varchar(15) NOT NULL, `Minit` char(1) DEFAULT NULL,
`Lname` varchar(15) NOT NULL, `Ssn` char(9) NOT NULL, `Bdate` date DEFAULT NULL,
`Address` varchar(30) DEFAULT NULL, `Gender` char(1) DEFAULT NULL, `Salary` decimal(10,2)
DEFAULT NULL, `Super_ssn` char(9) DEFAULT NULL, `Dno` int(11) NOT NULL);

• CREATE TABLE `project` ( `Pname` varchar(15) NOT NULL, `Pnumber` int(11) NOT NULL,
`Plocation` varchar(15) DEFAULT NULL, `Dnum` int(11) NOT NULL);

• CREATE TABLE `works_on` ( `Essn` char(9) NOT NULL, `Pno` int(11) NOT NULL, `Hours`
decimal(3,1) DEFAULT NULL);
Database Management Systems
One possible database state for the COMPANY relational database schema
Database Management Systems
One possible database state for the COMPANY relational database schema
Database Management Systems
Q&A

1. What is the difference between CHAR(n) and VARCHAR(n)?


A) CHAR(n) stores variable-length strings, VARCHAR(n) stores fixed-length strings
B) CHAR(n) stores fixed-length strings, VARCHAR(n) stores variable-length strings
C) Both store fixed-length strings
D) Both store variable-length strings

2. What is the format of the DATE data type in SQL?


A) DD-MM-YYYY
B) MM-DD-YYYY
C) YYYY-MM-DD
D) Any format can be used
Database Management Systems
Q&A

3. What is a key difference between BLOB and CLOB data types?


A) BLOB is used for large binary objects, CLOB is used for large character data
B) Both store large binary objects
C) Both store large character objects
D) CLOB is used for numbers only

4. What is the purpose of the ENUM data type in MySQL?


A) Storing fixed-length character data
B) Storing numeric data
C) Defining a list of possible values for a column
D) Storing date and time values
Database Management Systems
Q&A

Answers:

1. B
2. C
3. A
4. C
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

35
Database Management Systems
SQL overview, Data definition,
Structure of SQL queries
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and S
Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Specifying Constraints in SQL

• Constraints in SQL refer to the conditions and restrictions


that are applied on the database
• This further means that before inserting data into the
database, we are checking for some conditions.
• If the condition we have applied to the database holds true
for the data that is to be inserted, then only the data will be
inserted into the database tables.
• Constraints can be specified when the table is created with
the CREATE TABLE statement, or after the table is created
with the ALTER TABLE statement
Database Management Systems
Specifying Constraints in SQL

Basic constraints:

Relational Model has 3 basic constraint types that are


supported in SQL:

● Key constraint: A primary key value cannot be duplicated


● Entity Integrity Constraint: A primary key value cannot be
null
● Referential integrity constraints : The “foreign key “ must
have a value that is already present as a primary key, or may
be null
Database Management Systems
Specifying Constraints in SQL

Other Restrictions on attribute domains:

Default value of an attribute


● DEFAULT <value>
● NULL is not permitted for a particular attribute (NOT NULL)

CHECK clause
Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21);
Database Management Systems
Specifying Constraints in SQL

NOT NULL
● Because SQL allows NULLs as attribute values, a constraint NOT NULL may be
specified if NULL is not permitted for a particular attribute.
● This is always implicitly specified for the attributes that are part of the primary key of
each relation, but it can be specified for any other attributes whose values are required
not to be NULL
Example:
Database Management Systems
Specifying Constraints in SQL

DEFAULT
● It is also possible to define a default value for an attribute by appending the clause
DEFAULT <value> to an attribute definition.
● The default value is included in any new tuple if an explicit value is not provided for
that attribute
● If no default clause is specified, the default default value is NULL for attributes
that do not have the NOT NULL constraint.
Example:
Database Management Systems
Specifying Constraints in SQL

PRIMARY KEY clause

● Specifies one or more attributes that make up the primary key of a


relation
● Dnumber INT PRIMARY KEY;
UNIQUE clause

● Specifies alternate (secondary) keys (called CANDIDATE keys in the


relational model).
● Dname VARCHAR(15) UNIQUE;
Database Management Systems
Specifying Constraints in SQL

FOREIGN KEY clause

Default operation: reject update on violation

● Attach referential triggered action clause


● Options include SET NULL, CASCADE, and SET DEFAULT
● Action taken by the DBMS for SET NULL or SET DEFAULT is the same for both
ON DELETE and ON UPDATE
● CASCADE option suitable for “relationship” relations
Database Management Systems
Specifying Constraints in SQL

Using the Keyword CONSTRAINT CREATE TABLE EMPLOYEE ( … ,


Name a constraint Dno INT NOT NULL DEFAULT 1,
Useful for later altering
CONSTRAINT EMPPK PRIMARY KEY (Ssn),
CONSTRAINT EMPSUPERFK
FOREIGN KEY (Super_ssn) REFERENCES
The general structure of the SQL
EMPLOYEE(Ssn)
CONSTRAINT is defined as:
ON DELETE SET NULL ON UPDATE CASCADE,
The CONSTRAINT keyword is
CONSTRAINT EMPDEPTFK
followed by a constraint name
FOREIGN KEY(Dno) REFERENCES
followed by a column or a list of
DEPARTMENT(Dnumber)
columns.
ON DELETE SET DEFAULT ON UPDATE CASCADE);
Database Management Systems
Example for Specifying Constraints in SQL
Database Management Systems
Specifying Constraints in SQL

CREATE TABLE DEPT_LOCATIONS ( Dnumber INT NOT NULL,


Dlocation VARCHAR(15) NOT NULL DEFAULT ‘Banglore’,
PRIMARY KEY (Dnumber, Dlocation),
Constraint “fk_dno ” FOREIGN KEY (Dnumber) REFERENCES
DEPARTMENT(Dnumber) );

Referential actions
FOREIGN KEY (foreign_key_columns)
REFERENCES parent_table(parent_key_columns)
ON UPDATE action
ON DELETE action;
Database Management Systems
Referential actions
Delete actions of rows in the parent table
If you delete one or more rows in the parent table, you can set one of the following actions:
● ON DELETE NO ACTION: SQL Server raises an error and rolls back the delete action on the row
in the parent table.
● ON DELETE CASCADE: SQL Server deletes the rows in the child table that is corresponding to
the row deleted from the parent table.
● ON DELETE SET NULL: SQL Server sets the rows in the child table to NULL if the corresponding
rows in the parent table are deleted. To execute this action, the foreign key columns must be
nullable.
● ON DELETE SET DEFAULT SQL Server sets the rows in the child table to their default values if the
corresponding rows in the parent table are deleted. To execute this action, the foreign key columns
must have default definitions. Note that a nullable column has a default value of NULL if no default
value specified.
By default, SQL Server applies ON DELETE NO ACTION if you don’t explicitly specify any action.
Database Management Systems
Referential actions

Update action of rows in the parent table


If you update one or more rows in the parent table, you can set one of the following actions:
● ON UPDATE NO ACTION: SQL Server raises an error and rolls back the update action on the
row in the parent table.
● ON UPDATE CASCADE: SQL Server updates the corresponding rows in the child table when the
rows in the parent table are updated.
● ON UPDATE SET NULL: SQL Server sets the rows in the child table to NULL when the
corresponding row in the parent table is updated. Note that the foreign key columns must be
nullable for this action to execute.
● ON UPDATE SET DEFAULT: SQL Server sets the default values for the rows in the child table
that have the corresponding rows in the parent table updated.
Database Management Systems
Specifying Constraints on Tuples Using CHECK

● Additional Constraints on individual tuples within a relation are also possible using
CHECK

CHECK clauses at the end of a CREATE TABLE statement

● Apply to each tuple individually

CHECK (Dept_create_date <= Mgr_start_date);


Database Management Systems
Schema Change Statements in SQL

Schema evolution commands

● DBA may want to change the schema while the database is operational
● Does not require recompilation of the database schema
● schema evolution commands available in SQL, which can be used to alter a
schema by adding or dropping tables, attributes, constraints, and other
schema elements.
● This can be done while the database is operational and does not require
recompilation of the database schema checks must be done by the DBMS to
ensure that the changes do not affect the rest of the database and make it
inconsistent
Database Management Systems
The DROP Command

DROP command

• The DROP command can be used to drop named schema elements, such as tables,
domains, types, or constraints.

• One can also drop a whole schema if it is no longer needed by using the DROP SCHEMA
command. There are two drop behavior options: CASCADE and RESTRICT.

• For example, to remove the COMPANY database schema and all its tables, domains, and
other elements, the CASCADE option is used as follows
Drop behavior options:
CASCADE and RESTRICT
Example:
DROP SCHEMA COMPANY CASCADE;
This removes the schema and all its elements including tables,views, constraints, etc.
Database Management Systems
The DROP Command

• If a base relation within a schema is no longer needed, the relation and its
definition can be deleted by using the DROP TABLE command.
• For example, if we no longer wish to keep track of dependents of employees
in the COMPANY database.
• we can get rid of the DEPENDENT relation by issuing the following
command:
DROP TABLE DEPENDENT CASCADE;
• If the RESTRICT option is chosen instead of CASCADE, a table is dropped
only if it is not referenced in any constraints
Database Management Systems
MySql Commands

To create a new database in MySQL use the CREATE DATABASE statement with the
below syntax:

Syntax:
CREATE DATABASE [IF NOT EXISTS] database_name
To show the database created in MySQL

SHOW CREATE DATABASE employeedb;

To check the created database using the following query:


SHOW DATABASES;
Database Management Systems
Describe in sql

The DESCRIBE command in MySQL provides a detailed overview of a table's


structure, including its column names, data types used, and any constraints
applied to the columns.

DESCRIBE table_name;
(or)
SHOW COLUMNS FROM table_name;
(or)
DESC table_name;
Database Management Systems
MySql Commands

SQL command USE is used to select a particular database:


Syntax:
USE database_name;

To create table in the database the following query is used:


Create table table_name(…….)
To drop the database
DROP DATABASE [IF EXISTS] database_name;
Example:
Drop database customers;
Database Management Systems
The ALTER table command

Alter table actions include:

● Adding or dropping a column (attribute)


● Changing a column definition
● Adding or dropping table constraints

Example:

ALTER TABLE people ADD COLUMN Job VARCHAR(12);


Database Management Systems
The ALTER table command

1) ADD a column in the table


ALTER TABLE table_name ADD new_column_name column_definition
[ FIRST | AFTER column_name ];
2) Add multiple columns in the table
ALTER TABLE table_name ADD new_column_name column_definition
[ FIRST | AFTER column_name ],
ADD new_column_name column_definition [ FIRST | AFTER column_name ],
3) MODIFY column in the table
ALTER TABLE table_name MODIFY column_name column_definition
[ FIRST | AFTER column_name ];
4) DROP column in table
ALTER TABLE table_name
DROP COLUMN column_name;
Database Management Systems
The ALTER table command
5) RENAME column in table
ALTER TABLE table_name CHANGE COLUMN old_name new_name column_definition
[ FIRST | AFTER column_name ]
6) RENAME table
ALTER TABLE table_name
RENAME TO new_table_name;

7) Altering (Changing) a Column Definition or a Name


To change a column's definition, use MODIFY or CHANGE clause along with the ALTER command.
For example, to change column c from CHAR(1) to CHAR(10), you can use the following command
ALTER TABLE table_name MODIFY c CHAR(10);
ALTER TABLE table_name MODIFY j BIGINT NOT NULL DEFAULT 100;
ALTER TABLE table_name ALTER i SET DEFAULT 1000;
*Note:i,j are columns
Database Management Systems
Change the column position of MySQL table without losing
column data

You can change the column position of MySQL table without losing data with the help of
ALTER TABLE command

Syntax:

ALTER TABLE yourTableName MODIFY yourColumnName1 data type AFTER /BEFORE


yourColumnName2;
Database
Changing Management
a Column’sSystems
Default Value

Use SET DEFAULT to specify the default value explicitly, or DROP DEFAULT to remove the
current default and allow MySQL to assign the “default default.”
To change a default value, use
Syntax:
ALTER col_name SET DEFAULT
Ex-
ALTER TABLE mytbl ALTER j SET DEFAULT 1000;

To drop a default value, use

Syntax:

ALTER col_name DROP DEFAULT

Ex-

ALTER TABLE mytbl ALTER j DROP DEFAULT;


Database Management Systems
Alter Command for adding a attribute

• We use the alter table command to add attributes to an existing table.

• All tuples in the table are assigned null as the value for the new attribute.

• The form of the alter table command is

alter table r add A D;

• where r is the name of an existing table, A is the name of the attribute to be added, and
D is the type of the added attribute.
Database Management Systems
Alter Command for dropping a attribute

• We can drop attributes from a table by the command

alter table r drop A;

• where r is the name of an existing table.

• A is the name of an attribute of the table.

• Many database systems do not support dropping of attributes, although they will allow an entire
table to be dropped.
Database Management Systems
Adding and Dropping Constraints

Change constraints specified on a table

Add or drop a named constraint

Adding constraint
ALTER table people add constraint people_gender_fk foreign key (gender)
references gender_tab;

Dropping constraint
ALTER TABLE Persons DROP CONSTRAINT PK_Person;
Database Management Systems
Dropping Columns, Default Values

To drop a column

● Choose either CASCADE or RESTRICT


● CASCADE would drop the column from views etc.
● RESTRICT is possible if no views refer to it.

ALTER TABLE EMPLOYEE DROP COLUMN Address CASCADE;

Default values can be dropped and altered :

ALTER TABLE DEPARTMENT ALTER COLUMN Mgr_ssn DROP DEFAULT;


ALTER TABLE DEPARTMENT ALTER COLUMN Mgr_ssn SET DEFAULT
‘333445555’;
Database Management Systems
Drop table

Drop:
• To remove a table from an SQL database, we use the drop table command.

• The drop table command deletes all information about the dropped table from the
database drop table r; // Complete table structure will be dropped

Syntax:
DROP TABLE table1;

DROP TABLE IF EXISTS table1;

DROP TABLE IF EXISTS table1, table2, table3;


Database Management Systems
Truncate table

Truncate:
A truncate SQL statement is used to remove all rows (complete data) from a table. It is
similar to the DELETE statement with no WHERE clause.

Syntax:

TRUNCATE TABLE table_name;

// delete complete data from an existing table. Table Structure remains


Database Management Systems
Drop vs Truncate
Database Management Systems
Rename table

RENAME:

RENAME TABLE offers more flexibility. It allows renaming multiple tables in one statement.
This can be useful when replacing a table with a new pre-populated version:

Syntax:

RENAME table table_name to new table_name

RENAME TABLE products TO products_old, products_new TO products;


Database Management Systems
ALTER - Example

Consider 2 tables - ‘employee’ and ‘department’

CREATE TABLE employee (


emp_id INT PRIMARY KEY,
name VARCHAR(50)
);

CREATE TABLE department (


id SMALLINT PRIMARY KEY
);
Database Management Systems
ALTER - Example

1. Adding a New Column

ALTER TABLE employee


ADD COLUMN department_id INT;

2. Reposition a Column ( ‘department_id’ after ‘name’)

ALTER TABLE employee


MODIFY COLUMN department_id INT AFTER name;
--We use the ‘AFTER’ keyword to reposition col1 after any col2

To place a column as the first column in the table, we use the ‘FIRST’ keyword
ALTER TABLE employee
MODIFY COLUMN department_id INT FIRST;
Database Management Systems
ALTER - Example

3. Rename a Column (department_id to dept_ref_id)

ALTER TABLE employee


RENAME COLUMN department_id TO dept_ref_id;

4. Change Datatype of a Column (INT to SMALLINT for dept_ref_id)

ALTER TABLE employee


MODIFY COLUMN dept_ref_id SMALLINT;
Database Management Systems
ALTER - Example

5. Adding Constraints

a). NOT NULL

ALTER TABLE employee


MODIFY COLUMN dept_ref_id SMALLINT NOT NULL;
--This ensures that dept_ref_id can never have a null value

b). DEFAULT

ALTER TABLE employee


ALTER COLUMN dept_ref_id SMALLINT NOT NULL SET DEFAULT 100;
--This sets the default dept_id as 100 if no value has been assigned to it
Database Management Systems
ALTER - Example

c). CHECK

ALTER TABLE employee


ADD CONSTRAINT chk_dept CHECK (dept_ref_id > 0);
--The chk_dept is a check constraint which will check if the value inserted is >0 and if not, it returns an error.

d). PRIMARY KEY


Suppose we want to make dept_ref_id as the primary key instead of emp_id. For this, we first have to drop the
existing primary key (emp_id) and then add the primary key constraint for defp_ref_id

ALTER TABLE employee


DROP PRIMARY KEY;

ALTER TABLE employee


ADD CONSTRAINT pk_emp PRIMARY KEY (dept_ref_id);
Database Management Systems
ALTER - Example

e). FOREIGN KEY

● ON DELETE CASCADE, ON UPDATE CASCADE

ALTER TABLE employee


ADD CONSTRAINT fk_dept
FOREIGN KEY (dept_ref_id) REFERENCES department(id)
ON DELETE CASCADE
ON UPDATE CASCADE;

If the foreign key constraint is being added to an attribute which already has an existing foreign key constraint,
we first have to drop the existing constraint and then add a new one. Otherwise, it is not necessary.

ALTER TABLE employee


DROP FOREIGN KEY fk_dept; -- In this case, the foreign key is named fk_dept
Database Management Systems
ALTER - Example

● ON DELETE SET NULL

ALTER TABLE employee


ADD CONSTRAINT fk_null
FOREIGN KEY (dept_ref_id) REFERENCES department(id)
ON DELETE SET NULL;

● ON DELETE SET DEFAULT


-- Set column default before applying
ALTER TABLE employee
MODIFY COLUMN dept_ref_id SMALLINT DEFAULT 1;

ALTER TABLE employee


ADD CONSTRAINT fk_default
FOREIGN KEY (dept_ref_id) REFERENCES department(id)
ON DELETE SET DEFAULT;
Database Management Systems
ALTER - Example

e). Dropping Constraints

● DROPPING DEFAULT

ALTER TABLE employee


ALTER COLUMN dept_ref_id DROP DEFAULT;

● DROPPING CHECK

ALTER TABLE employee


DROP CONSTRAINT chk_dept;
Database Management Systems
ALTER - Example

● DROPPING PRIMARY KEY

ALTER TABLE employee


DROP PRIMARY KEY;

● DROPPING FOREIGN KEY

ALTER TABLE employee


DROP FOREIGN KEY fk_default;
Database Management Systems
ALTER - Example

f). Dropping a Column

ALTER TABLE employee


DROP COLUMN dept_ref_id;
Database Management Systems
Q&A

1. Which command is used to delete all records from a table without removing its structure?
A) DELETE
B) DROP
C) TRUNCATE
D) ALTER

2. In SQL, what does the DESCRIBE command do?


A) Shows the details of a table structure
B) Deletes a table
C) Modifies a table
D) Creates a new schema
Database Management Systems
Q&A

3. Which of the following statements is used to remove an entire table from the database?
A) DELETE
B) DROP
C) TRUNCATE
D) REMOVE

4. The ALTER TABLE statement is used to?


A) Remove a table
B) Modify an existing table structure
C) Retrieve table data
D) Define a new schema
Database Management Systems
Q&A

Answers:

1. C
2. A
3. B
4. B
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

48
Database Management Systems
Database Modification
Prof. Mahitha G

Department of Computer Science and Engineering

Acknowledgments: Dr. Nagasundari S


(Dept of CSE), Teaching Assistants
Database Management Systems

Unit 1: Introduction to Database Management and SQL


Slides adapted from Author Slides of “Database System Concepts”, Silberschatz, H Korth and
S Sudarshan, McGrawHill, 7th Edition, 2019. And Author Slides of Fundamentals of Database
Systems”, Ramez Elamsri, Shamkant B Navathe, Pearson, 7th Edition, 2017.

Prof. Mahitha G.

Department of Computer Science and Engineering


2
Database Management Systems
Database Modification

DML commands in Structured Query Language change the data present in the SQL database. We can
easily access, store, modify, update and delete the existing records from the database using DML
commands.

DML commands:
• SELECT Command
• INSERT Command
• UPDATE Command
• DELETE Command
Database Management Systems
INSERT COMMAND

• In its simplest form, INSERT is used to add a single tuple (row) to a relation (table).
• We must specify the relation name and a list of values for the tuple.
• There are two forms of the INSERT statement.
Form 1:
The values should be listed in the same order in which the corresponding attributes were
specified in the CREATE TABLE command.
Syntax:
INSERT INTO table_name VALUES (value1,value2,…..);
Example:
INSERT INTO EMPLOYEE VALUES ( ‘Richard’, ‘K’, ‘Marini’, ‘653298653’, ‘1962-12-30’, ‘98
Oak Forest, Katy, TX’, ‘M’, 37000, ‘653298653’, 4 );
Database Management Systems
INSERT COMMAND

Before Insert:

Example: INSERT INTO EMPLOYEE VALUES ( ‘Richard’, ‘K’, ‘Marini’, ‘653298653’, ‘1962-12-30’, ‘98
Oak Forest, Katy, TX’, ‘M’, 37000, ‘653298653’, 4 );
Output:
Database Management Systems
INSERT COMMAND

• Form 2 allows the user to specify explicit attribute names that correspond to the
values provided in the INSERT command.
• This is useful if a relation has many attributes but only a few of those attributes are assigned values in the new
tuple.
• However, the values must include all attributes with NOT NULL specification and no default value.
• Attributes with NULL allowed or DEFAULT values are the ones that can be left out.
Syntax:
INSERT INTO table_name(attribute1,attribute2,….) VALUES (value1,value2,…..);
Example: Enter a tuple for a new EMPLOYEE for whom we know only the Fname, Lname, Dno, and Ssn
attributes
Query: INSERT INTO EMPLOYEE (Fname, Lname, Dno, Ssn) VALUES (‘Richard’, ‘Marini’, 4, ‘653298653’);
Database Management Systems
INSERT COMMAND

Another variation of the INSERT command:


A variation of the INSERT command inserts multiple tuples into a relation in conjunction with
creating the relation and loading it with the result of a query.

• create a temporary table that has the employee last name, project name, and hours per week for
each employee working on a project

Command :
CREATE TABLE WORKS_ON_INFO ( Emp_name VARCHAR(15), Proj_name VARCHAR(15),
Hours_per_week DECIMAL(3,1) );
Database Management Systems
INSERT COMMAND

Example:
INSERT INTO WORKS_ON_INFO ( Emp_name, Proj_name, Hours_per_week )
SELECT [Link], [Link], [Link]
FROM PROJECT P, WORKS_ON W, EMPLOYEE E
WHERE [Link] = [Link] AND [Link] = [Link];

Another variation: We can use the command given below to backup the data present in a table. The condition is
that the schema of both the backup table and the original should must be the same.
Example:
INSERT INTO WORKS_ON_INFO_Backup (select * from WORKS_ON_INFO ) ;
Note: WORKS_ON_INFO_Backup table need to be created before.
Database Management Systems
SELECT COMMAND

• The SELECT clause lists the attributes desired in the result of a query.
• It corresponds to the projection operation of relational algebra.
• FROM clause is used to specify the table name or the relation name.
Example: Find the names of all instructors:
select fname,Lname
from EMPLOYEE;
Output:

Note: Select will be discussed


in detail in the next lecture
Database Management Systems
UPDATE COMMAND

• The UPDATE command is used to modify attribute values of one or more selected tuples.
• As in the DELETE command, a WHERE clause in the UPDATE command selects the tuples to be modified
from a single relation.
• However, updating a primary key value may propagate to the foreign key values of tuples in other relations if
such a referential triggered action is specified in the referential integrity Additional constraints of the DDL.
• An additional SET clause in the UPDATE command specifies the attributes to be modified and their new
values.
• Several tuples can be modified with a single UPDATE command.
Syntax:
UPDATE table_name SET column_name = value WHERE condition;
Database Management Systems
UPDATE COMMAND

Example1: To change the location and controlling department number of project number
10 to ‘Bellaire’ and 5, respectively.
Query: UPDATE PROJECT SET Plocation = ‘Bellaire’, Dnum = 5 WHERE Pnumber = 10;
Before: After:
Database Management Systems
UPDATE COMMAND

Example 1: Give all employees in the ‘Research’ department a 10% raise in salary.
Query: UPDATE EMPLOYEE SET Salary = Salary * 1.1 WHERE Dno = 5;
Before : After :

Example 2: Give a 5 percent salary raise to employees whose salary is less than average.
Query: UPDATE employee SET Salary = Salary * 1.05 WHERE salary < (SELECT avg_sal FROM
(SELECT AVG(salary) AS avg_sal FROM employee ) AS temp);
Database Management Systems
UPDATE COMMAND

Example 2: Give a 5 percent salary raise to employees whose salary is less than average.
Query: UPDATE employee SET Salary = Salary * 1.05 WHERE salary < (SELECT avg_sal FROM
(SELECT AVG(salary) AS avg_sal FROM employee ) AS temp);

In PostgreSQL / Oracle / SQL Server:


UPDATE employee SET Salary = Salary * 1.05 WHERE salary < (SELECT AVG(salary) FROM employee);
This does not work in MySQL, reason is it does not allow selecting from the same table you are updating (in the
same query) without wrapping it in a derived table.
Database Management Systems
DELETE COMMAND

• The DELETE command removes tuples from a relation.


• It includes a WHERE clause, similar to that used in an SQL query, to select the tuples to be deleted.
• Tuples are explicitly deleted from only one table at a time.
• However, the deletion may propagate to tuples in other relations if referential triggered actions are specified in the referential
integrity constraints of the DDL.
• Depending on the number of tuples selected by the condition in the WHERE clause, zero, one, or several tuples can be
deleted by a single DELETE command.
• A missing WHERE clause specifies that all tuples in the relation are to be deleted. However, the table remains in the
database as an empty table.
• We must use the DROP TABLE command to remove the table definition.
Syntax:
DELETE FROM table_name WHERE condition
Database Management Systems
DELETE COMMAND

Example1: Delete the employee record whose last name is Zelaya .


Query: DELETE FROM EMPLOYEE WHERE Lname = ‘Zelaya’;
Before: After:

Example2: Delete all employee records [It is similar to truncating employee table]
Query: DELETE FROM EMPLOYEE;
Database Management Systems
Q&A

1. Which SQL clause is used to specify the attributes that should be displayed in the output?
A) FROM
B) WHERE
C) SELECT
D) ORDER BY

2. Which DML operation is used to remove data from a database?


A) SELECT
B) INSERT
C) DELETE
D) UPDATE
Database Management Systems
Q&A

3. In the SQL DML example, "SELECT name FROM instructor WHERE dept_name = 'History'", what
does 'name' refer to?
A) A column in the 'instructor' table
B) A table in the database
C) A value in the 'dept_name' column
D) A query condition

4. What is the purpose of the Data Manipulation Language (DML)?


A) Defining the structure of a database
B) Querying the database
C) Managing database access
D) Storing the data in the database
Database Management Systems
Q&A

Answers:

1. C
2. C
3. A
4. B
THANK YOU

Mahitha G
Department of Computer Science and Engineering
mahithag@[Link]

19

You might also like