Chapter 1: Introduction
Database System Concepts, 7th Ed.
©Silberschatz, Korth and Sudarshan
See [Link] for conditions on re-use
Introduction to Database Management
System
As the name suggests, the database management system consists of
two parts.
They are: 1. Database and 2. Management System
What is a Database?
To find out what database is, we have to start from data, which is the
basic building block of any DBMS.
Introduction to Database Management
System
Introduction to Database Management
System
Introduction to Database Management
System
Database Applications Examples
Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: Information about employees, salaries,
payroll taxes.
Manufacturing: management of production, inventory,
orders, supply chain.
Banking and finance
• customer information, accounts, loans, and banking
transactions.
• Credit card transactions
• Finance: sales and purchases of financial instruments
(e.g., stocks
and bonds; storing real-time market data
Universities: registration, grades
Database System Concepts - 7th Edition 1.4 ©Silberschatz, Korth and Sudarshan
Database Applications Examples (Cont.)
Airlines: reservations, schedules
Telecommunication: records of calls, texts, and data usage,
generating
monthly bills, maintaining balances on prepaid calling cards
Web-based services
• Online retailers: order tracking, customized
recommendations
• Online advertisements
Document databases
Navigation systems: For maintaining the locations of varies
places of interest along with the exact routes of roads, train
systems, buses, etc.
Database System Concepts - 7th Edition 1.5 ©Silberschatz, Korth and Sudarshan
Purpose of Database Systems
In the early days, database applications were built directly on
top of file systems, which leads to:
Data redundancy and inconsistency: data is stored in
multiple file formats resulting induplication of information
in different files
Difficulty in accessing data
• Need to write a new program to carry out each new
task
Data isolation
• Multiple files and formats
Integrity problems
• Integrity constraints (e.g., account balance > 0)
become “buried”
in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
Database System Concepts - 7th Edition 1.6 ©Silberschatz, Korth and Sudarshan
Purpose of Database Systems (Cont.)
Atomicity of updates
• Failures may leave database in an inconsistent state
with partial updates carried out
• Example: Transfer of funds from one account to another
should either complete or not happen at all
Concurrent access by multiple users
Concurrent access (many users accessing the same data at
the same time) is necessary for performance in systems like
databases and banking apps.
But if not controlled properly, it can cause inconsistencies.
Example: Two users see a balance of 100. Each withdraws 50
at the same time. Both think the balance will be 50, but the
system might record it as 0 or 50 incorrectly, depending on how
the updates overlap.
👉 That’s why concurrency control (like locks, transactions) is
needed to keep data correct.
Security problems
• Hard to provide user access to some, but not all, data
Database systems offer solutions1.7to all the above problems
Database System Concepts - 7th Edition ©Silberschatz, Korth and Sudarshan
Data Models
A collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
Relational model
Entity-Relationship data model (mainly for database
design)
Object-based data models (Object-oriented and Object-
relational)
Semi-structured data model (XML)
Other older models:
• Network model
• Hierarchical model
Database System Concepts - 7th Edition 1.10 ©Silberschatz, Korth and Sudarshan
Relational Model
All the data is stored in various tables.
Example of tabular data in the relational
model
Columns
Rows
Ted Codd
Turing Award
1981
Database System Concepts - 7th Edition 1.11 ©Silberschatz, Korth and Sudarshan
A Sample Relational Database
Database System Concepts - 7th Edition 1.12 ©Silberschatz, Korth and Sudarshan
View of Data
An architecture for a database
system
Database System Concepts - 7th Edition 1.14 ©Silberschatz, Korth and Sudarshan
Instances and Schemas
1. Logical Schema
• Definition: High-level design of the database (conceptual/blueprint).
• Focuses on what data is stored and how entities relate.
• Independent of physical storage details.
• Usually represented by ER diagrams or relational models.
• Defines tables, attributes, relationships, constraints.
• 👉 Example (University database logical schema):
• Tables/Entities: Students, Courses, Enrollments
• Attributes: StudentID, StudentName, CourseID, CourseName
• Relationships: A student can enroll in many courses (many-to-
many).
Database System Concepts - 7th Edition 1.15 ©Silberschatz, Korth and Sudarshan
Instances and Schemas
Database System Concepts - 7th Edition 1.15 ©Silberschatz, Korth and Sudarshan
Instances and Schemas
Database System Concepts - 7th Edition 1.15 ©Silberschatz, Korth and Sudarshan
2. Physical Schema
• Definition: Actual implementation of the logical schema in the
database system.
• Focuses on how data is stored physically in terms of files, indexes,
partitions, etc.
• Includes performance details like indexes, data types, storage
paths, triggers, views, access methods.
• Tied to the specific DBMS (e.g., SQL Server, Oracle, MySQL).
👉 Example (Physical schema for SQL Server):
CREATE TABLE Students (
StudentID INT PRIMARY KEY IDENTITY(1,1),
StudentName NVARCHAR(100) NOT NULL,
Age INT,
Address NVARCHAR(200)
);
🎓 Scenario: University Course Registration System
The university needs a database to manage students, courses, and their
registrations. You have been hired as a database designer to create this system.
Requirements:
1. Each student has a unique ID, name, email, and department.
2. Each course has a unique ID, title, credit hours, and is offered by one
department.
3. A student can register for many courses, and each course can have many
students (many-to-many relationship).
4. The system should record the semester and grade when a student takes a
course.
5. The database should enforce primary keys and foreign key relationships.
🎓 Scenario: University Course Registration System
The university needs a database to manage students, courses, and their
registrations. You have been hired as a database designer to create this system.
Requirements:
1. Each student has a unique ID, name, email, and department.
2. Each course has a unique ID, title, credit hours, and is offered by one
department.
3. A student can register for many courses, and each course can have many
students (many-to-many relationship).
4. The system should record the semester and grade when a student takes a
course.
5. The database should enforce primary keys and foreign key relationships.
Student Tasks:
Design the Logical Schema
Draw an ER diagram showing entities: Students, Courses, and Registrations.
Show attributes and relationships (with cardinalities).
Implement the Physical Schema
Write SQL code to:
Create the database UniversityDB.
Create tables with primary keys and foreign keys.
Implement the many-to-many relationship using a junction table (Registrations).
🎓 Logical Schema (Conceptual Design)
Entities & Attributes
1. Students
1. StudentID (Primary Key)
2. Name
3. Email
4. Department
2. Courses
1. CourseID (Primary Key)
2. Title
3. CreditHours
4. Department
3. Registrations (junction entity to resolve many-to-many)
1. StudentID (Foreign Key → Students)
2. CourseID (Foreign Key → Courses)
3. Semester
4. Grade
Physical Data Independence
Physical Data Independence – the ability to modify the
physical schema without changing the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various levels and
components should be well defined so that changes in
some parts do not seriously influence others.
Database System Concepts - 7th Edition 1.16 ©Silberschatz, Korth and Sudarshan
🔹 1. Numeric Data Types
Used to store numbers.
• INT / INTEGER → Whole numbers (e.g., 1, -5, 1000).
• SMALLINT → Smaller range of whole numbers.
• BIGINT → Very large whole numbers.
• DECIMAL(p, s) / NUMERIC(p, s) → Exact numbers with fixed
precision (p = total digits, s = digits after decimal). Example:
DECIMAL(10,2) → 99999999.99.
• FLOAT / REAL / DOUBLE → Approximate numbers with
floating-point precision.
• BIT / BOOLEAN → True/False or 0/1 values.
🔹 2. String (Character) Data Types
Used for text data.
• CHAR(n) → Fixed-length string (always takes n characters, padded
with spaces if shorter).
• VARCHAR(n) → Variable-length string (only uses required space, up
to n).
• TEXT / CLOB → Large blocks of text (e.g., paragraphs, documents).
🔹 3. Date & Time Data Types
Used for storing time-related values.
DATE → Stores only date (YYYY-MM-DD).
TIME → Stores only time (HH:MM:SS).
DATETIME / TIMESTAMP → Stores both date and time.
YEAR → Stores year only.
Some DBs have INTERVAL for durations.
🔹 4. Binary Data Types
Used for images, files, and other binary data.
• BINARY(n) → Fixed-length binary data.
• VARBINARY(n) → Variable-length binary data.
• BLOB (Binary Large Object) → Stores large binary data like images,
audio, video.
🔹 What is DDL?
DDL (Data Definition Language) is a subset of SQL commands
used to define and manage database structure.
It does not manipulate data, instead it creates, modifies, or
deletes schemas, tables, and other objects.
👉 The main DDL commands are:
CREATE → make a new database/table/index/view.
ALTER → change an existing structure.
DROP → delete a database/table.
TRUNCATE → remove all rows but keep the table structure.
Real Time coding
Data Manipulation Language (DML)
Language for accessing and updating the data
organized by the appropriate data model
• DML also known as query language
There are basically two types of data-manipulation
language
• Procedural DML -- require a user to specify what data are
needed and how to get those data.
• Declarative DML -- require a user to specify what data are
needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than are
procedural DMLs.
Declarative DMLs are also referred to as non-procedural DMLs
The portion of a DML that involves information retrieval is called
a query
language.
Database System Concepts - 7th Edition 1.18 ©Silberschatz, Korth and Sudarshan
SQL Query Language
SQL query language is nonprocedural. A query takes as input
several tables (possibly only one) and always returns a single
table.
Example to find all instructors in Comp. Sci. dept
select name
from instructor
where dept_name = 'Comp. Sci.'
SQL is NOT a Turing machine equivalent language
To be able to compute complex functions SQL is usually
embedded in some higher-level language
Application programs generally access databases through
one of
• Language extensions to allow embedded SQL
• Application program interface (e.g., ODBC/JDBC) which
allow SQL
queries to be sent to a database
Database System Concepts - 7th Edition 1.19 ©Silberschatz, Korth and Sudarshan
Database Design
The process of designing the general structure of the database:
Logical Design – Deciding on the database schema. Database
design requires that we find a “good” collection of relation
schemas.
• Business decision – What attributes should we record in
the
database?
• Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
Physical Design – Deciding on the physical layout of the
database
Database System Concepts - 7th Edition 1.21 ©Silberschatz, Korth and Sudarshan
Database Engine
A database storage engine is an internal software component
that a database server uses to store, read, update and delete data
in the underlying memory and storage systems.
Database System Concepts - 7th Edition 1.22 ©Silberschatz, Korth and Sudarshan
First introduced in 1971, a B-tree is a self-
balancing tree data structure that keeps data
sorted and allows searches, sequential access,
insertions, and deletions in logarithmic time.
Log Structured Merge (LSM) Tree Based
Engine
The LSM tree is a data structure with performance characteristics best fit for
indexed access to files with high write volume over an extended period.
At the core, database storage engines use an index management
algorithm that is optimized for either read performance (B-Tree)
or write performance (LSM). The database API layer (SQL vs.
NoSQL) is a choice independent of the storage engine since both
B-Tree and LSM engines are used in SQL and NoSQL databases.
Storage Manager
A program module that provides the interface between the low-
level data stored in the database and the application programs
and queries submitted to the system.
The storage manager is responsible to the following tasks:
• Interaction with the OS file manager
• Efficient storing, retrieving and updating of data
The storage manager components include:
• Authorization and integrity manager
• Transaction manager
• File manager
• Buffer manager
Database System Concepts - 7th Edition 1.23 ©Silberschatz, Korth and Sudarshan
Storage Manager (Cont.)
The storage manager implements several data structures as
part of the physical system implementation:
• Data files -- store the database itself
• Data dictionary -- stores metadata about the
structure of the database, in particular the schema
of the database.
• Indices -- can provide fast access to data items. A
database index provides pointers to those data items that
hold a particular value.
Database System Concepts - 7th Edition 1.24 ©Silberschatz, Korth and Sudarshan
Query Processor
Components of Query Processor
1. DDL Interpreter
1. Handles Data Definition Language (DDL) commands like
CREATE, ALTER, DROP.
[Link] the data dictionary (metadata about
tables, columns, constraints, etc.).
[Link] Compiler
[Link] Data Manipulation Language (DML)
commands like SELECT, INSERT, UPDATE, DELETE.
[Link] SQL into a low-level query plan.
[Link] Optimizer
[Link] the most efficient execution plan among
many alternatives.
[Link] cost-based or rule-based optimization.
[Link] Executor (Evaluation Engine)
[Link] the query using the chosen plan.
[Link] with the storage manager to access
data physically stored on disk.
Database System Concepts - 7th Edition 1.25 ©Silberschatz, Korth and Sudarshan
Query Processing
Function Description
The SQL query is checked for syntax
1. Parsing and Translation errors and converted into an internal
representation (like a parse tree).
The system finds the most efficient
way to execute the query (for
2. Optimization
example, choosing the best indexes
or join order).
The optimized query plan is
3. Evaluation / Execution executed, and results are fetched
from the database.
Database System Concepts - 7th Edition 1.26 ©Silberschatz, Korth and Sudarshan
Database Applications
Database applications are usually partitioned into two or three
parts
Two-tier architecture -- the application resides at the client
machine, where it invokes database system functionality at
the server machine
Three-tier architecture -- the client machine acts as a front
end and does not contain any direct database calls.
• The client end communicates with an application
server, usually through a forms interface.
• The application server in turn communicates with a
database
system to access data.
Database System Concepts - 7th Edition 1.30 ©Silberschatz, Korth and Sudarshan
Two-tier and three-tier architectures
Database System Concepts - 7th Edition 1.31 ©Silberschatz, Korth and Sudarshan
Database Administrator
A person who has central control over the system is called a
database administrator (DBA). Functions of a DBA include:
Schema definition
Storage structure and access-method definition
Schema and physical-organization modification
Granting of authorization for data access
Routine maintenance
Periodically backing up the database
Ensuring that enough free disk space is available for
normal operations, and upgrading disk space as
required
Monitoring jobs running on the database
Database System Concepts - 7th Edition 1.33 ©Silberschatz, Korth and Sudarshan
History of Database Systems
1950s and early 1960s:
• Data processing using magnetic tapes for storage
Tapes provided only sequential access
• Punched cards for input
Late 1960s and 1970s:
• Hard disks allowed direct access to data
• Network and hierarchical data models in widespread
use
• Ted Codd defines the relational data model
Would win the ACM Turing Award for this work
IBM Research begins System R prototype
UC Berkeley (Michael Stonebraker) begins Ingres
prototype
Oracle releases first commercial relational database
• High-performance (for the era) transaction processing
Database System Concepts - 7th Edition 1.34 ©Silberschatz, Korth and Sudarshan
History of Database Systems (Cont.)
1980s:
• Research relational prototypes evolve into commercial
systems
SQL becomes industrial standard
• Parallel and distributed database systems
Wisconsin, IBM, Teradata
• Object-oriented database systems
1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
Database System Concepts - 7th Edition 1.35 ©Silberschatz, Korth and Sudarshan
History of Database Systems (Cont.)
2000s
• Big data storage systems
Google BigTable, Yahoo PNuts,
Amazon,
“NoSQL” systems.
• Big data analysis: beyond SQL
Map reduce and friends
2010s
• SQL reloaded
SQL front end to Map Reduce
systems
Massively parallel database systems
Multi-core main-memory databases
Database System Concepts - 7th Edition 1.36 ©Silberschatz, Korth and Sudarshan
End of Chapter 1
Database System Concepts - 7th Edition 1.37 ©Silberschatz, Korth and Sudarshan