1.
1 Introduction (Simple Explanation)
Databases and database technology play a very important role in today’s world. Almost every area where
computers are used depends on databases. These areas include business, e-commerce, social media,
education, medicine, law, libraries, and engineering.
What is a Database?
A database is a collection of related data.
Data means facts that can be stored and have meaning.
Example: Names, phone numbers, and addresses stored in your mobile phone.
This information is stored in an organized way, so it becomes a database.
Even storing contacts in a phone or data in Excel or MS Access is an example of a database.
Important Properties of a Database
A database has the following features:
1. Represents the real world (Miniworld / UoD)
The database represents real-world objects like students, employees, products, etc.
When something changes in real life, the same change must be made in the database.
2. Logically organized data
Data in a database is meaningful and well-organized.
Random or unrelated data cannot be called a database.
3. Created for a specific purpose
A database is designed for a particular group of users and applications.
Why Databases Must Be Updated
When real-world events happen (like a student enrolling in a course or a customer buying a product), the
database must be updated immediately.
This keeps the database accurate and reliable.
Size of Databases
Databases can be:
Small: Personal address books with a few records
Medium: Library databases with thousands of books
Very large:
o Facebook databases with billions of users
o Amazon databases storing product, customer, and purchase information
These large databases run on many servers and are updated continuously.
Manual vs Computerized Databases
Manual database: Library card catalog
Computerized database: Managed using computers and software
This book focuses only on computerized databases.
Database Management System (DBMS)
A DBMS is software that helps users create, store, manage, and use databases.
Main Functions of DBMS
1. Defining the database
o Specifies data types, structure, and rules
o Stores this information as metadata (data about data)
2. Constructing the database
o Stores actual data on disk or storage devices
3. Manipulating the database
o Retrieving data (queries)
o Updating data
o Generating reports
4. Sharing data
o Allows many users to access the database at the same time
5. Protection and maintenance
o Prevents data loss due to crashes
o Provides security against unauthorized access
o Supports long-term use and changes
Database System
A database system consists of:
The database
The DBMS software
1.2 Example: UNIVERSITY Database (Simple Explanation)
Consider a University database that stores information about:
Students
Courses
Sections
Grades
Prerequisites
Files in the University Database
1. STUDENT
Stores student details such as:
o Name
o Student number
o Class (1 = freshman, 2 = sophomore, etc.)
o Major
2. COURSE
Stores course details like:
o Course name
o Course number
o Credit hours
o Department
3. SECTION
Stores information about course sections:
o Section ID
o Semester
o Year
o Instructor
4. GRADE_REPORT
Stores grades obtained by students in each section
5. PREREQUISITE
Stores prerequisite courses for each course
Relationships in the Database
One student can have many grade records
Courses can have prerequisites
Sections are linked to courses
This shows that records in different files are connected
1.3 Characteristics of the Database Approach (Simple Explanation)
The database approach is different from the old file-processing system.
In file processing, each application has its own files and programs, which causes:
Data duplication
Wasted storage
Difficulty in updating data
In the database approach, data is stored once and shared by many users using a DBMS.
The main characteristics are:
1. Self-Describing Nature of a Database System
A database system stores data as well as information about the data.
This information is called meta-data (data about data).
Meta-data is stored in a database catalog.
Example:
The catalog stores:
Table names
Column names
Data types
Constraints
Because of this, the DBMS understands the database structure without hard-coding it into programs.
👉 In file systems, structure is written inside programs.
👉 In DBMS, structure is stored separately in the catalog.
2. Insulation Between Programs and Data (Program–Data Independence)
In DBMS, programs are independent of data structure.
If the structure of data changes, programs do not need to be rewritten.
Example:
If a new field like Birth_date is added to STUDENT:
File system → all programs must change
DBMS → only catalog is updated
This feature is called program–data independence.
3. Data Abstraction
DBMS hides low-level storage details from users.
Users see only logical data, not how it is stored on disk.
Example:
User asks for Student Name
DBMS finds it automatically, without the user knowing:
Byte position
Record length
Storage format
This makes databases easy to understand and use.
4. Support for Multiple Views of Data
Different users need different views of the same database.
A view is a customized way of seeing data.
Example:
Student → sees only transcript
Admin → sees student + course + fees
Faculty → sees course and grades
Views can:
Show only selected data
Hide unnecessary data
5. Sharing of Data and Multiuser Transaction Processing
Many users can access the database at the same time.
DBMS controls this using concurrency control.
Example:
In airline booking:
Two agents cannot book the same seat at the same time
Transaction:
A transaction is a set of database operations.
DBMS ensures:
o Atomicity: All or nothing
o Isolation: Transactions don’t interfere with each other
This is important for OLTP (Online Transaction Processing) systems.
1.4 Actors on the Scene
(People who directly use and manage the database)
1. Database Administrator (DBA)
The DBA is the main person in charge of the database.
Controls user access, security, and permissions.
Monitors database performance and handles problems.
Manages hardware, software, and backups.
2. Database Designers
Decide what data should be stored and how it should be organized.
Collect requirements from different users.
Create database structure and user views.
Ensure the database meets all user needs.
3. End Users
(Use the database for daily work)
a) Casual End Users
Use the database occasionally.
Ask different queries each time.
Example: Managers generating reports.
b) Naive / Parametric End Users
Use the database regularly.
Use predefined (canned) transactions.
Example: Bank tellers, reservation agents, social media users.
c) Sophisticated End Users
Have deep knowledge of DBMS.
Write complex queries and applications.
Example: Engineers, scientists, data analysts.
d) Standalone Users
Maintain personal databases.
Use ready-made software packages.
Example: Personal finance software users.
4. System Analysts and Application Programmers
System analysts gather user requirements.
Application programmers write and maintain database programs.
They create and test canned transactions.
Also called software engineers.
1.5 Workers Behind the Scene
(Do not use data directly, but support the system)
1. DBMS System Designers and Implementers
Design and build the DBMS software.
Create modules for queries, security, recovery, and storage.
Ensure DBMS works with OS and programming languages.
2. Tool Developers
Develop support tools for database design and performance.
Create tools for modeling, monitoring, testing, and interfaces.
Tools improve efficiency but are optional.
3. Operators and Maintenance Personnel
Maintain hardware and software systems.
Handle system installation, backups, and updates.
Ensure the database system runs smoothly.
1.6 Advantages of Using the DBMS Approach
1. Controlling Redundancy
In DBMS, the same data is stored only once.
This avoids data duplication and saves storage space.
It prevents data inconsistency caused by multiple copies.
DBMS can also manage controlled redundancy when needed for performance.
2. Restricting Unauthorized Access
DBMS protects data using user accounts and passwords.
Different users get different access rights (read or update).
Confidential data like salaries is protected.
Security rules are enforced automatically by the DBMS.
3. Providing Persistent Storage for Program Objects
DBMS can store program data permanently.
Data remains safe even after the program stops running.
Object-oriented DBMS stores complex objects directly.
This avoids manual file conversion by programmers.
4. Efficient Query Processing
DBMS uses indexes and search techniques to retrieve data fast.
It stores frequently used data in memory buffers.
Query optimizer chooses the best way to execute queries.
This improves speed and performance.
5. Backup and Recovery
DBMS protects data from system failures.
Automatically restores database after crashes.
Supports disk backups for safety.
Ensures data is not lost during errors.
6. Multiple User Interfaces
DBMS provides different interfaces for different users.
Supports mobile apps, forms, menus, and query languages.
Provides GUI and web-based interfaces.
Easy for both technical and non-technical users.
7. Representing Complex Relationships
DBMS can store relationships between data easily.
Supports one-to-one, one-to-many, and many-to-many relationships.
Related data can be retrieved efficiently.
Useful for real-world applications like universities and banking.
8. Enforcing Integrity Constraints
DBMS ensures data accuracy using rules and constraints.
Prevents invalid data entry (e.g., wrong data type).
Enforces key, uniqueness, and referential constraints.
Maintains correctness of database data.
9. Inferencing and Actions using Rules and Triggers
DBMS can derive new information using rules.
Triggers automatically perform actions when data changes.
Stored procedures handle complex operations.
Active databases respond to events automatically.
10. Enforcing Standards
DBA can define common data standards.
Ensures uniform naming, formats, and terminology.
Improves communication between departments.
Easy to manage in centralized databases.
11. Reduced Application Development Time
DBMS reduces time needed to develop applications.
New reports and queries are created quickly.
Less coding compared to file systems.
Faster development and maintenance.
12. Flexibility
Database structure can be modified easily.
New fields or tables can be added when needed.
Existing programs usually continue to work.
Supports changing business requirements.
13. Availability of Up-to-Date Information
All users see updated data immediately.
Ensures real-time data availability.
Important for banking and reservation systems.
Supported by concurrency control.
14. Economies of Scale
Data and applications are centralized.
Reduces duplicate work and hardware costs.
Organization can invest in powerful systems.
Lowers overall operational cost.
1.7 A Brief History of Database Applications
This section explains how database systems evolved over time and why new types of DBMS were needed.
1.7.1 Early Database Applications
(Hierarchical and Network Systems)
Early databases were used in large organizations like banks, hospitals, universities, and companies.
They stored large numbers of similar records, such as student details, courses, and grades.
Data relationships were mixed with physical storage details, meaning how data was stored on disk
was tightly linked to programs.
This caused low data abstraction and poor flexibility.
Writing new queries was difficult because only programming languages were used (no query
languages).
These systems ran on expensive mainframe computers (1960s–1980s).
Main early models:
o Hierarchical model
o Network model
o Inverted file systems
👉 Main problem: Hard to change, hard to maintain, and expensive.
1.7.2 Relational Databases
(Better Data Abstraction and Flexibility)
Relational databases separated logical data from physical storage.
Data was stored in tables (rows and columns).
Introduced high-level query languages like SQL, so users didn’t need to write full programs.
Supported data abstraction and program–data independence.
Early relational systems were slow, but later improved using indexes and query optimization.
Became the most widely used DBMS on PCs, servers, and large systems.
👉 Main advantage: Easy to use, flexible, and powerful.
1.7.3 Object-Oriented Applications
(Need for Complex Databases)
Object-oriented programming languages created a need to store complex objects.
This led to Object-Oriented Databases (OODBMS).
Features included:
o Encapsulation
o Inheritance
o Object identity
OODBMS were complex and lacked early standards, so they were not widely adopted.
Used mainly in specialized fields like CAD, multimedia, and manufacturing.
Many object concepts were added to relational DBMSs, forming Object-Relational DBMS
(ORDBMS).
👉 Main idea: Support complex data with object concepts.
1.7.4 Web and E-Commerce
(Using XML for Data Exchange)
The World Wide Web allowed sharing information through web pages.
E-commerce websites needed dynamic data (prices, flight details, availability).
Data was extracted from DBMSs and shown on web pages.
XML (eXtensible Markup Language) was developed to exchange data between databases and web
systems.
XML combines document structure and database concepts.
👉 Main use: Sharing data between databases and web applications.
1.7.5 Extending Database Capabilities
(Support for New Applications)
New applications needed databases to handle non-traditional data.
Examples:
o Scientific data (genome, experiments)
o Images (medical scans, photos)
o Videos (movies, clips)
o Data mining (fraud detection)
o Spatial data (maps, GPS)
o Time-series data (sales, economy)
Relational DBMSs were not enough because:
o Needed complex data types
o Needed new operations and indexes
DBMSs were extended with:
o Object features
o Special modules (e.g., time-series modules)
👉 Result: Databases became more powerful and flexible.
1.7.6 Big Data and NOSQL Databases
Growth of social media, cloud storage, e-commerce, and search engines created huge volumes of
data.
Traditional relational DBMSs were not suitable for all big-data needs.
New systems were developed for:
o Fast data access
o Large-scale storage
o Non-structured data
NOSQL means “Not Only SQL”.
Some data uses SQL databases, while other data uses NOSQL databases.
👉 Main purpose: Handle massive, fast-growing, and unstructured data.
1.8 When Not to Use a DBMS
Although DBMSs have many advantages, they are not suitable for every situation.
In some cases, using a DBMS can add extra cost and complexity.
Why DBMS May Not Be Suitable
A DBMS has overhead costs, such as:
High initial cost for hardware, software, and user training
Extra processing due to general-purpose features
Additional overhead for:
o Security
o Concurrency control
o Recovery
o Data integrity
Because of this, simpler systems may be better in certain cases.
Situations Where DBMS Should Not Be Used
Very simple applications
If the database is small, simple, and will never change, a DBMS is unnecessary.
Real-time systems
Applications that need very fast response may be slowed down by DBMS overhead.
Embedded systems
Devices with limited memory and storage cannot support large DBMS software.
Single-user systems
If only one user accesses the data, DBMS features like concurrency control are not needed.
Examples of Applications Without DBMS
CAD (Computer-Aided Design) systems
Use custom file systems to handle drawings and 3D models efficiently.
Telecommunication systems
Use special data structures for fast call routing and switching.
GIS (Geographical Information Systems)
Often use their own data formats to process maps, locations, and spatial data efficiently.
2.1 Data Models, Schemas, and Instances
A database system hides unnecessary storage details and shows only what users
need. This is called data abstraction.
Different users can view the same data at different levels of detail using data models.
Data Model (Simple Meaning)
A data model is a set of rules and concepts used to describe data.
It defines:
o Data types
o Relationships
o Constraints
It also includes basic operations like insert, delete, update, and retrieve data.
Some data models also define behavior, such as allowed operations on data (e.g., COMPUTE_GPA
for a student).
👉 In short: A data model explains how data is structured and used.
2.1.1 Categories of Data Models
Data models are grouped based on level of detail.
1. Conceptual (High-Level) Data Models
Closest to how humans think about data.
Use simple concepts:
o Entity – real-world object (Student, Course)
o Attribute – property of entity (Name, Salary)
o Relationship – connection between entities (Works_on)
Used during database design.
Example: Entity–Relationship (ER) model
👉 Used by: Database designers and users.
2. Representational (Implementation) Data Models
Used in commercial DBMSs.
Easier for users, but also close to how data is stored.
Examples:
o Relational model (tables)
o Hierarchical model
o Network model
Data is stored using records, so they are also called record-based models.
👉 Most commonly used model: Relational model.
3. Object Data Models
Support complex data and objects.
Combine data and operations.
Used in object-oriented databases.
Also used as conceptual models in software engineering.
👉 Example: Object-Relational DBMS (ORDBMS).
4. Physical (Low-Level) Data Models
Describe how data is stored on disk.
Include:
o File structures
o Record order
o Access paths (indexing, hashing)
Meant for system experts, not end users.
👉 Focus: Performance and storage efficiency.
5. Self-Describing Data Models
Store data and its description together.
Schema is not separate.
Examples:
o XML
o NOSQL and key-value stores
👉 Used for: Big data and web applications.
2.1.2 Schemas, Instances, and Database State
Database Schema
A schema is the structure or blueprint of the database.
Defined during database design.
Does not change frequently.
Shows:
o Tables
o Fields
o Constraints
Represented using a schema diagram.
👉 Example: STUDENT(Name, USN, Class, Major)
Schema Diagram
A visual representation of the schema.
Shows record types and fields.
Does not show actual data values.
👉 Used for understanding structure, not data.
Database Instance / State
The actual data stored in the database at a given time.
Changes whenever:
o New data is added
o Data is deleted
o Data is updated
Also called:
o Database state
o Snapshot
👉 Schema = structure, Instance = data
Important Difference
Schema Instance
Database design Actual data
Rarely changes Changes frequently
Blueprint Current data
Meta-Data
Schema details are stored in the DBMS catalog.
This information is called meta-data (data about data).
DBMS uses it to:
o Validate data
o Enforce constraints
Schema Evolution
Sometimes schema needs changes due to new requirements.
Example:
o Adding Date_of_birth to STUDENT table.
This process is called schema evolution.
👉 Modern DBMSs support schema changes.
✅ Simple Summary (Exam Tip)
Data model defines how data is structured and used.
Schema is the structure of the database.
Instance is the actual data at a particular time.
Schema is stable, instance keeps changing.
2.2 Three-Schema Architecture and Data Independence
The three-schema architecture separates how users see data from how data is actually stored.
Its main goal is to reduce dependency between programs and data and to support multiple user views.
It helps in:
Data abstraction
Data independence
Easy database maintenance
The three levels are Internal, Conceptual, and External.
2.2.1 The Three-Schema Architecture
1. Internal Level (Physical Level)
Describes how data is stored physically on storage devices.
Includes file structure, indexes, access paths, and record placement.
Uses a physical data model.
Hidden from end users.
Focuses on storage efficiency and performance.
2. Conceptual Level (Logical Level)
Describes the entire database structure for all users.
Includes entities, attributes, relationships, and constraints.
Hides physical storage details.
Uses a representational data model (usually relational).
Acts as a bridge between physical storage and user views.
3. External Level (View Level)
Shows only the required data for a particular user or group.
Each user can have a different view of the same database.
Hides unnecessary data for security and simplicity.
Helps users access data in a customized way.
Uses the same data model as the conceptual level.
Mapping Between Levels
DBMS converts user requests from:
o External → Conceptual → Internal level
This conversion process is called mapping.
Results are converted back to match the user’s view.
Actual data is stored only at the internal level.
2.2.2 Data Independence
Data independence means changing one level of schema without affecting higher levels.
It makes databases easier to maintain and update.
There are two types of data independence:
Logical data independence
Physical data independence
1. Logical Data Independence
Ability to change the conceptual schema without changing user views.
Examples:
o Adding a new table
o Removing an attribute
o Changing constraints
External schemas and application programs continue to work.
Only mappings are updated, not programs.
Harder to achieve than physical data independence.
2. Physical Data Independence
Ability to change the internal schema without changing the conceptual schema.
Examples:
o Adding indexes
o Changing file organization
o Reorganizing storage
Does not affect user views or applications.
Improves performance without changing queries.
Commonly supported in most DBMSs.
2.3 Database Languages and Interfaces
A DBMS supports different types of users, so it provides different languages and interfaces.
These help users define the database, store data, retrieve data, and update data easily.
Some languages are used by DBAs and designers, while others are used by end users.
2.3.1 DBMS Languages
Data Definition Language (DDL)
Used to define the database structure.
Specifies tables, attributes, constraints, and schemas.
Used mainly by DBA and database designers.
Examples: CREATE, ALTER, DROP in SQL.
Storage Definition Language (SDL)
Used to describe physical storage details.
Controls file organization, indexes, and access paths.
Used mainly by the DBA.
In modern DBMSs, SDL is handled internally, not as a separate language.
View Definition Language (VDL)
Used to define user views of the database.
Shows only required data to specific users.
Helps in security and simplicity.
In relational DBMSs, SQL is used for this purpose.
Data Manipulation Language (DML)
Used to retrieve and modify data in the database.
Operations include insert, delete, update, and select.
Used by users and programmers.
Example: SELECT, INSERT, UPDATE, DELETE.
High-Level vs Low-Level DML
High-level DML works on many records at once (set-oriented).
Example: SQL queries.
Low-level DML works on one record at a time.
Usually embedded in programming languages.
2.3.2 DBMS Interfaces
Menu-Based Interfaces
Users select options from menus.
No need to remember commands.
Common in websites and applications.
Easy for beginners.
Mobile Apps Interfaces
Provide database access through mobile phones.
Used in banking, shopping, and reservation systems.
Secure login and limited options are provided.
Very user-friendly.
Forms-Based Interfaces
Users fill forms to enter or retrieve data.
Common for clerks and data entry operators.
Used for repetitive tasks.
Reduces input errors.
Graphical User Interfaces (GUI)
Displays data using diagrams, buttons, and menus.
Users interact visually instead of writing queries.
Often combines forms and menus.
Easy to understand.
Natural Language Interfaces
Users type queries in English or local language.
System tries to understand the request.
Converts it into database queries.
Used in limited applications.
Keyword-Based Search Interfaces
Similar to search engines like Google.
Users enter keywords to search data.
Results are ranked by relevance.
Used in modern database research systems.
Speech Input and Output Interfaces
Users speak queries instead of typing.
System gives spoken responses.
Used in phone services and help systems.
Works with limited vocabulary.
Interfaces for Parametric Users
Designed for users who perform repetitive tasks.
Uses predefined buttons or commands.
Example: bank teller operations.
Fast and efficient.
Interfaces for DBA
Used only by Database Administrators.
Allows managing users, security, and storage.
Used for schema changes and system control.
Requires special privileges.
2.4 Database System Environment (Simple Meaning)
A DBMS environment includes:
Users of the database
DBMS software modules
Operating system and network
Utilities and tools that help manage the database
It shows how a DBMS works internally and externally.
2.4.1 DBMS Component Modules
A DBMS has two main parts:
1. User side (Top part)
2. Internal DBMS modules (Bottom part)
[Link] and Interfaces (Top Part)
1. DBA (Database Administrator)
Defines and manages the database.
Uses DDL and privileged commands.
Controls security, tuning, and maintenance.
2. Casual Users
Use interactive queries.
Do not write programs.
Example: searching student details.
3. Application Programmers
Write programs using Java, C, Python, PHP, etc.
Embed SQL queries inside programs.
4. Parametric Users
Execute predefined transactions.
Only supply input values.
Example: bank clerk entering account number and amount.
B. DBMS Processing Modules (Bottom Part)
DDL Compiler
Processes schema definitions.
Stores metadata in system catalog (data dictionary).
System Catalog / Data Dictionary
Stores information about:
o Tables, fields, data types
o File sizes and storage details
o Constraints and mappings
Used by all DBMS modules.
Query Compiler
Checks query syntax and correctness.
Converts SQL into internal form.
Query Optimizer
Chooses the fastest execution plan.
Reduces disk access.
Uses statistics from the catalog.
DML Compiler
Compiles INSERT, UPDATE, DELETE, SELECT commands.
Generates database access code.
Precompiler
Extracts SQL from application programs.
Sends SQL to DML compiler.
Remaining code goes to host language compiler.
Runtime Database Processor
Executes:
o Queries
o Transactions
o DBA commands
Manages buffers and memory.
Works with storage manager and catalog.
Stored Data Manager
Controls reading/writing data to disk.
Uses operating system services.
Improves performance by managing data storage.
Concurrency Control & Recovery
Ensures safe multi-user access.
Prevents data inconsistency.
Handles backup and crash recovery.
Client–Server Environment
Client: User machine (PC / Mobile)
Server: Database machine
Sometimes an application server sits between client and DB server.
2.4.2 Database System Utilities
Utilities help the DBA manage the database.
Important Utilities (Easy to Remember)
1. Loading Utility
Loads data from files into database.
Converts data format automatically.
2. Backup Utility
Creates copy of database.
Used during system failure.
Incremental backup saves only changes.
3. Storage Reorganization
Reorganizes files and indexes.
Improves performance.
4. Performance Monitoring
Collects usage statistics.
Helps DBA tune the system.
2.4.3 Tools, Application Environments & Communication
CASE Tools
Used during database design.
Help in modeling and documentation.
Data Dictionary / Repository
Stores extra information like:
o Design decisions
o Standards
o User details
Used by DBA and users.
Application Development Tools
Help build database applications.
Examples: PowerBuilder, JBuilder.
Provide GUI, querying, and coding support.
Communication Facilities
Allow remote users to access database.
Use networks like:
o Internet
o LAN
o WAN
o Satellite
DBMS + communication system = DB/DC system.
✅ Very Short Memory Summary (Exam Gold)
Top part → Users & Interfaces
Bottom part → DBMS internal modules
Catalog = DBMS brain
Optimizer = performance booster
Utilities = DBA helpers
Networks = remote access
2.5 Centralized and Client/Server Architectures for DBMSs
1. Centralized DBMS Architecture
All database work is done on one central computer.
Users use terminals/PCs only for display and input.
The central system runs:
o DBMS
o Application programs
o User interface
Easy to control but slow and not scalable.
Mostly used in old mainframe systems.
2. Client/Server Architecture (Basic Idea)
Work is divided between client and server.
Client: User interface + local processing
Server: DBMS + data storage
Clients request services; servers provide them.
Supports many users and improves performance.
3. Two-Tier Client/Server Architecture
Two layers:
o Client → UI + application programs
o Server → DBMS + database (SQL server)
Client sends SQL queries to the server.
Uses ODBC / JDBC to connect to DBMS.
Simple and efficient for small applications.
4. Three-Tier Architecture
Three layers:
1. Client (Web browser / UI)
2. Application server (Business logic & security)
3. Database server (DBMS + data)
More secure, scalable, and flexible.
Commonly used in web applications.
5. n-Tier Architecture
Extension of three-tier architecture.
Business logic split into multiple layers.
Used in large systems like ERP and CRM.
Easier to maintain and upgrade.
⭐ Easy Memory Tip
Centralized → One system
Two-tier → Client + DB
Three-tier → Client + Logic + DB
n-tier → Many logic layers