CHAPTER TWO
Database System Concepts and
Architecture
Introduction to DBMS
Architecture
• Historically, database management systems (DBMS) were built as monolithic
systems, where the entire software package was integrated into a single unit. This
design limited flexibility and scalability.
• Modern DBMS architectures have transitioned to a modular approach, adopting a
client/server architecture.
• This structure allows for better distribution of responsibilities and resources.
• Client Module: Typically runs on user devices (mobile, workstation, PC) and
handles user interfaces and application programs that access the database.
Provides user-friendly interfaces such as mobile apps and menu-based GUIs.
• Server Module: Manages data storage, access, and processing functions.
This architecture supports scalability, making it suitable for distributed systems managing
large amounts of data.
Introduction to DBMS
Architecture…
Distributed Systems:
• The recent surge in data volume has led to the development of distributed
database systems, consisting of numerous interconnected computers.
• This structure mirrors the shift in computing from centralized mainframe
systems to networks of distributed workstations and servers (e.g., web
servers, application servers).
Data Models, Schemas, and
Instances
• A critical feature of the database approach is data abstraction, which allows
users to focus on high-level data representations without needing to
understand the complexities of data organization and storage.
• This abstraction enables users to interact with data at their preferred level
of detail.
• One of the main characteristics of the database approach is to support
data abstraction so that different users can perceive data at their preferred
level of detail.
• A data model—a collection of concepts that can be used to describe the
structure of a database—provides the necessary means to achieve this
abstraction.
Data Models
• A data model is a structured collection of concepts that define how
data is organized, stored, and manipulated within a database.
• Types of Data Models:
High-Level (Conceptual) Data Models: These models provide a user-friendly
view of data, reflecting how users generally perceive and interact with data.
They include concepts like entities, attributes, and relationships.
Entities: Represent real-world objects or concepts, such as a student or a course.
Attributes: Describe properties of entities, such as a student's name or age.
Relationships: Define how entities are related to one another, such as the enrollment of
a student in a course.
Data Models…
Low-Level (Physical) Data Models: These models focus on the
details of how data is stored in a computer system, addressing aspects
such as file structures and storage formats. They are generally geared
toward computer specialists rather than end-users.
Representational (Implementation) Data Models: These models
strike a balance between high-level and low-level models, providing
concepts that are understandable to end-users while still being
implementable on computer systems. The relational data model is a
common example.
Schemas, Instances, and
Database State
• Database Schema: The schema serves as the blueprint of the
database, defining its structure and constraints. It is specified during
the database design phase and is generally stable over time.
Schema Diagram: A graphical representation of the schema, illustrating
entities, attributes, and relationships without displaying the actual data.
Schemas, Instances, and
Database State…
• Instances: The actual data stored in the database at any given
moment is referred to as an instance. Each instance reflects the
current state of the database.
• Database State: This term describes the content of the database at a
specific point in time, characterized by the collection of instances.
Three-Schema Architecture and Data
Independence
• The three-schema architecture is designed to separate user
applications from the physical database, enhancing data abstraction
and independence.
• This architecture is essential for achieving three of the key
characteristics of the database approach:
I. Self-describing Nature: The use of a catalog to store the database
description (schema).
II. Isolation of Programs and Data: This includes both program-data
independence and program-operation independence.
III. Support for Multiple User Views: Different users can have tailored views of
the same database.
The Three-Schema Architecture
• The primary goal of the three-schema architecture is to provide a
framework that allows for a clear distinction between how data is
stored, how it is conceptualized, and how it is presented to users.
• The architecture consists of three distinct levels:
Internal Level:
o The internal schema describes the physical storage structure of the database.
o It includes specifics about data storage formats, data access paths (e.g., indexing), and
physical file organization.
o This level uses a physical data model to specify how data is stored on storage media (e.g.,
magnetic disks).
The Three-Schema
Architecture…
Conceptual Level:
o The conceptual schema provides a unified view of the entire database for a community
of users.
o It abstracts the complexities of the physical storage, focusing instead on:
1. Entities: Real-world objects or concepts (e.g., students, courses).
2. Attributes: Properties of these entities (e.g., a student’s name, age).
3. Relationships: Connections between entities (e.g., enrollment in a course).
o Typically, a representational data model is employed to implement the conceptual
schema, translating user needs into a structured format.
The Three-Schema
Architecture…
Data Independence
• Data independence is the ability to change the schema at one level of
the database system without necessitating changes at the next higher
level. This concept is crucial for maintaining flexibility in database
management.
Types of Data Independence
Logical Data Independence
Physical Data Independence
Logical Data Independence
• This refers to the capacity to modify the conceptual schema without
altering the external schemas or application programs that rely on it.
Examples of Changes:
Adding new record types or data items to the conceptual schema.
Modifying constraints or relationships without affecting user views.
• For instance, if the file structure changes, external schemas
that reference it should remain unaffected, only requiring updates to
the view definitions and mappings.
Physical Data Independence:
• This is the ability to change the internal schema without requiring
changes to the conceptual schema.
Examples of Changes:
Reorganizing physical files, such as implementing new indexing strategies to
enhance performance.
• Users should not notice any difference in how they interact with the
database, even if the underlying storage mechanisms change.
Importance of Data
Independence
• Application Stability: Data independence ensures that applications
remain stable and functional even when changes are made to the
database structure.
• Catalog Management: When schema changes occur, the database
catalog must be updated to reflect new mappings between the
various levels of the architecture, ensuring that application programs
continue to function correctly.
DBMS Languages
• Once a database is designed and a DBMS is selected, the initial step
involves defining conceptual and internal schemas, along with the
necessary mappings between these schemas. The primary languages
involved include:
1. Data Definition Language (DDL)
2. Storage Definition Language (SDL)
3. View Definition Language (VDL)
4. Data Manipulation Language (DML)
Data Definition Language (DDL)
• Used by Database Administrators (DBAs) and database designers to
define schemas.
• The DDL compiler processes DDL statements to identify schema
constructs and stores schema descriptions in the DBMS catalog.
• In systems with clear separation between conceptual and internal
levels, DDL specifies the conceptual schema. In some cases, it may
also encompass the external schemas.
Storage Definition Language
(SDL)
• Utilized to specify the internal schema, detailing how data is physically
stored.
• While SDL is not commonly found in modern relational DBMSs, it was
essential for defining physical storage structures.
View Definition Language (VDL)
• Designed to specify user views and their mappings to the conceptual
schema.
• In many modern DBMSs, DDL serves dual purposes, defining both
conceptual and external schemas.
Data Manipulation Language
(DML)
• This language allows users to manipulate the database by performing
operations such as retrieval, insertion, deletion, and modification of
data.
DBMS Interfaces
• User-friendly interfaces provided by a DBMS are essential for
facilitating interaction between users and the database.
• Different types of interfaces cater to various user needs, enhancing
accessibility and usability.
Types of Interfaces
• Menu-Based Interfaces:
o Present users with a list of options (menus) that guide them in formulating
requests.
o Users can navigate through menus, eliminating the need to memorize
commands or syntax.
o Pull-down menus are particularly popular in web-based interfaces and
browsing tools.
• Mobile Device Applications:
o Allow mobile users to access their data through dedicated applications.
o Apps typically require user login and provide a limited set of options for
transactions, such as bill payments or reservations.
Types of Interfaces…
• Forms-Based Interfaces:
o Display forms that users can fill out to insert new data or retrieve existing
data.
o Designed for naive users, these interfaces often employ forms specification
languages (e.g., SQL*Forms, Oracle Forms) to facilitate interaction.
• Graphical User Interfaces (GUIs):
o Use diagrams to represent the database schema, allowing users to specify
queries by manipulating visual elements.
o GUIs often integrate menus and forms to enhance user interaction.
Types of Interfaces…
• Natural Language Interfaces:
o Accept user requests in natural language and attempt to interpret them.
o If successful, the interface generates high-level queries that are submitted to the
DBMS. Otherwise, it may engage in dialogue to clarify requests.
• Keyword-Based Database Search:
o Similar to web search engines, these interfaces accept keyword queries and match
them with relevant database entries.
o They utilize predefined indexes and ranking functions to retrieve results based on
keyword matches.
• Interfaces for Database Administrators (DBAs):
o Include privileged commands available only to DBAs for managing the database.
o Allow for tasks such as account creation, schema changes, and database tuning.
Types of Interfaces…
• Speech Input and Output:
o Enable users to interact through spoken queries and receive spoken
responses.
o Commonly used in applications with limited vocabularies, such as directory
inquiries and account information.
• Interfaces for Parametric Users:
o Designed for users who regularly perform specific operations (e.g., bank
tellers).
o Often feature single-function keys to streamline routine tasks and minimize
keystrokes.
The Database System
Environment
• A DBMS is a complex software system comprised of various components that work together
to manage data and facilitate user interaction.
• The typical DBMS can be divided into two main parts:
User Interfaces: Interfaces for different user categories (DBA, casual users, application programmers).
Internal Modules: Responsible for data storage and transaction processing.
• Key Components Include:
Query Compiler: Validates and compiles user queries into an internal form for processing.
Runtime Database Processor: Executes queries, transactions, and privileged commands while managing
data transfer and storage.
Stored Data Manager: Handles low-level I/O operations, working with the operating system.
• The DBMS interacts with the operating system for disk access and utilizes its own buffer
management for efficiency.
• The runtime processor engages with the system catalog to optimize operations and maintain
data integrity.
Database System Utilities
• In addition to core components, DBMSs often include utilities that
assist in database management:
o Loading Utility: Facilitates the import of data files into the database,
converting formats as necessary.
o Backup Utility: Creates backup copies of the database for recovery purposes,
supporting both full and incremental backups.
o Storage Reorganization Utility: Optimizes file organization and access paths
to enhance performance.
o Performance Monitoring Utility: Tracks database usage and provides
statistics for decision-making regarding file organization and indexing.
Tools and Application
Environments
• CASE Tools: Used during the design phase for database systems,
facilitating effective planning and development.
• Data Dictionary: Stores extensive metadata about the database and
its design decisions, accessible by users and DBAs.
• Application Development Environments: Provide integrated tools for
building database applications, including GUI design and querying
capabilities.
• Communications Software: Allows remote access to the database
through various networks, supporting client-server interactions.
Classification of Database
Management Systems
• DBMSs can be classified using several criteria:
o Data Model: Includes relational, object, object-relational, NoSQL, key-value,
hierarchical, and network models.
o Number of Users: Single-user and multiuser systems, with multiuser systems being
more common.
o Distribution: Centralized versus distributed DBMSs, including homogeneous and
heterogeneous systems.
o Cost: Ranges from open-source solutions to expensive enterprise-level systems.
o Access Path Options: Such as inverted file structures and other specialized designs.
o General Purpose vs. Special Purpose: General-purpose DBMSs can handle a variety
of applications, while special-purpose DBMSs are tailored for specific tasks.
END