ROCKVIEW UNIVERSITY
DEPARTMENT OF COMPUTER
SCIENCE DATABASE
TECHNOLOGIES: UNIT 1
2025
ROCKVIEW UNIVERSTY
By Mr Kalaluka Nyambe
WHAT WE ARE GOING TO COVER IN UNIT I
Database System Applications,
Purpose of Database Systems,
View of Data
– Data Abstraction
– Instances and Schemas
Data Models
– the ER Model
– Relational Model
– Other Models
Database Languages
- DDL (Data Definition Language) – Defines the structure of the database.
- DML (Data Manipulation Language) – Manipulates data in the database.
- DCL (Data Control Language) – Manages access and permissions.
- TCL (Transaction Control Language) – Handles transactions in the databaseDatabase
Access for applications Programs
– database Users and Administrator
– Transaction Management
– data base Architecture
– Storage Manager
– the Query Processor Data base design and ER diagrams
ER Model
- Entities,
- Attributes and Entity sets
– Relationships and Relationship sets
– ER Design Issues – Concept Design
– Conceptual Design for University Enterprise.
Introduction to the Relational Model
– Structure
– Database Schema, Keys
– Schema Diagrams
UNIT-1
Introduction to Database Management System As the name suggests, the database
management system consists of two parts. They are:
1. Database and
2. Management System
What is a Database?
To find out what database is, we have to start from data, which is the basic building block of any
DBMS. Data: Facts, figures, statistics etc. Having no particular meaning (e.g. 1, Nyambe, 19 etc).
Record: A record is a Collection of related data items, e.g. in the above example the three data
items had no meaning. But if we organize them in the following way, then they collectively
represent meaningful information.
Roll Name age
1 Nyambe 16
Table or Relation: Collection of related records.
Roll Name age
1 Nyambe 16
2 Kalaluka 40
3 Rose 2
The columns of this relation are called Fields, Attributes or Domains. The rows are called Tuples
or Records.
To answer the question what is the database?
A database is a structured collection of data that is stored and managed electronically. It allows
users to store, retrieve, update, and manage data efficiently.
I. Database columns are the vertical sections in a database table that store specific types of
data. Each column has a name and a defined data type, such as text, integer, or date. Columns
organize data within a table, ensuring that each row (record) contains the appropriate values in
the right format.
In the context of databases, a tuple refers to a single row (or record) in a table. It represents a
unique instance of data that follows the structure defined by the table’s columns.
Roll Name age
1 Nyambe 16
Key Characteristics of Tuples
• Each tuple represents a record with values corresponding to the table’s columns.
• Tuples maintain order within the table but are typically accessed based on keys.
• Tuples must follow the schema (data types and constraints) defined by the table.
Database: Collection of related relations. Consider the following collection of tables:
We now have a collection of 4 tables. They can be called a “related collection” because we can
clearly find out that there are some common attributes existing in a selected pair of tables.
Because of these common attributes we may combine the data of two or more tables together
to find out the complete details of a student. Questions like “Which hostel does the youngest
student live in?” can be answered now, although Age and Hostel attributes are in different
tables
A database in a DBMS could be viewed by lots of different people with different responsibilities
For example, within a company there are different departments, as well as customers, who each
need to see different kinds of data. Each employee in the company will have different levels of
access to the database with their own customized front-end application. In a database, data is
organized strictly in row and column format. The rows are called Tuple or Record. The data
items within one row may belong to different data types. On the other hand, the columns are
often called Domain or Attribute. All the data items within a single attribute are of the same
data type.
Database Management System (DBMS)
A Database Management System (DBMS) is software that allows users to create, manage, and
manipulate databases efficiently. It provides tools for storing, retrieving, updating, and
managing data in a structured way.
Functions of a DBMS
1. Data Storage & Retrieval – Stores large amounts of data and retrieves it efficiently.
2. Data Security – Manages access control and authentication to protect sensitive data.
3. Data Integrity – Ensures consistency, accuracy, and reliability of data.
4. Concurrency Control – Allows multiple users to access the database simultaneously
without conflicts.
5. Backup & Recovery – Provides mechanisms for data recovery in case of failure.
Types of DBMS
1. Relational DBMS (RDBMS) – Uses tables to organize data (e.g., MySQL, PostgreSQL, SQL
Server).
2. NoSQL DBMS – Handles unstructured or semi-structured data (e.g., MongoDB,
Cassandra).
3. Hierarchical DBMS – Data is stored in a tree-like structure (e.g., IBM IMS).
4. Network DBMS – Uses a graph structure for complex relationships (e.g., IDMS).
Examples of DBMS
• MySQL – Popular open-source RDBMS.
• PostgreSQL – Advanced, open-source RDBMS.
• MongoDB – NoSQL database for flexible data storage.
• Oracle DB – Enterprise-level RDBMS.
• Microsoft SQL Server – RDBMS used in enterprise applications.
Purpose of Database Systems
A Database System is designed to efficiently manage and organize data, ensuring accuracy,
security, and accessibility. It eliminates the limitations of traditional file-based systems and
provides a structured way to store, retrieve, and manipulate data.
Key Purposes of Database Systems
1. Efficient Data Management
o Stores large volumes of structured data in an organized manner.
o Provides quick access and retrieval of information.
2. Data Integrity and Accuracy
o Ensures data consistency through constraints (e.g., Primary Keys, Foreign Keys).
o Prevents data duplication and inconsistencies.
3. Data Security and Access Control
o Restricts unauthorized access using authentication and user permissions.
o Encrypts sensitive data to prevent breaches.
4. Data Sharing and Multi-User Access
o Supports concurrent access by multiple users without conflicts.
o Uses transaction management to prevent data corruption.
5. Data Backup and Recovery
o Provides automatic backup features to prevent data loss.
o Allows recovery in case of system failures or crashes.
6. Improved Performance and Scalability
o Uses indexing and optimization techniques for fast queries.
o Can handle increasing amounts of data efficiently.
7. Data Independence
o Separates data storage from application logic, making modifications easier.
o Changes in the database schema do not affect applications.
8. Business Decision Support
o Enables reporting and analytics for data-driven decision-making.
o Supports business intelligence tools for insights.
Advantages of a Database Management System (DBMS)
A DBMS provides a structured and efficient way to manage data, offering several advantages
over traditional file-based storage systems.
1. Data Redundancy Control
• Avoids duplication of data by storing it in a centralized database.
• Improves storage efficiency and reduces inconsistencies.
2. Data Consistency and Integrity
• Ensures that data remains accurate and valid through constraints (e.g., Primary Keys,
Foreign Keys).
• Prevents conflicting data updates.
3. Data Security
• Implements authentication (user login) and access control (permissions).
• Protects sensitive information through encryption and backup policies.
4. Multi-User Access and Concurrency Control
• Supports multiple users accessing the database simultaneously.
• Uses transaction management techniques like locking mechanisms to prevent conflicts.
5. Data Backup and Recovery
• Automatically backs up data to prevent loss due to crashes, failures, or cyber-attacks.
• Provides recovery options to restore lost data.
6. Efficient Data Retrieval and Query Processing
• Uses indexes and optimized query execution for fast data retrieval.
• Supports SQL queries for flexible and powerful data access.
7. Scalability and Performance Optimization
• Can handle growing amounts of data efficiently.
• Allows for horizontal (sharding) and vertical (replication) scaling.
8. Improved Decision-Making with Data Analysis
• Supports reporting and analytics for business insights.
• Integrates with BI tools for forecasting and trend analysis.
9. Data Independence
• Separates data storage from application logic, making it easier to modify without
affecting the entire system.
• Uses Schemas and Data Abstraction for flexibility.
10. Better Collaboration and Data Sharing
• Multiple departments or teams can share and update data in real time.
• Ensures that everyone accesses the latest version of the data.
View of Data
A database system is a collection of interrelated data and a set of programs that allow users to
access and modify these data. A major purpose of a database system is to provide users with an
abstract view of the data. That is, the system hides certain details of how the data are stored
and maintained.
Data Abstraction in DBMS
Data Abstraction in a Database Management System (DBMS) refers to the process of hiding
complex details of data storage and representation from users while providing a simple and
efficient way to interact with the database. It ensures that users interact only with the required
level of data without worrying about low-level details like data structures, indexing, or storage.
Levels of Data Abstraction
1. Physical Level (Lowest Level) – "How data is stored?"
• Defines how data is physically stored in memory (e.g., files, indexes, data structures).
• Concerned with hardware storage mechanisms like hard drives, SSDs, and RAID.
• Example: Data stored in binary format, B-Trees, Hash Indexing, or Tablespaces.
2. Logical Level (Middle Level) – "What data is stored?"
• Defines how data is structured and organized in the database.
• Uses schemas, tables, relationships, and constraints to represent data.
• Example: A relational database table with attributes (id, name, age, email,
enrollment_date).
3. View Level (Highest Level) – "How users see the data?"
• The simplified user interface that presents only relevant data to different users.
• Hides complex details and presents virtual tables (views).
• Example: A student portal only showing student name, email, and enrolled courses,
while the admin has full access.
Example of Data Abstraction in a Student Database
Level Example Representation
Physical Level Data stored in binary format using indexes for faster retrieval.
Logical Level A students table with columns (id, name, age, email, enrollment_date).
View Level A student only sees (name, email, enrolled courses) on their portal.
Why is Data Abstraction Important?
Enhances Security – Users only access relevant data (e.g., students can't see financial
records).
Simplifies Interaction – Hides complex storage details from end users.
Increases Flexibility – Changing storage structures does not affect applications.
Improves Performance – DBMS optimizes queries without exposing technical details.
Instances and Schemas in DBMS
In a Database Management System (DBMS), instances and schemas define the structure and
state of a database.
1. Schema (Database Structure) – "What is the blueprint?"
• A schema is the overall design or structure of the database.
• It defines tables, columns, data types, relationships, constraints (e.g., Primary Keys,
Foreign Keys).
• The schema does not change frequently; it remains stable unless modifications are
required.
Example of a Database Schema
A university database schema may include:
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(100),
age INT,
email VARCHAR(255) UNIQUE,
enrollment_date DATE
);
CREATE TABLE courses (
course_id INT PRIMARY KEY,
course_name VARCHAR(100),
credits INT
);
Think of a schema like a blueprint for building a house. It defines the structure of the
database.
2. Instance (Database State) – "What is stored right now?"
• A database instance is the actual data stored at a given moment.
• It changes frequently as users insert, update, delete, and modify records.
• The schema remains the same, but the instance changes as the database is updated.
Example of a Database Instance
A snapshot of the students table at a particular moment:
id name age Email enrollment_date
1 John Doe 20 john@[Link] 2024-01-15
2 Jane Smith 22 jane@[Link] 2023-09-10
Think of an instance as the actual rooms and furniture inside a house. The house (schema)
remains the same, but the furniture (data) can change.
Key Differences Between Schema and Instance
Feature Schema Instance
Definition Blueprint of the database Current state of stored data
Changes Rarely changes Changes frequently
Examples Tables, columns, constraints Data inside the tables
Analogy House blueprint Furniture inside the house
• Schema = The database design (structure).
• Instance = The current snapshot of the database (data at a moment in time).
Data Models in DBMS
A Data Model defines how data is structured, stored, and manipulated within a database. It
provides a framework for organizing data and relationships between different data entities.
Types of Data Models
1. Hierarchical Data Model
• Organizes data in a tree-like structure with parent-child relationships.
• Each parent node can have multiple child nodes, but a child has only one parent.
• Uses pointers or links to navigate between records.
Example:
A university storing student data:
University
├── Faculty of Science
│ ├── Department of Computer Science
│ │ ├── Course: Database Management
│ │ ├── Course: Software Engineering
Used in: IBM Information Management System (IMS).
2. Network Data Model
• Similar to the hierarchical model, but supports multiple parent-child relationships
(many-to-many).
• Uses graph-like structures (nodes and edges).
Example:
A student can enroll in multiple courses, and a course can have multiple students.
Student ──> Course
Course ──> Instructor
Instructor ──> Department
Used in: Integrated Data Store (IDS).
3. Relational Data Model (RDBMS) 🗄
• Stores data in tables (relations) with rows (tuples) and columns (attributes).
• Uses Primary Keys and Foreign Keys for relationships.
• Data can be retrieved using SQL (Structured Query Language).
Example:
Students Table
student_id name age Email
1 John 20 john@[Link]
2 Jane 22 jane@[Link]
Courses Table
course_id course_name
101 Database Management
102 Computer Networks
Used in: MySQL, PostgreSQL, SQL Server, Oracle.
4. Object-Oriented Data Model (OODBMS) 🖥
• Stores data as objects, similar to Object-Oriented Programming (OOP).
• Objects contain attributes (data) and methods (functions).
• Supports inheritance, polymorphism, and encapsulation.
Example (in Python-like notation):
class Student:
def __init__(self, student_id, name, age, email):
self.student_id = student_id
[Link] = name
[Link] = age
[Link] = email
def get_details(self):
return f"{[Link]} ({[Link]}) - {[Link]}"
Used in: MongoDB (with JSON-like documents), Db4o.
5. Document-Oriented Data Model (NoSQL)
• Stores data in documents (JSON, BSON, XML) instead of tables.
• Flexible, schema-less structure (NoSQL databases).
• Best for big data and unstructured data.
Example (MongoDB JSON Document):
"student_id": 1,
"name": "John Doe",
"age": 20,
"courses": ["Database Management", "Computer Networks"]
Used in: MongoDB, CouchDB.
6. Key-Value Data Model
• Data is stored as key-value pairs, similar to dictionaries or hash tables.
• Extremely fast and scalable for caching and session storage.
Example:
"user_1": {"name": "John", "age": 20, "email": "john@[Link]"}
"user_2": {"name": "Jane", "age": 22, "email": "jane@[Link]"}
Used in: Redis, Amazon DynamoDB.
Comparison of Data Models
Model Structure Best Use Case Examples
Hierarchical Tree-like File systems, banking IBM IMS
Network Graph Complex relationships IDS
Relational Tables Structured data with relations MySQL, PostgreSQL
Object-Oriented Objects OOP applications MongoDB (partially)
Document-Oriented JSON/XML Big data, flexible schema MongoDB, CouchDB
Key-Value Hash table Caching, real-time processing Redis, DynamoDB
• Relational models (RDBMS) are the most common for structured data.
• NoSQL models (Document, Key-Value) are better for scalability and big data.
• Hierarchical and Network models are mostly used in legacy systems.
• Object-Oriented models work well in applications that follow OOP principles.
Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate
data as organized by the appropriate data model. The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
are needed without specifying how to get those data. Declarative DMLs are usually easier to
learn and use than are procedural DMLs. However, since a user does not have to specify how to
get the data, the database system has to figure out an efficient means of accessing 13 data. A
query is a statement requesting the retrieval of information. The portion of a DML that involves
information retrieval is called a query language. Although technically incorrect, it is common
practice to use the terms query language and data-manipulation language synonymously
Data-Definition Language (DDL)
We specify a database schema by a set of definitions expressed by a special language called a
data-definition language (DDL). The DDL is also used to specify additional properties of the data.
We specify the storage structure and access methods used by the database system by a set of
statements in a special type of DDL called a data storage and definition language. These
statements define the implementation details of the database schemas, which are usually
hidden from the users. The data values stored in the database must satisfy certain consistency
constraints. For example, suppose the university requires that the account balance of a
department must never be negative. The DDL provides facilities to specify such constraints. The
database system checks these constraints every time the database is updated. In general, a
constraint can be an arbitrary predicate pertaining to the database. However, arbitrary
predicates may be costly to test. Thus, database systems implement integrity constraints that
can be tested with minimal overhead.
• Domain Constraints.
A domain of possible values must be associated with every attribute (for example, integer types,
character types, date/time types). Declaring an attribute to be of a particular domain acts as a
constraint on the values that it can take. Domain constraints are the most elementary form of
integrity constraint. They are tested easily by the system whenever a new data item is entered
into the database.
• Referential Integrity. There are cases where we wish to ensure that a value that appears in one
relation for a given set of attributes also appears in a certain set of attributes in another relation
(referential integrity). For example, the department listed for each course must be one that
actually exists. More precisely, the dept name value in a course record must appear in the dept
name attribute of some record of the department relation. Database modifications can cause
violations of referential integrity. When a referential-integrity constraint is violated, the normal
procedure is to reject the action that caused the violation.
• Assertions. An assertion is any condition that the database must always satisfy. Domain
constraints and referential-integrity constraints are special forms of assertions. However, there
are many constraints that we cannot express by using only these special forms. For example,
“Every department must have at least five courses offered every semester” must be expressed
as an assertion. When an assertion is created, the system tests it for validity. If the assertion is
valid, then any future modification to the database is allowed only if it does not cause that
assertion to be violated.
• Authorization. We may want to differentiate among the users as far as the type of access they
are permitted on various data values in the database. These differentiations are expressed in
terms of authorization, the most common being: read authorization, which allows reading, but
not modification, of data; insert authorization, which allows insertion of new data, but not
modification of existing data; update authorization, which allows modification, but not deletion,
of data; and delete authorization, which allows deletion of data. We may assign the user all,
none, or a combination of these types of authorization
The DDL, just like any other programming language, gets as input some instructions
(statements) and generates some output. The output of the DDL is placed in the data
dictionary,which contains metadata—that is, data about data. The data dictionary is considered
to be a special type of table that can only be accessed and updated by the database system
itself (not a regular user). The database system consults the data dictionary before reading or
modifying actual data
Data Dictionary in DBMS
A Data Dictionary is a repository that contains metadata (data about data) for a database. It
provides detailed information about database objects, including tables, columns, data types,
constraints, indexes, relationships, and more.
Types of Data Dictionary
1. Active Data Dictionary
• Automatically updated by the DBMS whenever changes occur.
• Ensures that metadata is always accurate and consistent.
• Common in Relational Database Management Systems (RDBMS) like MySQL,
PostgreSQL, and SQL Server.
Example in SQL Server:
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'students';
• Retrieves details of the students table, including column names, data types, and
constraints.
2. Passive Data Dictionary
• Manually maintained by database administrators (DBAs).
• Stored as external documentation (Excel sheets, Word docs, etc.).
• Needs to be updated manually when database changes occur.
Example in Documentation:
Table Name Column Name Data Type Constraint Description
students student_id INT PRIMARY KEY Unique student identifier
students Name VARCHAR(100) NOT NULL Student's full name
students Email VARCHAR(255) UNIQUE Student's email address
Contents of a Data Dictionary
A Data Dictionary typically includes:
Table names – List of all tables in the database.
Column names – Names of attributes (fields) in each table.
Data types – Specifies types (e.g., INT, VARCHAR, DATE).
Constraints – Includes PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, etc.
Indexes – Details about indexed columns for faster searches.
Relationships – Defines links between tables (e.g., Student → Course).
Stored Procedures & Triggers – Lists any special functions in the database.
Example: Data Dictionary for a Student Database
Attribute Description
Database Name university_db
Tables students, courses, enrollments
Table: students
student_id Unique identifier (Primary Key)
name Full name (VARCHAR 100)
email Unique email (VARCHAR 255)
enrollment_date Date of joining (DATE)
Table: courses
course_id Unique course identifier (Primary Key)
course_name Course title (VARCHAR 100)
credits Number of credits (INT)
Importance of a Data Dictionary
Ensures Data Consistency – Provides a single reference for database structure.
Improves Data Quality – Avoids conflicts and redundancies in data definitions.
Aids Developers & DBAs – Helps in database management, debugging, and optimization.
Enhances Security – Defines permissions and access control for different users.
Supports Documentation – Acts as a guide for developers and business analysts.
How to View the Data Dictionary in SQL?
1. View All Tables in a Database (MySQL, PostgreSQL)
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';
2. View Table Columns and Data Types (MySQL, PostgreSQL)
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'students';
3. View Constraints (Primary Key, Foreign Key)
SELECT constraint_name, table_name, constraint_type
FROM information_schema.table_constraints
WHERE table_name = 'students';
A Data Dictionary is essential for managing, documenting, and maintaining a database
efficiently. It helps developers, DBAs, and analysts understand the database structure and
ensures data consistency and security.
Database Architecture
Database Architecture defines the structure, design, and interactions of a database system. It
describes how data is stored, managed, and accessed by users and applications.
Types of Database Architecture
1. 1-Tier Architecture (Monolithic Architecture)
• The database, application, and user interface all run on a single machine.
• Suitable for small-scale applications like desktop apps (e.g., Microsoft Access).
Example:
• A local MySQL database used in a personal inventory management system.
Advantages:
✔ Fast performance (no network latency).
✔ Simple and easy to manage.
Disadvantages:
✖ Not scalable for multiple users.
✖ Data security is low since everything is on one system.
2. 2-Tier Architecture (Client-Server)
• Client-side application (frontend) communicates directly with a database server.
• Used in medium-scale applications like university management systems.
Example:
• A Laravel web application accessing a MySQL database on a remote server.
Advantages:
✔ Better security (database runs on a separate server).
✔ Faster than 3-tier for small networks.
Disadvantages:
✖ Not ideal for large-scale distributed systems.
✖ Application logic is still on the client-side, leading to higher load.
3. 3-Tier Architecture (Web-Based)
• Data processing is split into three layers:
1. Presentation Layer (Client/Frontend) – User interface (HTML, React, [Link]).
2. Application Layer (Business Logic/Backend) – Processes requests (Laravel,
[Link]).
3. Database Layer (Storage) – Stores and retrieves data (MySQL, PostgreSQL).
Example:
A university portal where students apply, pay fees, and check results:
1. Frontend ([Link], Bootstrap) → User submits form.
2. Backend (Laravel API) → Processes form and interacts with the database.
3. Database (MySQL) → Stores student data.
Advantages:
✔ Highly scalable for large systems.
✔ More secure (backend hides database from users).
✔ Efficient load balancing (servers handle different tasks).
Disadvantages:
✖ Slightly slower than 2-tier due to multiple layers.
✖ Requires more resources (servers, databases, APIs).
Types of Database Architectures Based on Distribution
1. Centralized Database Architecture
• All data is stored in one central database on a single server.
• Used in banks, universities, and government offices.
Advantages:
✔ Easier maintenance since everything is in one place.
✔ Better data consistency (no duplicates).
Disadvantages:
✖ Single point of failure (if the database crashes, all services go down).
✖ Slower for remote users due to network dependency.
2. Distributed Database Architecture
• Data is spread across multiple servers in different locations.
• Used by global companies like Facebook, Google, and Amazon.
Advantages:
✔ High availability (if one server fails, others continue working).
✔ Faster access for users in different regions.
Disadvantages:
✖ More complex to manage (data synchronization issues).
✖ Expensive (multiple database servers required).
3. Cloud Database Architecture
• Database is hosted on a cloud platform like AWS, Google Cloud, or Azure.
• Used in modern web apps, e-commerce, and IoT applications.
Advantages:
✔ Scalable on demand (pay only for what you use).
✔ Highly available and secure.
Disadvantages:
✖ Dependent on internet connectivity.
✖ Data privacy concerns (stored on third-party servers).
Comparison of Database Architectures
Architecture Best Use Case Pros Cons
Not scalable, security
1-Tier (Monolithic) Personal use, small apps Fast, simple
risk
2-Tier (Client- Small businesses, Secure, better
Limited scalability
Server) universities performance
3-Tier (Web- Large web apps, e-
Scalable, secure Higher complexity
Based) commerce
Centralized Banks, universities Easy to manage Single point of failure
Distributed Global companies Fast, fault-tolerant Expensive, complex
Cloud-Based Startups, SaaS apps Scalable, cost-effective Internet dependency
• 1-Tier = Best for local applications.
• 2-Tier = Suitable for medium-sized client-server applications.
• 3-Tier = Best for large, scalable web applications.
• Centralized = Used in secure environments like banks.
• Distributed = Ideal for global-scale applications.
• Cloud = Best for modern, flexible applications.
Conceptual Database Design: Entity Relationship (ER) Modeling
Entity-Relationship (ER) Modeling is a visual representation of the data requirements and
structure of a database. It helps design a conceptual schema by defining the entities,
attributes, and relationships among them.
Key Concepts in ER Modeling
1. Entity
• An entity represents a real-world object or concept in the database. It can be a person,
place, thing, or event.
• Entities have attributes, which are the properties that describe them.
Example:
• Entity: Student
• Attributes: Student_ID, Name, Email, Date_of_Birth
2. Attribute
• Attributes describe the properties or characteristics of an entity.
• There are two types of attributes:
o Simple Attribute: Cannot be divided (e.g., Name, Age).
o Composite Attribute: Can be divided into smaller parts (e.g., Address can have
Street, City, State, Postal Code).
o Derived Attribute: Can be derived from other attributes (e.g., Age can be
derived from Date_of_Birth).
o Multi-valued Attribute: Can hold multiple values (e.g., Phone Numbers).
Example:
• Student entity:
o Attributes: Student_ID (Primary Key), Name (Composite: First Name + Last
Name), Email (Unique), Date_of_Birth (Derived: Current Date - Date_of_Birth)
3. Entity Set
• An entity set is a collection of similar entities. For example, all students in a university
make up the Student Entity Set.
Example:
• Student Entity Set: Contains all student records, each with unique student IDs.
4. Relationship
• A relationship describes how entities interact with each other. Relationships can be
one-to-one (1:1), one-to-many (1:M), or many-to-many (M:M).
Example:
• Student (1) -> Enrolls In (M) -> Course
• A student enrolls in many courses, and a course can have many students enrolled.
5. Relationship Set
• A relationship set is a collection of similar relationships between entities.
Example:
• The Enrolls In relationship set would include all instances where students are enrolled
in courses.
Types of Relationships in ER Modeling
1. One-to-One (1:1) Relationship
• An entity A is associated with at most one entity B and vice versa.
• Example: A Student has only one Student ID Card, and each ID card belongs to one
student.
2. One-to-Many (1:M) Relationship
• An entity A is associated with many entities B, but each entity B is associated with only
one entity A.
• Example: A Department offers many Courses, but each Course belongs to one
Department.
3. Many-to-Many (M:M) Relationship
• An entity A can be associated with many entities B, and an entity B can be associated
with many entities A.
• Example: A Student can enroll in many Courses, and a Course can have many Students.
ER Diagram Components
1. Entities
• Represented by rectangles.
2. Attributes
• Represented by ovals connected to entities by a line.
3. Relationships
• Represented by diamonds.
4. Primary Key
• Represented by underlining the attribute.
5. Cardinality Constraints
• Describes the number of instances of one entity that can relate to the instances of
another entity.
o 1: One entity instance.
o M: Many entity instances.
Example of ER Diagram for a University System
Entities
• Student
o Attributes: Student_ID (Primary Key), Name, Email, Date_of_Birth
• Course
o Attributes: Course_ID (Primary Key), Course_Name, Credits
• Instructor
o Attributes: Instructor_ID (Primary Key), Name, Email
Relationships
• Enrollment (between Student and Course)
o A Student can enroll in many Courses, and each Course can have many
Students (Many-to-Many relationship).
o Attributes: Enrollment_Date
• Teaches (between Instructor and Course)
o An Instructor can teach many Courses, but a Course can be taught by one
Instructor (One-to-Many relationship).
ER Diagram Representation
+------------------+ +-----------------+ +------------------+
| Student | | Course | | Instructor |
|------------------| |-----------------| |------------------|
| Student_ID (PK) |<--> | Course_ID (PK) |<--> | Instructor_ID (PK)|
| Name | | Course_Name | | Name |
| Email | | Credits | | Email |
| Date_of_Birth | +-----------------+ +------------------+
+------------------+ | |
| |
v v
+----------------------+
| Enrollment |
|----------------------|
| Enrollment_Date |
+----------------------+
Steps in Conceptual Database Design Using ER Modeling
1. Identify Entities
• Determine the real-world objects (students, courses, departments, etc.) that need to
be represented in the database.
2. Identify Relationships
• Identify how entities interact with one another (e.g., students enroll in courses,
instructors teach courses).
3. Identify Attributes
• Define the properties of each entity (e.g., student's name, course's credits).
4. Define Primary Keys
• Determine the unique identifier for each entity (e.g., student ID, course ID).
5. Draw the ER Diagram
• Visualize the entities, relationships, and attributes using rectangles, diamonds, and
ovals.
• ER Modeling is essential for conceptual database design, helping developers and
designers understand data structures, relationships, and design the database before
implementing it.
• It provides a clear, visual representation of the system's data requirements, making it
easier to identify gaps, constraints, and necessary changes.