Database – Definition and Explanation
A database is an organized collection of data that is stored and managed electronically. It allows
users to efficiently store, retrieve, update, and manage data. Instead of keeping data in
scattered files, a database provides a structured and systematic way of handling large volumes
of information.
Key Features of a Database
1. Organization of Data – Data is stored in tables (rows and columns).
2. Data Integrity – Ensures accuracy and consistency of data.
3. Data Security – Provides access control so only authorized users can access or modify
data.
4. Concurrency Control – Allows multiple users to use the database at the same time
without conflict.
5. Backup and Recovery – Protects data against accidental loss or system failures.
Types of Databases
1. Relational Database (RDBMS) – Data stored in tables (e.g., MySQL, Oracle, SQL
Server).
2. NoSQL Database – Stores unstructured or semi-structured data (e.g., MongoDB,
Cassandra).
3. Hierarchical and Network Databases – Earlier models for data storage.
4. Distributed Database – Data stored across multiple locations but accessed as one.
Examples of Databases
Banking systems (customer records, transactions)
Online shopping platforms (product catalogs, orders, user accounts)
Social media apps (profiles, posts, messages
Purpose of Database Systems
The main purpose of a database system is to manage data efficiently and provide reliable
access to it. Instead of storing data in multiple scattered files, a database system centralizes it,
ensuring security, consistency, and easy management.
Key Purposes
1. Data Organization and Storage
o Provides a structured way of storing large volumes of data in tables, records, and
fields.
o Avoids confusion and duplication compared to traditional file systems.
2. Data Retrieval and Sharing
o Allows quick searching and retrieval of required information using query
languages (e.g., SQL).
o Supports multiple users to access data at the same time.
3. Data Integrity and Consistency
o Ensures accuracy and reliability of data.
o Prevents anomalies like duplication or conflicting information.
4. Data Security and Privacy
o Protects data from unauthorized access by using authentication and access
control.
o Provides different permission levels for users.
5. Data Redundancy Reduction
o Minimizes repetition of data by centralizing it.
o Saves storage space and maintains consistency.
6. Concurrency Control
o Manages simultaneous access by multiple users without errors or conflicts.
7. Backup and Recovery
o Maintains regular backups.
o Restores data in case of failure, corruption, or accidental loss.
8. Support for Decision-Making
o Helps organizations analyze stored data.
o Generates useful reports for planning and decision-making.
Traditional File System vs Database System
1. Data Storage
File System: Data is stored in separate files, often scattered and unorganized.
Database System: Data is stored in structured tables (rows and columns) with defined
relationships.
2. Data Redundancy
File System: High redundancy because the same data may be repeated in many files.
Database System: Low redundancy due to centralized storage and normalization.
3. Data Integrity & Consistency
File System: Difficult to maintain data accuracy; updates in one file may not reflect in
others.
Database System: Ensures integrity using constraints, keys, and relationships.
4. Data Security
File System: Limited security; anyone with access to files can read/modify them.
Database System: Provides user authentication, access control, and role management.
5. Data Sharing & Concurrency
File System: Poor support for multiple users accessing the same data simultaneously.
Database System: Supports concurrent access with transaction management and locking.
6. Backup & Recovery
File System: Backup and recovery are manual and not very reliable.
Database System: Provides automatic backup and recovery features.
7. Querying Data
File System: Searching requires custom programs, which is time-consuming.
Database System: Uses query languages like SQL for quick and complex searches.
8. Scalability
File System: Not efficient for handling large volumes of data.
Database System: Designed to handle very large datasets efficiently.
Components of a DBMS (Database Management System)
A DBMS is a software system that helps us store, organize, and manage data in a systematic
way. It is made up of several components that work together.
1. Hardware
These are the physical devices where the database runs.
Includes servers, computers, storage devices (like hard disks, SSDs), and networking
devices.
Example: A bank stores all customer records on a server computer that acts as the
database hardware.
2. Software
The actual DBMS software that manages the database.
It includes the DBMS program (like MySQL, Oracle, MS SQL Server, MongoDB) and
the operating system that supports it.
Example: A school may use MySQL DBMS to manage student details and exam results.
3. Data
The core component of a DBMS.
Includes all the actual information stored in the database.
Organized into tables (rows and columns).
Also includes metadata (data about data), such as field names, data types, and rules.
Example: In an online shopping app, data includes products, prices, orders, and customer
details.
4. Procedures
The rules, instructions, and guidelines for using and managing the database.
Ensures proper operation, security, and recovery in case of errors.
Example: In a hospital database, there may be a procedure to regularly back up patient
records every night.
5. Database Access Language
A special language used to interact with the database.
Most common is SQL (Structured Query Language).
Used for:
o Insert new data → e.g., adding a new student record.
o Update data → e.g., changing a customer’s phone number.
o Delete data → e.g., removing an old order.
o Retrieve data → e.g., displaying all employees in a company.
6. Users
Different people interact with a DBMS in different ways:
1. Database Administrators (DBAs): Manage the database, set permissions, take backups,
and ensure security.
o Example: IT staff in a university who manage the student database.
2. Application Programmers: Write software that connects with the database.
o Example: A developer writing a library management app that uses a database to
store books.
3. End Users: People who directly use applications without technical knowledge.
o Example: A student logging into a portal to see their exam results.
Types of Database Users
In a Database Management System (DBMS), different types of users interact with the database
in different ways depending on their role, knowledge, and purpose. The main types are:
1. Database Administrators (DBAs)
Responsible for managing the entire database system.
Duties: installing DBMS, managing storage, setting user permissions, backup and
recovery, monitoring performance.
Example: An IT staff member in a bank who controls access to customer databases.
2. Database Designers
Responsible for designing the structure of the database.
Decide what tables, fields, relationships, and constraints are needed.
Example: A software engineer designing the database schema for an e-commerce
website.
3. Application Programmers / Developers
Write programs and applications that connect to the database.
Use languages like Java, Python, C#, or PHP along with SQL.
Example: A developer writing a payroll management system that fetches salary details
from the database.
4. End Users
The largest group of database users who directly interact with applications.
They don’t need technical knowledge of how the database works.
Types of end users:
o Naïve users: Use predefined applications (e.g., a student checking exam results).
o Casual users: Occasionally query the database (e.g., a manager checking
monthly reports).
o Sophisticated users: Use complex tools for analysis (e.g., a data analyst running
SQL queries).
5. System Analysts
Act as a bridge between end users and programmers.
Understand the requirements of users and help design applications that meet those needs.
Example: In a hospital system, a system analyst ensures that the doctors’ requirements
are included in the database design.
Advantages of DBMS
A Database Management System (DBMS) offers several advantages over traditional file
systems.
1. Data Security Features
Provides authentication, authorization, and access control.
Only authorized users can access specific data.
Example: A bank clerk cannot access confidential loan records.
2. Data Backup and Recovery
DBMS automatically maintains backup and recovery features.
Restores the database to a consistent state after failures.
Example: Online orders are not lost after a server crash.
3. Concurrency (Resiliency) Control
Allows multiple users to access the database at the same time.
Uses transaction management to avoid conflicts.
Example: Two customers booking the last train seat won’t both succeed—only one is
confirmed.
4. Support for Complex Data and Relationships
Handles complex queries and relationships (one-to-many, many-to-many).
Supports E-R models and relational keys.
Example: A student database can relate students, courses, teachers, and results.
5. Advanced Capabilities
Provides query optimization, triggers, stored procedures, views.
Supports modern needs like multimedia, big data, and cloud storage.
Example: Amazon uses DBMS for recommendation engines.
6. Data Independence
Data storage is independent of application programs.
Structural changes do not affect existing applications.
Example: Adding “Email ID” to a student table does not break old software.
7. Redundancy Control
DBMS minimizes duplicate data by centralizing storage.
Avoids storing the same information multiple times.
Example: A student’s details are stored only once, not in every course file.
8. Consistency Constraints
DBMS enforces rules and constraints (like primary key, foreign key, unique, not null).
Ensures data entered is accurate and reliable.
Example: A student cannot register for a course that doesn’t exist.
Applications of Databases
Databases are widely used in almost every field where large amounts of information need to be
stored, organized, and retrieved. Some major applications are:
1. Banking and Finance
Used to manage customer accounts, transactions, loans, and credit cards.
Example: ATM systems use databases to update balances instantly after each transaction.
2. Education and Schools
Store student details, attendance, exam records, and course management.
Example: Universities use databases for online result portals and admission systems.
3. Healthcare and Hospitals
Maintain patient records, doctors’ schedules, billing, and medical histories.
Example: A hospital database stores patient diagnosis, treatment, and lab reports.
4. E-commerce and Online Shopping
Manage product catalogs, customer profiles, payments, and order tracking.
Example: Amazon uses massive databases to handle millions of products and customers
worldwide.
5. Social Media Platforms
Store user profiles, posts, messages, photos, and friend connections.
Example: Facebook and Instagram use databases to manage billions of user interactions
daily.
6. Government and Public Sector
Store citizen records, taxes, national ID information, and voter data.
Example: Aadhaar (India) uses one of the largest biometric databases in the world.
7. Transportation and Airlines
Manage flight schedules, reservations, tickets, and passenger details.
Example: Railway reservation systems store seat availability and passenger bookings in
real-time.
8. Business and Corporate Sector
Track employees, payroll, inventory, and customer relationship management (CRM).
Example: A supermarket uses a database to manage stock and generate bills quickly.
Types of Databases:
1. Relational Database (RDBMS):
o Definition: Stores data in tables (rows and columns) with relationships between
them.
o Example: MySQL, Oracle, PostgreSQL.
o Features: Uses SQL (Structured Query Language) for queries; ensures data
consistency with ACID properties (Atomicity, Consistency, Isolation, Durability).
o Use case: Banking systems, inventory management, and applications requiring
structured data.
2. NoSQL Database:
o Definition: Designed for large-scale data that may not fit well into tables;
supports flexible schemas.
o Types: Key-value, document, column-family, graph databases.
o Example: MongoDB, Cassandra.
o Use case: Big data applications, social media platforms, real-time analytics.
3. Graph Database:
o Definition: Stores data as nodes (entities) and edges (relationships). Ideal for
connected data.
o Example: Neo4j, Amazon Neptune.
o Use case: Social networks, recommendation engines, fraud detection.
4. Time Series Database:
o Definition: Optimized for storing time-stamped data (data recorded over time).
o Example: InfluxDB, TimescaleDB.
o Use case: Monitoring systems, IoT sensors, stock market analysis.
5. Document Database:
o Definition: Stores data in documents (usually JSON or XML), allowing flexible
and nested data.
o Example: MongoDB, CouchDB.
o Use case: Content management systems, e-commerce catalogs.
6. In-Memory Database:
o Definition: Stores data in the system’s RAM for faster access, rather than on disk.
o Example: Redis, Memcached.
o Use case: Real-time analytics, caching, high-speed transactions.
Structured Databases
Definition: Store data in a fixed format like rows and columns (tables).
Data Type: Highly organized, easily searchable using SQL.
Example: Relational databases like MySQL, Oracle, PostgreSQL.
Advantages:
o Easy to manage and query
o Ensures data consistency
o Supports complex transactions
Use Case: Banking systems, employee records, inventory management
Unstructured Databases
Definition: Store data in a flexible format without a fixed schema.
Data Type: Can include text, images, videos, emails, documents, JSON files, etc.
Example: NoSQL databases like MongoDB, Cassandra, CouchDB.
Advantages:
o Can handle large volumes of diverse data
o Flexible schema (easy to add new fields)
o Scales well horizontally
Use Case: Social media posts, multimedia storage, IoT sensor data
Relational Database
Feature Non-Relational Database (NoSQL)
(RDBMS)
Tables with rows and Flexible formats: key-value, document, column,
Data Structure
columns or graph
Fixed schema (predefined
Schema Dynamic/flexible schema
structure)
Query SQL (Structured Query Varies: MongoDB uses JSON-like queries,
Language Language) others may use custom APIs
Vertically scalable (upgrade
Scalability Horizontally scalable (add more servers)
server)
ACID Yes, ensures strong Usually eventual consistency; some support
Compliance consistency ACID partially
Slower for very large, Fast for large-scale, unstructured, or distributed
Performance
unstructured data data
MySQL, Oracle,
Examples MongoDB, Cassandra, Redis, CouchDB, Neo4j
PostgreSQL, SQL Server
Banking, ERP, Inventory Social media, real-time analytics, IoT, content
Use Cases
management management
✅ Tip to remember:
Relational = Structured, SQL, Tables, ACID
Non-Relational = Flexible, NoSQL, Big Data, Scalable
1. Introduction to the Relational Model
A relational database organizes data into tables (also called relations).
Each table represents an entity, such as Students, Employees, or Products.
Data is stored in rows and columns: rows = records, columns = fields.
Advantages: Simple, flexible, supports SQL, ensures data consistency.
2. Concepts
a) Tables (Relations)
A table is a collection of related data entries.
Each table has a name and columns (attributes).
Example:
StudentID Name Age Major
101 Alice 20 CS
102 Bob 21 IT
b) Tuples (Rows/Records)
Each row in a table is called a tuple.
A tuple represents a single instance of the entity.
Example: (101, Alice, 20, CS) is one tuple in the Students table.
c) Attributes (Columns/Fields)
Each column in a table is an attribute, representing a property of the entity.
Example: Name, Age, Major are attributes of the Students table.
3. Database Schema and Instances
Database Schema: The structure of the database; defines tables, attributes, and
relationships.
Database Instance: The actual data stored in the database at a particular time.
Example:
o Schema: Table Students(StudentID, Name, Age, Major)
o Instance: The rows (101, Alice, 20, CS) and (102, Bob, 21, IT)
4. Domains and Integrity Constraints
a) Domain Constraint
Each attribute must have a valid set of values (domain).
Example: Age must be an integer between 1 and 120.
b) Entity Integrity Constraint
Every table must have a primary key, and it cannot be NULL.
Ensures each tuple is unique.
Example: StudentID is a primary key in the Students table.
c) Referential Integrity Constraint
Ensures relationships between tables remain consistent.
A foreign key in one table must match a primary key in another table.
Example: If Enrollments(StudentID, CourseID) references Students(StudentID),
every StudentID in Enrollments must exist in Students.
1. Relation Schema
Definition: A relation schema defines the structure of a table in a relational database.
It lists all the attributes (columns) of the table.
Notation:
R=(A1,A2,…,An)R = (A_1, A_2, …, A_n)
where A1,A2,…,AnA_1, A_2, …, A_n are the attributes of the relation.
Example:
instructor=(ID,name,dept_name,salary)\text{instructor} = (ID, name, dept\_name, salary)
This means the instructor table has four attributes: ID, name, dept_name, and salary.
2. Relation Instance
Definition: A relation instance is the actual data stored in a table at a particular time.
Formally, if D1,D2,…,DnD_1, D_2, …, D_n are the domains (possible values) for each
attribute, then a relation rr is a subset of the Cartesian product:
r⊆D1×D2×…×Dnr \subseteq D_1 \times D_2 \times … \times D_n
Explanation:
o Each tuple (a1,a2,…,an)(a_1, a_2, …, a_n) in the relation satisfies ai∈Dia_i \in
D_i.
o The order of tuples in a relation does not matter.
Example:
For instructor = (ID, name, dept_name, salary), an instance could be:
ID Name Dept_Name Salary
101 Dr. Smith CS 50000
ID Name Dept_Name Salary
102 Dr. Lee Math 45000
Here, each row is a tuple, and the table is a set of tuples (relation instance).
✅ Key Points to Remember:
1. Relation Schema = structure (attributes).
2. Relation Instance = actual data (set of tuples).
3. Order of tuples is irrelevant.
4. Each tuple element belongs to the domain of the corresponding attribute.
1. Keys in Relational Databases
a) Primary Key (PK)
Definition: A primary key is a column (or set of columns) that uniquely identifies each
row in a table.
Rules:
o Must be unique
o Cannot be NULL
Example:
StudentID Name Age Major
101 Alice 20 CS
102 Bob 21 IT
Primary Key: StudentID
b) Candidate Key
Definition: A candidate key is a column (or set of columns) that can serve as a
primary key.
Example: In the Student table, both StudentID and Email could be candidate keys if
Email is unique.
c) Super Key
Definition: A super key is any combination of columns that uniquely identifies a row.
Example:
o StudentID → Super Key
o (StudentID, Name) → Also a Super Key (extra columns allowed)
d) Foreign Key (FK)
Definition: A foreign key is a column in one table that refers to the primary key in
another table.
Purpose: Maintains referential integrity between tables.
Example:
Student Table
StudentID Name MajorID
101 Alice 1
102 Bob 2
Major Table
MajorID MajorName
1 CS
2 IT
Here, MajorID in Student is a foreign key referencing MajorID in Major.
2. Schema Diagrams
A schema diagram visually shows tables, attributes, keys, and relationships.
Example:
Student(StudentID PK, Name, MajorID FK)
Major(MajorID PK, MajorName)
3. Summary Table of Keys
Key Type Purpose Example
Primary Key Uniquely identifies a row StudentID
Candidate Key Can be chosen as primary key StudentID, Email
Super Key Any combination that uniquely identifies row (StudentID), (StudentID, Name)
Foreign Key Refers to primary key in another table MajorID in Student table
1. Introduction to Query Languages
A query language is a language that allows users to retrieve, manipulate, and manage
data stored in a database.
It acts as an interface between the user and the database.
Main purpose: extract useful information without dealing directly with low-level
storage details.
2. Types of Query Languages
Query languages can be classified into two main types:
A) Procedural Query Languages
Definition: The user specifies how to obtain the desired result.
The user must describe the sequence of operations needed to retrieve data.
Focus: “How to get the data”
Example: Relational Algebra
Key Characteristics:
1. Requires knowledge of data structure.
2. Involves step-by-step operations like selection, projection, join, etc.
3. Powerful for complex queries, but less user-friendly.
Example:
Query: Get the names of students with age > 20.
In relational algebra:
πName(σAge>20(Student))\pi_{Name} (\sigma_{Age>20} (Student))
B) Non-Procedural (Declarative) Query Languages
Definition: The user specifies what data is needed, not how to retrieve it.
The DBMS decides the best way to execute the query.
Focus: “What data is required”
Example: SQL (Structured Query Language)
Key Characteristics:
1. User does not need to know the internal structure of the database.
2. Easier to use and understand for end users.
3. DBMS handles query optimization and execution.
Example (SQL):
SELECT Name FROM Student WHERE Age > 20;
3. Key Differences Between Procedural and Non-Procedural Query
Languages
Feature Procedural (Relational Algebra) Non-Procedural (SQL)
Approach Step-by-step instructions Describe what data is needed
User Knowledge Required High (must know database structure) Low (no detailed structure needed)
Ease of Use Complex Easy
Optimization User decides DBMS decides
Example Relational Algebra SQL
1. Introduction to Relational Algebra
Definition: Relational Algebra is a procedural query language used in relational
databases.
It specifies how to retrieve data from one or more relations (tables) using mathematical
operations.
It forms the foundation of SQL.
2. Key Concepts
Relation: A table with rows (tuples) and columns (attributes).
Tuple: A single row in a table.
Attribute: A column in a table.
Goal: Apply operations on relations to produce a new relation as a result.
• Introduction to Relational Algebra
• Advanced Relational Algebra Operations
• Join
• Division
• Rename
• Assignment
• Conclusion