PART 1 — INTRODUCTION TO
DATABASE INTEGRATION
Definition
Database Integration is the process of combining data from
multiple databases into a unified system or application to ensure
consistency, accessibility, and performance.
Why Database Integration Matters
Organizations often store data across different systems —
e.g., HR, Finance, and Student Information Systems.
Integration helps these systems “talk” to each other,
ensuring smooth operations such as:
o Real-time reporting
o Consolidated dashboards
o Streamlined decision-making
o Reduced data duplication
Common Scenarios
A university system that combines student data
(enrollment, grades, library, and finance).
An e-commerce platform integrating product inventory,
customer orders, and payment databases.
A healthcare system combining patient records, billing,
and laboratory systems.
PART 2 — TYPES OF DATABASE
INTEGRATION
1. Homogeneous Integration
Databases are of the same type (e.g., all MySQL).
Easier to manage since schema and query language are
consistent.
Example: Two MySQL databases linked using FEDERATED
tables.
2. Heterogeneous Integration
Databases are of different types (e.g., Oracle + MySQL +
MongoDB).
Requires middleware or ETL (Extract, Transform, Load)
processes.
Example: Using Apache NiFi or Talend to move and
combine data between SQL and NoSQL systems.
PART 3 — METHODS OF DATABASE
INTEGRATION
Example Tools /
Method Description
Features
Connect one database DBLINK
Database to another using (PostgreSQL/Oracle),
Linking database-specific FEDERATED (MySQL).
features.
Applications
communicate using RESTful APIs,
API-based
APIs to retrieve data GraphQL, JSON-based
Integration
from multiple APIs.
sources.
Data is extracted from
sources, transformed
ETL (Extract,
to a common format, Talend, Apache NiFi,
Transform,
and loaded into a SSIS, Pentaho.
Load)
target system (like a
data warehouse).
Combines data in real
Data
time without Denodo, Dremio.
Virtualization
physically moving it.
PART 4 — CHALLENGES IN
DATABASE INTEGRATION
Challenge Description Example
Student ID stored in
Data Same data may exist in
both EnrollmentDB and
Redundancy multiple systems.
LibraryDB.
Databases may use
Schema “Student_ID” vs.
different structures or
Mismatch “StudNo”.
naming conventions.
Updates in one Library fines updated
Data
database might not but not reflected in
Consistency
reflect in others. finance records.
Confidential
Security and Shared access may
information accessed
Privacy expose sensitive data.
by unauthorized users.
Cross-database joins Multiple database
Performance can slow down the queries increasing
system. latency.
PART 5 — DATABASE
NORMALIZATION AND
DENORMALIZATION
Normalization
The process of organizing data in a database to reduce
redundancy and improve data integrity.
Main Objectives:
Eliminate data redundancy
Ensure data dependencies make sense
Improve consistency and integrity
Common Normal Forms:
1. 1NF – Eliminate repeating groups (atomic columns).
2. 2NF – Remove partial dependencies (each non-key
depends on the whole key).
3. 3NF – Remove transitive dependencies (no non-key
depends on another non-key).
Example:
Unnormalized Table
Student(ID, Name, Subject1, Subject2, Subject3)
→ Normalized:
Student(ID, Name)
Enrollment(StudentID, Subject)
Denormalization
The process of combining tables or intentionally introducing
redundancy to improve query performance, especially in
integrated or reporting systems.
Reasons to Denormalize:
Reduce complex joins.
Improve performance for read-heavy systems.
Simplify reporting or data aggregation.
Example:
Instead of separate tables for student and department, store both
in one table for faster report generation.
Normalization vs. Denormalization Comparison
Aspect Normalization Denormalization
Data consistency and Faster performance and
Goal
integrity simplified access
Redundancy Reduced Increased
Slower queries (more
Speed Faster queries (less joins)
joins)
OLTP (Online
OLAP (Online Analytical
Use Case Transaction
Processing)
Processing)
PART 6 — STRATEGIES FOR
DATABASE INTEGRATION
1. Define Integration Objectives
What systems must share data?
What data elements are common?
2. Identify Common Keys
Find unique identifiers shared across databases (e.g.,
Student_ID, Employee_No).
3. Map Relationships
Match equivalent tables and attributes between databases.
4. Choose Integration Method
Direct link, API, or ETL.
5. Apply Normalization or Denormalization
Normalize for accuracy and consistency (data entry
systems).
Denormalize for reporting or analysis (data warehouse
systems).
6. Test and Monitor
Verify data accuracy and query performance after
integration.
PART 7 — SAMPLE INTEGRATION
SCENARIO
Case Study:
Buenavista Community College wants to integrate its:
Student Information System (MySQL)
Finance System (PostgreSQL)
Goal: Show the student’s enrollment status and outstanding
balance in one dashboard.
Steps:
1. Identify shared key → student_id
2. Create database link from MySQL to PostgreSQL (or use
API).
3. Perform joined query:
SELECT s.student_id, [Link], [Link]
FROM [Link] s
JOIN [Link] f
ON s.student_id = f.student_id;
4. Optimize queries:
o Normalize both systems to remove redundant
columns.
o Denormalize the integrated view for faster reporting.
PART 8 — SUMMARY
Key Takeaways
Database Integration connects multiple databases into one
logical system.
Normalization maintains integrity and reduces
redundancy.
Denormalization improves speed and simplifies queries.
The right integration strategy depends on the purpose —
transaction, reporting, or analytics.
Real-World Applications
Enterprise Resource Planning (ERP) systems
University information systems
Banking and e-commerce platforms
PART 9 — REFLECTION /
DISCUSSION QUESTIONS
1. What are the risks of integrating multiple databases without
proper normalization?
2. When would you choose denormalization over
normalization?
3. How can APIs simplify database integration in modern
applications?
4. What are some tools or frameworks that can help in
integrating databases in real-world projects?