Name: Vaibhav Raj
Roll No: 2300291520199
Branch-CSE(AI)
Q1. Differentiate between Data Warehouse and Data Marts. Also, explain
the Block Architecture of Data Warehouse.
Difference between Data Warehouse and Data Mart:
Basis Data Warehouse Data Mart
Definition Centralized repository Subset of data warehouse
integrating data from focused on a specific
multiple sources. business area.
Scope Enterprise-wide Departmental or subject-
oriented
Data Source Multiple operational Usually derived from data
databases and external warehouse or few
sources operational systems
Size Very large Smaller (gigabytes or few
(terabytes/petabytes) terabytes)
Purpose Used for strategic decision- Used for tactical decision-
making making
Example Warehouse integrating Sales mart containing only
sales, HR, finance data sales-related data
Block Architecture of Data Warehouse:
1. Data Source Layer – Contains operational databases and external data sources.
2. Data Staging Area (ETL) – Extract, Transform, and Load processes.
3. Data Storage Layer – Central repository for integrated and cleaned data.
4. Data Presentation Layer – Data marts and views for analysis.
5. Metadata Layer – Stores schema, mappings, and definitions.
6. Front-End Tools – Used for reporting and analysis.
Diagram (Text Representation):
+---------------------------+
| Front-End Tools |
+---------------------------+
| Metadata Layer |
+---------------------------+
| Data Presentation Layer |
+---------------------------+
| Data Storage Layer |
+---------------------------+
| ETL / Staging Area |
+---------------------------+
| Data Source Layer |
+---------------------------+
Q2. Difference between OLTP and OLAP Tools.
Feature OLTP OLAP
Purpose Handles day-to-day Performs complex
transactions. analytical queries.
Data Type Current, real-time data. Historical and summarized
data.
Operations Insert, Update, Delete. Aggregation, Trend
Analysis.
Users Clerks, end-users. Analysts, managers.
Speed Fast for small transactions. Optimized for complex
queries.
Example ATM, e-commerce systems. Business Intelligence tools.
Q3. Explain various OLAP Operations with an example.
1. Roll-Up: Summarizes data along a dimension hierarchy (e.g., City → State → Country).
2. Drill-Down: Moves from summary data to detailed data (e.g., Country → State → City).
3. Slice: Selects a single dimension value to create a sub-cube (e.g., Year = 2024).
4. Dice: Selects data based on multiple dimensions (e.g., Year = 2024 and Region = North).
5. Pivot: Reorients multidimensional data (e.g., swapping rows and columns).
Q4. Explain Snowflake Schema with example, functionalities, advantages, and
disadvantages.
Definition: A normalized form of the Star Schema where dimension tables are split into
multiple related tables to remove redundancy.
Example:
Fact Table: Sales_Fact(Date_Key, Product_Key, Customer_Key, Store_Key, Sales_Amount,
Quantity)
Dimension Tables:
- Product_Dim(Product_Key, Product_Name, Category_Key)
- Category_Dim(Category_Key, Category_Name)
- Customer_Dim(Customer_Key, Customer_Name, City_Key)
- City_Dim(City_Key, City_Name, State_Key)
- State_Dim(State_Key, State_Name, Country_Key)
- Country_Dim(Country_Key, Country_Name)
Functionalities:
• Supports complex queries with multiple joins.
• Ensures data integrity.
• Saves storage space.
Advantages:
Reduces data redundancy.
Easier maintenance.
Better data consistency.
Disadvantages:
Complex joins reduce performance.
Difficult to design and manage.
Slower retrieval compared to Star Schema.
Q5. Apply Roll up, Drill down, Slicing, Dicing, Pivot, ROLAP, to the given
Multidimensional Cube.
Given dimensions: Branch (A, B, C, D), Item Type (Home, Entertainment, Computer, Phone,
Security), Year (1997, 1998, 1999).
1️⃣ Roll-Up (Aggregation)
Summarizes data along a dimension hierarchy.
Example: Aggregate all branches (A, B, C, D) into a single total for each Item Type and Year.
Result: Total sales per item type per year regardless of branch.
2️⃣ Drill-Down (Detail Analysis)
Moves from summarized data to detailed levels.
Example: From total sales per item to viewing branch-wise sales (A, B, C, D) for each item
type.
3️⃣ Slicing
Fix one dimension value to create a sub-cube.
Example: Slice Year = 1998 → Gives 2D table (Item Type × Branch) for 1998 only.
4️⃣ Dicing
Select a sub-cube by applying filters on multiple dimensions.
Example: Year ∈ (1998, 1999), Branch ∈ (A, B), Item ∈ (Home, Computer, Phone).
5️⃣ Pivot (Rotation)
Reorients cube view by rotating dimensions.
Example: Swap 'Item Type vs Year' to 'Branch vs Year' to view branch performance by year.
6️⃣ ROLAP (Relational OLAP)
Implements OLAP operations using relational databases via SQL.
Example SQL:
SELECT item_type, year, SUM(sales)
FROM sales_data
GROUP BY item_type, year;