DECTEN0/sql-data-warehouse-project

Data Warehouse and Analytics Project

Welcome to my Data Warehouse and Analytics Project repository. This project demonstrates an end-to-end data warehousing and analytics solution, from building the data warehouse to generating actionable insights. Designed as a portfolio project, it applies common industry practices in data engineering and analytics.


🏗️ Data Architecture

The data architecture for this project follows Medallion Architecture with Bronze, Silver, and Gold layers:

| Layer | Description |
|--------|-------------|
| Bronze | Stores raw data as-is from source systems. Data is ingested from CSV files into SQL Server. |
| Silver | Includes data cleansing, standardization, and normalization to prepare data for analysis. |
| Gold | Houses business-ready data modeled into a star schema required for reporting and analytics. |
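
The three layers can be illustrated with a minimal sketch. It is not taken from this repository's scripts: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs anywhere, and the table, view, and column names (`bronze_crm_sales`, `silver_crm_sales`, `gold_customer_sales`) are hypothetical. The Gold step is reduced to a single reporting view rather than a full star schema.

```python
import sqlite3


def build_layers(conn, raw_rows):
    """Load raw rows into Bronze, clean them into Silver, expose Gold.

    All object names here are hypothetical and chosen for illustration.
    """
    cur = conn.cursor()

    # Bronze: store the raw data as-is (everything kept as text).
    cur.execute(
        "CREATE TABLE bronze_crm_sales (order_id TEXT, customer TEXT, amount TEXT)"
    )
    cur.executemany("INSERT INTO bronze_crm_sales VALUES (?, ?, ?)", raw_rows)

    # Silver: trim and standardize text, cast types, drop rows with a
    # missing or non-positive amount, and remove exact duplicates.
    cur.execute(
        """
        CREATE TABLE silver_crm_sales AS
        SELECT DISTINCT
            CAST(order_id AS INTEGER) AS order_id,
            UPPER(TRIM(customer))     AS customer,
            CAST(amount AS REAL)      AS amount
        FROM bronze_crm_sales
        WHERE TRIM(amount) <> '' AND CAST(amount AS REAL) > 0
        """
    )

    # Gold: a business-ready view built on top of the cleaned data.
    cur.execute(
        """
        CREATE VIEW gold_customer_sales AS
        SELECT customer, SUM(amount) AS total_sales
        FROM silver_crm_sales
        GROUP BY customer
        """
    )
    conn.commit()


if __name__ == "__main__":
    # Invented sample rows: one duplicate and one row with an empty amount.
    rows = [("1", " alice ", "10.5"), ("2", "Bob", ""),
            ("3", "ALICE", "4.5"), ("1", " alice ", "10.5")]
    connection = sqlite3.connect(":memory:")
    build_layers(connection, rows)
    print(connection.execute("SELECT * FROM gold_customer_sales").fetchall())
```

In this sketch Bronze keeps every raw row, including the duplicate and the row with an empty amount, while Silver and Gold only see the cleaned data.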

📖 Project Overview

This project involves:

  • Data Architecture: Designing a Modern Data Warehouse using Medallion Architecture (Bronze, Silver, and Gold layers)
  • ETL Pipelines: Extracting, transforming, and loading data from source systems into the warehouse
  • Data Modeling: Developing fact and dimension tables optimized for analytical queries
  • Analytics & Reporting: Creating SQL-based reports and dashboards for actionable insights

🎯 Skills Showcased

This repository is my introduction to Data Engineering and showcases the following skills:

  • SQL Development
  • Data Architecture
  • Data Engineering
  • ETL Pipeline Development
  • Data Modeling
  • Data Analytics

🛠️ Tools & Resources

  • Datasets: CSV files for the project dataset
  • SQL Server Express: Lightweight server for hosting your SQL database
  • SSMS: GUI for managing and interacting with databases
  • Git: Version control and collaboration
  • DrawIO: Design data architecture, models, flows, and diagrams
  • Notion: Project template and task management

🚀 Project Requirements

Building the Data Warehouse (Data Engineering)

Objective: Develop a modern data warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.

Specifications:

  • Data Sources: Import data from two source systems (ERP and CRM) provided as CSV files
  • Data Quality: Cleanse and resolve data quality issues prior to analysis
  • Integration: Combine both sources into a single, user-friendly data model for analytical queries
  • Scope: Focus on the latest dataset only; historization of data is not required
  • Documentation: Provide clear documentation of the data model for business stakeholders and analytics teams
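
Because only the latest dataset is in scope, one possible loading strategy is a full reload: empty the tables, load the newest extracts, and rebuild the integrated model. The sketch below illustrates that idea under stated assumptions. It uses Python's built-in `sqlite3` module in place of SQL Server, and the table and column names (`crm_customers`, `erp_customers`, `dim_customers`) are hypothetical rather than taken from this project's scripts. SQLite has no `TRUNCATE TABLE`, so `DELETE` plays that role here.

```python
import sqlite3


def full_reload(conn, crm_rows, erp_rows):
    """Rebuild the integrated customer table from the latest extracts only.

    No history is kept: every run replaces the previous contents.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS crm_customers (customer_id INTEGER, name TEXT)"
    )
    cur.execute(
        "CREATE TABLE IF NOT EXISTS erp_customers (customer_id INTEGER, country TEXT)"
    )
    cur.execute(
        "CREATE TABLE IF NOT EXISTS dim_customers "
        "(customer_id INTEGER, name TEXT, country TEXT)"
    )

    # Historization is not required, so wipe everything before loading.
    for table in ("crm_customers", "erp_customers", "dim_customers"):
        cur.execute(f"DELETE FROM {table}")

    cur.executemany("INSERT INTO crm_customers VALUES (?, ?)", crm_rows)
    cur.executemany("INSERT INTO erp_customers VALUES (?, ?)", erp_rows)

    # Integration: combine both sources into one customer table.
    # CRM drives the row set; ERP enriches it, with 'n/a' for missing values.
    cur.execute(
        """
        INSERT INTO dim_customers (customer_id, name, country)
        SELECT c.customer_id, TRIM(c.name), COALESCE(e.country, 'n/a')
        FROM crm_customers AS c
        LEFT JOIN erp_customers AS e ON e.customer_id = c.customer_id
        """
    )
    conn.commit()
```

Calling `full_reload` a second time with a newer extract leaves only the rows from that extract in `dim_customers`, which matches the "latest dataset only" scope.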

BI: Analytics & Reporting (Data Analysis)

Objective: Develop SQL-based analytics to deliver detailed insights into:

  • Customer Behavior
  • Product Performance
  • Sales Trends

These insights empower stakeholders with key business metrics, enabling strategic decision-making.
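
As a rough illustration of what such SQL-based analytics could look like, the sketch below runs two queries against a tiny star schema: one for a sales trend by month and one for product performance. It is an assumption-laden example, not this project's reports. It uses Python's built-in `sqlite3` module instead of SQL Server, the tables `fact_sales` and `dim_products` and their columns are hypothetical, and the sample rows are invented.

```python
import sqlite3


def make_demo_warehouse():
    """Create an in-memory star schema with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dim_products (product_key INTEGER PRIMARY KEY, product_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE fact_sales "
        "(order_id INTEGER, order_date TEXT, product_key INTEGER, sales_amount REAL)"
    )
    conn.executemany(
        "INSERT INTO dim_products VALUES (?, ?)", [(1, "Bike"), (2, "Helmet")]
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, "2024-01-05", 1, 20.0),
         (2, "2024-01-20", 2, 10.0),
         (3, "2024-02-03", 1, 5.0)],
    )
    conn.commit()
    return conn


def monthly_sales(conn):
    """Sales trend: total sales per calendar month, oldest first."""
    return conn.execute(
        """
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(sales_amount)             AS total_sales
        FROM fact_sales
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()


def product_performance(conn):
    """Product performance: total sales per product, best seller first."""
    return conn.execute(
        """
        SELECT p.product_name, SUM(f.sales_amount) AS total_sales
        FROM fact_sales AS f
        JOIN dim_products AS p ON p.product_key = f.product_key
        GROUP BY p.product_name
        ORDER BY total_sales DESC, p.product_name
        """
    ).fetchall()


if __name__ == "__main__":
    warehouse = make_demo_warehouse()
    print(monthly_sales(warehouse))
    print(product_performance(warehouse))
```

Both queries read only from the fact table and one dimension, which is the access pattern a star schema is meant to make simple.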


📌 Project Management

The project was managed in Notion. The project task tracker and breakdown are available here:

Notion


📂 Repository Structure

data-warehouse-project/
│
├── datasets/                           # Raw datasets (ERP and CRM data)
│
├── docs/                               # Project documentation and architecture
│   ├── etl.drawio                      # ETL techniques and methods diagram
│   ├── data_architecture.drawio        # Project architecture diagram
│   ├── data_catalog.md                 # Dataset catalog with field descriptions
│   ├── data_flow.drawio                # Data flow diagram
│   ├── data_models.drawio              # Data models (star schema)
│   └── naming-conventions.md           # Naming guidelines for tables, columns, and files
│
├── scripts/                            # SQL scripts for ETL and transformations
│   ├── bronze/                         # Scripts for extracting and loading raw data
│   ├── silver/                         # Scripts for cleaning and transforming data
│   └── gold/                           # Scripts for creating analytical models
│
├── tests/                              # Test scripts and quality files
│
├── README.md                           # Project overview and instructions
├── LICENSE                             # License information
├── .gitignore                          # Git ignore rules
└── requirements.txt                    # Project dependencies

☕ Stay Connected

Let's stay in touch! Feel free to connect with me:

LinkedIn


🛡️ License

This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.
