School of Engineering & Technology
Department: SOET Session: 2025-26
Program: BCA (Sp. AI & DS) Semester: I
Course Code: ETCCCPP103 Number of students:
Course Name: Problem Solving with Python Faculty: Dr. Satinder Pal Singh
Assignment-4 (Unit-4)
Submission Instructions:
Submission Deadline: Assignments must be submitted within one week of the
assignment's release date.
Submission Platform: All assignments are to be submitted via the Learning
Management System (LMS) or Moodle ([Link]).
GitHub Link: You must provide a link to your GitHub repository with your
submission (Optional).
Individual Submission: Assignments are to be completed and submitted by each
individual student.
Formatting: All assignments must adhere to the specific format shared in class.
Mini Project Assignment: Air Quality Data Visualizer
Course: Problem Solving with Python
Assignment Title: Data Analysis and Visualization with Real-World Air Quality Data
Assignment Type: Individual
Estimated Duration: 12–15 hours
Weightage: 15% of course grade
Real-World Problem Context
Air pollution has become one of the most critical environmental issues in urban areas.
Monitoring and analyzing trends in air quality indicators (such as PM2.5, PM10, CO, NO₂,
and AQI) is essential for public health and policy awareness.
This assignment requires you to work with a real-world air pollution dataset, analyze it,
compute statistical summaries, and visualize pollution patterns using Pandas, NumPy, and
Matplotlib. The insights can help support awareness campaigns and research projects.
Learning Objectives
By completing this lab assignment, you will:
Load and explore real-world CSV air quality datasets using Pandas
Clean and preprocess environmental data
Use NumPy to compute pollution statistics (mean, min, max, std deviation)
Use Matplotlib to create trend plots and comparative charts
Apply Pandas group-by and resampling for monthly/seasonal aggregation
Export cleaned data and create an analytical report
Assignment Tasks
Task 1: Data Acquisition & Loading
Download real Air Quality data (from Kaggle / CPCB / OpenAQ)
Load CSV into a Pandas DataFrame
View initial data:
o head()
o info()
o describe()
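The loading and first-look steps above can be sketched as follows. This is a minimal example, not a required solution: the column names and the three sample rows are made up for illustration (real Kaggle / CPCB / OpenAQ files may use different headers), and load_air_quality is a hypothetical helper name.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for a downloaded file.
# The values are invented for illustration and are not real measurements.
SAMPLE_CSV = """Date,PM2.5,PM10,NO2,SO2,AQI
2024-01-01,180.5,260.0,48.2,12.1,310
2024-01-02,165.0,240.5,45.0,11.4,295
2024-01-03,,220.0,41.3,10.9,270
"""


def load_air_quality(source):
    """Read a CSV (file path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


# Replace the StringIO object with the path to your downloaded CSV.
df = load_air_quality(io.StringIO(SAMPLE_CSV))

print(df.head())      # first rows of the table
df.info()             # column dtypes and non-null counts (prints directly)
print(df.describe())  # summary statistics for the numeric columns
```

Note that info() prints its report itself, so it does not need to be wrapped in print(). The empty PM2.5 cell in the third row is read as NaN, which is the kind of gap Task 2 asks you to handle.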
Task 2: Data Cleaning & Processing
Handle missing values (dropna() or fillna())
Convert Date column to datetime format
Select relevant columns (example: Date, PM2.5, PM10, AQI, NO2, SO2)
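One possible way to combine the three cleaning steps is sketched below. The sample rows, the City column, and the clean_air_quality helper are hypothetical. The policy shown (drop rows with no usable date or AQI, fill remaining pollutant gaps with the column median) is only one reasonable choice; whichever policy you pick, explain it in your report.

```python
import io

import pandas as pd

# Hypothetical raw rows with typical problems: a missing PM2.5 value,
# an unparseable date, and a missing AQI. Values are illustrative only.
RAW_CSV = """Date,City,PM2.5,PM10,NO2,SO2,AQI
2024-01-01,Delhi,180.5,260.0,48.2,12.1,310
2024-01-02,Delhi,,240.5,45.0,11.4,295
not-a-date,Delhi,150.0,210.0,40.0,10.0,260
2024-01-04,Delhi,140.0,200.0,,9.5,
"""

KEEP = ["Date", "PM2.5", "PM10", "AQI", "NO2", "SO2"]
POLLUTANTS = ["PM2.5", "PM10", "NO2", "SO2"]


def clean_air_quality(raw):
    """Select columns, parse dates, and handle missing values."""
    df = raw[KEEP].copy()
    # Unparseable dates become NaT instead of raising an error.
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    # Assumption: a row without a date or an AQI value is not usable.
    df = df.dropna(subset=["Date", "AQI"])
    # Assumption: fill remaining pollutant gaps with each column's median.
    df[POLLUTANTS] = df[POLLUTANTS].fillna(df[POLLUTANTS].median())
    return df.sort_values("Date").reset_index(drop=True)


cleaned = clean_air_quality(pd.read_csv(io.StringIO(RAW_CSV)))
print(cleaned)
print(cleaned.dtypes)
```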
Task 3: Statistical Analysis using NumPy
Using NumPy, calculate:
Daily & monthly average PM2.5 & PM10
Min, Max, and Standard deviation of AQI
Identify pollution peaks
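The statistics listed above could be computed along these lines. This is a sketch under stated assumptions: the four data rows are invented, mean_by_period is a hypothetical helper, and "peak" is defined here as a day whose AQI is more than one standard deviation above the mean. The assignment does not fix a definition of a peak, so choose and justify your own threshold.

```python
import numpy as np
import pandas as pd

# Hypothetical cleaned data (illustrative values, not real measurements).
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-02-01", "2024-02-02"]),
    "PM2.5": [100.0, 200.0, 50.0, 70.0],
    "PM10": [150.0, 250.0, 90.0, 110.0],
    "AQI": [200.0, 400.0, 100.0, 120.0],
})

dates = df["Date"].to_numpy()   # NumPy datetime64 array
pm25 = df["PM2.5"].to_numpy()
pm10 = df["PM10"].to_numpy()
aqi = df["AQI"].to_numpy()


def mean_by_period(dates, values, unit):
    """Average values within each calendar period ('D' = day, 'M' = month)."""
    periods = dates.astype(f"datetime64[{unit}]")  # truncate to the period
    return {str(p): float(np.mean(values[periods == p]))
            for p in np.unique(periods)}


daily_pm25 = mean_by_period(dates, pm25, "D")
monthly_pm25 = mean_by_period(dates, pm25, "M")
monthly_pm10 = mean_by_period(dates, pm10, "M")

aqi_min = float(np.min(aqi))
aqi_max = float(np.max(aqi))
aqi_std = float(np.std(aqi))  # population std (ddof=0), NumPy's default

# Assumed peak rule: AQI more than one standard deviation above the mean.
threshold = float(np.mean(aqi)) + aqi_std
peak_days = df.loc[aqi > threshold, ["Date", "AQI"]]

print(monthly_pm25, monthly_pm10)
print(aqi_min, aqi_max, round(aqi_std, 2))
print(peak_days)
```

np.std uses ddof=0 by default, while the pandas Series.std method uses ddof=1, so the two can give slightly different numbers on the same column. State which one you report.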
Task 4: Visualization using Matplotlib
Create the following plots:
Plot Type        Description
Line Chart       Daily AQI trend
Bar Chart        Monthly average PM2.5
Scatter Plot     PM2.5 vs PM10 concentration
Subplot Figure   Combine any two relevant charts
Save each chart as .png
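A minimal sketch of the four plots, each saved as a .png, is shown below. The data is synthetic (generated with a fixed random seed purely for illustration), the file names and the helper functions are hypothetical, and the output goes to a temporary folder that you would replace with a folder in your repository.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # draw to files; no display window needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the cleaned dataset (illustrative values only).
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=90, freq="D")
pm25 = rng.uniform(40, 200, size=len(dates))
df = pd.DataFrame({
    "Date": dates,
    "PM2.5": pm25,
    "PM10": pm25 * 1.5 + rng.normal(0, 10, size=len(dates)),
    "AQI": pm25 * 1.8,
})

# Temporary folder for the demo; point this at a folder in your repository.
out_dir = tempfile.mkdtemp(prefix="aq_plots_")


def save(fig, name):
    """Save a figure as PNG in out_dir, close it, and return the path."""
    path = os.path.join(out_dir, name)
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path


def draw_aqi_line(ax):
    ax.plot(df["Date"], df["AQI"])
    ax.set(title="Daily AQI trend", xlabel="Date", ylabel="AQI")
    ax.tick_params(axis="x", rotation=45)


def draw_pm25_bar(ax):
    monthly = df.groupby(df["Date"].dt.strftime("%Y-%m"))["PM2.5"].mean()
    ax.bar(list(monthly.index), monthly.to_numpy())
    ax.set(title="Monthly average PM2.5", xlabel="Month", ylabel="PM2.5")


def draw_pm_scatter(ax):
    ax.scatter(df["PM2.5"], df["PM10"], s=12)
    ax.set(title="PM2.5 vs PM10", xlabel="PM2.5", ylabel="PM10")


saved = []
for name, draw in [("aqi_line.png", draw_aqi_line),
                   ("pm25_bar.png", draw_pm25_bar),
                   ("pm_scatter.png", draw_pm_scatter)]:
    fig, ax = plt.subplots(figsize=(8, 4))
    draw(ax)
    saved.append(save(fig, name))

# Subplot figure: two of the charts side by side in one image.
fig, (left, right) = plt.subplots(1, 2, figsize=(14, 4))
draw_aqi_line(left)
draw_pm25_bar(right)
saved.append(save(fig, "combined.png"))

print(saved)
```

Writing each chart as a function that draws on a given axis lets the same code serve both the standalone figures and the combined subplot figure. Add the measurement units from your dataset to the axis labels.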
Task 5: Grouping & Aggregation
Group by month or season (Winter/Summer/Monsoon)
Compute average AQI and PM levels
Generate summary tables
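The grouping step might look like the sketch below. The month-to-season mapping is an assumption made for this example (Winter = Nov to Feb, Summer = Mar to Jun, Monsoon = Jul to Oct); adjust it to the region your dataset covers. The data rows are invented for illustration.

```python
import pandas as pd

# Assumed season mapping; change it to suit your dataset's region.
SEASON_BY_MONTH = {
    11: "Winter", 12: "Winter", 1: "Winter", 2: "Winter",
    3: "Summer", 4: "Summer", 5: "Summer", 6: "Summer",
    7: "Monsoon", 8: "Monsoon", 9: "Monsoon", 10: "Monsoon",
}
METRICS = ["AQI", "PM2.5", "PM10"]

# Hypothetical cleaned data (illustrative values only).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-01-10", "2024-01-20", "2024-05-05",
        "2024-05-15", "2024-08-01", "2024-08-11"]),
    "PM2.5": [200.0, 180.0, 90.0, 110.0, 40.0, 60.0],
    "PM10": [300.0, 280.0, 150.0, 170.0, 80.0, 100.0],
    "AQI": [350.0, 330.0, 160.0, 180.0, 70.0, 90.0],
})

# Seasonal summary table using group-by.
df["Season"] = df["Date"].dt.month.map(SEASON_BY_MONTH)
seasonal = df.groupby("Season")[METRICS].mean().round(1)

# Monthly summary table using resampling on a datetime index.
# "MS" labels each group by the first day of its month; months with
# no readings come out as all-NaN rows and are dropped here.
monthly = (df.set_index("Date")[METRICS]
             .resample("MS").mean()
             .dropna(how="all")
             .round(1))

print(seasonal)
print(monthly)
```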
Task 6: Export & Insight Report
Export cleaned data to cleaned_air_quality.csv
Save all charts
Prepare a Markdown/Text report including:
✔ Introduction
✔ Methodology
✔ Graphs & Observations
✔ Key findings (e.g., most polluted months)
✔ Conclusion
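As a rough sketch, the export and a skeleton of the text report could be produced as follows. The data rows are invented, the report file name report.txt is a placeholder chosen for this example, and the output folder is temporary so the snippet runs anywhere. The generated text only seeds the report; the observations and conclusion still have to be written by you.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned data (illustrative values only).
cleaned = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-10", "2024-02-25"]),
    "PM2.5": [190.0, 170.0, 95.0, 105.0],
    "PM10": [280.0, 260.0, 150.0, 160.0],
    "AQI": [320.0, 300.0, 180.0, 200.0],
})

# Temporary folder for the demo; use your repository folder instead.
out_dir = tempfile.mkdtemp(prefix="aq_export_")

# Export the cleaned data without the row index column.
csv_path = os.path.join(out_dir, "cleaned_air_quality.csv")
cleaned.to_csv(csv_path, index=False)

# One key finding: the month with the highest average AQI.
monthly_aqi = cleaned.groupby(cleaned["Date"].dt.strftime("%Y-%m"))["AQI"].mean()
worst_month = monthly_aqi.idxmax()

report_lines = [
    "Air Quality Data Visualizer: Report",
    "",
    "Introduction",
    "(describe the dataset and its source)",
    "",
    "Methodology",
    "(describe the cleaning and analysis steps)",
    "",
    "Graphs & Observations",
    "(refer to the saved .png charts)",
    "",
    "Key findings",
    f"Most polluted month by average AQI: {worst_month} "
    f"({monthly_aqi[worst_month]:.1f})",
    "",
    "Conclusion",
    "(summarise what the analysis shows)",
]

# Placeholder file name for the report.
report_path = os.path.join(out_dir, "report.txt")
with open(report_path, "w", encoding="utf-8") as fh:
    fh.write("\n".join(report_lines) + "\n")

print(csv_path)
print(report_path)
```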
Submission Instructions
Create a GitHub repository titled:
air-quality-visualizer-<yourname>
Repository must include:
Required Files            Description
.ipynb / .py              Full code
cleaned_air_quality.csv   Clean dataset
.png images               Saved plots
[Link] / .txt             Report & insights
[Link]                    Project description + usage
Evaluation Rubric
Criterion                  Weight   Excellent (5)                 Good (4)       Fair (3)
Data Loading & Cleaning    15%      Well-cleaned                  Mostly done    Partial
Statistical Analysis       15%      Correct & detailed            Adequate       Few metrics
Visualization Quality      20%      Clear, labelled, insightful   Basic          Messy
Grouping & Aggregation     10%      Logical & effective           Basic          Weak
Export & Reporting         10%      Complete & clear              Some parts     Missing
Code Quality               10%      Modular, readable             Acceptable     Poor
Documentation & GitHub     10%      Proper repo                   Some missing   Messy
Bonus: Advanced Plotting   10%      Subplots/Interactive          Simple         Basic
Academic Integrity Policy
Work must be original
Plagiarism = Zero marks
Cite references if used
Deadline: 7 days from the assignment's release date
Contact for Support: ([Link]@[Link])
Happy Coding — Think • Code • Learn!