CleanEasy is a powerful, user-friendly Python library designed to simplify data cleaning and preprocessing for data scientists and analysts
Updated Jul 5, 2025 - Python
This exploratory data analysis (EDA) project focuses on examining sugarcane production data. Through this analysis, we seek to gain valuable insights into factors influencing sugarcane production, develop predictive models for future yields, and ultimately support efforts to optimize production efficiency and sustainability.
Black Friday Sales Analysis explores customer demographics, purchasing behaviors, and product trends to uncover insights and patterns driving sales during Black Friday events.
📊 An analysis of voting patterns in São Paulo's 2024 elections, focusing on voter behavior, absenteeism, and geographic trends.
Leveraging advanced data cleaning techniques and feature engineering, a robust food delivery prediction model was developed using regression algorithms.
Welcome to my data science repository! Here you will find a collection of resources and examples for exploring, analyzing, and manipulating data using Python. The repository includes code templates, case studies, and exercises to help you learn and practice data science concepts and techniques. The topics covered include data exploration, data visu
A repository where I keep all of my data cleaning samples/portfolio items.
This is an AI model for predicting laptop prices, trained on about 1,200 records.
A project analyzing the Indian startup ecosystem between 2018 and 2021.
Developed a 3-page Power BI dashboard (global and Asian overview) using Python scripts to load and clean World Bank data (1960–2020), reducing data processing time by 25%. Containerized the database in Docker, enabling scalable access, and visualized trends (e.g., 3% annual GDP growth in Asia), enhancing stakeholder insights.
Global Superstore BI Dashboard
[Information System] SMUTF: Schema Matching Using Generative Tags and Hybrid Features
The Ethiopian Medical Business Data Warehouse & Analytics Platform is a comprehensive data solution tailored to enhance the efficiency and efficacy of Ethiopia's healthcare and medical sectors.
Power Outage Data Analysis in USA
EDA and Prediction of F1 Race Winners
This project involves analyzing real-world medical appointment data through Time Series Analysis. The tasks include dataset cleaning, comprehensive analysis, and extracting insights using Python and MySQL.
Designed and implemented machine learning and deep learning models to diagnose gearbox faults. Preprocessed sensor data, engineered features, and trained models using techniques like SVM, random forests, LSTM, and naive Bayes. Evaluated model performance and optimized hyperparameters to achieve high diagnostic accuracy.
A comprehensive data analysis project using SQL for data cleaning and preprocessing and Tableau for visualization, focusing on key HR KPIs. Features interactive dashboards and detailed insights.
Scripts for cleaning, converting, and managing image datasets for ML training. (Zsh/Python)
This project analyses transportation data from the Bureau of Transportation Statistics (BTS) to uncover insights into the efficiency, safety, and environmental impacts of cross-border freight across road, rail, air, and water modes.