Diabetes Prediction Using Machine Learning

Uploaded by

codervishv

DIABETES PREDICTION MODEL
DIABETES PREDICTORS
Vishv Ghiya: Increased accuracy through hyperparameter tuning
Tanishq Patel: Implemented Logistic Regression, Random Forest Classifier, Decision Tree Classifier
Tirth Patel: Implemented Support Vector Machine (SVM), K-Nearest Neighbors (KNN), AdaBoost
Falak Patel: Responsible for making the logbook, ppt, and video of the project
WORKFLOW
DATASET
Our dataset contains 768 rows, one per
person, with 8 parameters important
for predicting diabetes: Pregnancies,
Glucose, Blood Pressure, Skin Thickness,
Insulin, BMI, Diabetes Pedigree Function,
and Age.
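A minimal sketch of how the 8 parameters could be separated from the label before modelling. The column spellings (other than Insulin and DiabetesPedigreeFunction, which appear later in the slides), the label column name `Outcome`, and the two sample rows are assumptions for illustration, not taken from the project files.

```python
import pandas as pd

# Assumed column names for the 8 parameters listed on the slide.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET = "Outcome"  # hypothetical name for the diabetes label column


def split_features_target(df):
    """Return (X, y) after checking that every expected column is present."""
    missing = [c for c in FEATURES + [TARGET] if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df[FEATURES], df[TARGET]


# Two invented rows standing in for the real 768-row dataset.
sample = pd.DataFrame(
    [[2, 120, 70, 25, 80, 28.5, 0.45, 35, 0],
     [5, 160, 82, 30, 150, 33.1, 0.80, 50, 1]],
    columns=FEATURES + [TARGET],
)
X, y = split_features_target(sample)
print(X.shape, y.shape)
```

With the real file, `sample` would be replaced by the loaded DataFrame, and `X.shape` would be expected to report 768 rows and 8 columns.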
DETECT OUTLIERS AND NULL VALUES
During preprocessing, we specifically
handled outliers in the Insulin and
DiabetesPedigreeFunction columns to
improve data quality and prevent
biased predictions.
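The slides do not say which outlier method was used. The sketch below assumes one common choice: count null values per column, then cap values outside 1.5 times the interquartile range (IQR). The helper name `cap_outliers_iqr` and the sample values are hypothetical.

```python
import pandas as pd


def cap_outliers_iqr(series, k=1.5):
    """Clip values lying more than k * IQR beyond the first or third quartile."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Invented values; the last row plays the role of an extreme outlier.
df = pd.DataFrame({
    "Insulin": [80.0, 90.0, 100.0, 110.0, 120.0, 846.0],
    "DiabetesPedigreeFunction": [0.2, 0.3, 0.4, 0.5, 0.6, 2.4],
})

# Null check: number of missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# Cap outliers only in the two columns named on the slide.
for col in ["Insulin", "DiabetesPedigreeFunction"]:
    df[col] = cap_outliers_iqr(df[col])
print(df)
```

Capping keeps every row, which matters for a dataset of only 768 records; dropping the outlying rows instead would be an equally valid assumption.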
Models Used
We applied and compared several
machine learning models using
Python, Pandas, and Scikit-learn:
• Logistic Regression
• Random Forest Classifier
• Decision Tree Classifier
• Support Vector Machine (SVM)
• K-Nearest Neighbors (KNN)
• AdaBoost
• Gradient Boosting Classifier
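A sketch of how the seven listed models could be fitted and compared on training and validation accuracy with Scikit-learn. It runs on synthetic data from `make_classification` (an assumption, sized to mimic 768 rows and 8 features), so the numbers it prints are not the project's results. Feature scaling for SVM and KNN, the 80/20 split, and all settings are also assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    # SVM and KNN are distance based, so features are scaled first.
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Store (training accuracy, validation accuracy) for each model.
    results[name] = (
        model.score(X_train, y_train),
        model.score(X_val, y_val),
    )

for name, (train_acc, val_acc) in results.items():
    print(f"{name:20s} train={train_acc:.3f} val={val_acc:.3f}")
```

Recording both scores side by side makes it easy to spot models whose training accuracy is far above their validation accuracy.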
Key Learning
For models where we initially
observed overfitting (such as Random
Forest, Decision Tree, and Gradient
Boosting), we applied
hyperparameter tuning using
GridSearchCV. This helped optimize
performance and reduce overfitting,
making it one of the most impactful
parts of our project.
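A sketch of hyperparameter tuning with GridSearchCV for a Decision Tree. The parameter grid, the 5-fold setting, and the synthetic data are assumptions; the slides do not list the values the team actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Untuned tree: with no depth limit it can memorise the training set.
baseline = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Hypothetical grid: shallower trees and larger leaves limit overfitting.
param_grid = {
    "max_depth": [2, 3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)
tuned = search.best_estimator_

print("best params:", search.best_params_)
for label, model in [("baseline", baseline), ("tuned", tuned)]:
    train_acc = model.score(X_train, y_train)
    val_acc = model.score(X_val, y_val)
    print(f"{label:8s} train={train_acc:.3f} val={val_acc:.3f}")
```

GridSearchCV picks the combination with the best cross-validated score, so the selected tree is chosen on held-out folds and not on how well it fits the training rows.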
Objective

By identifying important risk factors and accurately predicting
the likelihood of diabetes, this system can assist healthcare
providers in early diagnosis, proactive intervention, and better
patient care.
RESULTS
Algorithm                       Training Accuracy   Validation Accuracy
Decision Tree Classifier        82.2%               85.5%
Random Forest Classifier        89.2%               87.6%
Support Vector Machine (SVM)    76.22%              77.27%
Logistic Regression             76.87%              79.87%
KNN                             79.80%              73.38%
AdaBoost                        80.29%              79.22%
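One way to read the reported accuracies is to compute the gap between training and validation accuracy for each model; a large positive gap is a common sign of overfitting. The sketch below uses only the figures reported in the results, and the variable names are illustrative.

```python
# (training accuracy %, validation accuracy %) as reported in the results.
reported = {
    "Decision Tree Classifier": (82.2, 85.5),
    "Random Forest Classifier": (89.2, 87.6),
    "Support Vector Machine (SVM)": (76.22, 77.27),
    "Logistic Regression": (76.87, 79.87),
    "KNN": (79.80, 73.38),
    "AdaBoost": (80.29, 79.22),
}

# Gap in percentage points: positive means training beats validation.
gaps = {
    name: round(train - val, 2)
    for name, (train, val) in reported.items()
}

best_validation = max(reported, key=lambda name: reported[name][1])
largest_gap = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name:30s} gap={gap:+.2f}")
print("highest validation accuracy:", best_validation)
print("largest train-validation gap:", largest_gap)
```

On these figures, Random Forest has the highest validation accuracy (87.6%), while KNN shows the largest drop from training to validation (6.42 points).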


In this project, we aim to use
machine learning techniques to improve
the accuracy of diabetes prediction by
analysing complex datasets and
identifying patterns that traditional
methods may overlook.
The objective includes comparing
different classification models to
determine how effectively each
predicts diabetes, which helps
identify the most reliable methods for
healthcare applications.
Comparing different algorithms helps
identify the best-performing model for a
specific dataset. Techniques such as
cross-validation and ROC-AUC curves
support this comparison.
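A sketch of how cross-validation and ROC-AUC could be combined to compare two of the models. The slides do not show how these techniques were applied, so the 5-fold stratified setup, the two models chosen, and the synthetic data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the real dataset (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
}

auc_scores = {}
for name, model in candidates.items():
    # One ROC-AUC value per fold, computed on that fold's held-out rows.
    auc_scores[name] = cross_val_score(
        model, X, y, cv=cv, scoring="roc_auc"
    )

for name, scores in auc_scores.items():
    print(f"{name:20s} mean AUC={scores.mean():.3f} std={scores.std():.3f}")
```

Averaging over folds gives a steadier estimate than a single train/validation split, and ROC-AUC does not depend on one fixed decision threshold.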
