Machine Learning Assignment 4 Guide

This document outlines Assignment-4 for the Machine Learning course, detailing instructions for submission, including individual work requirements and plagiarism policies. It consists of three main tasks involving Decision Trees, Random Forests, Ensemble Methods, and K-Means clustering, with specific points allocated for each section. Students must submit their work in a specified format, including a .zip file containing code and reports, by the deadline of April 7, 2024.

CSE343/CSE543/ECE363/ECE563: Machine Learning

Winter 2025

Assignment-4 (30 points)

Release: April 1, 2024 (Tuesday) Submission: 11:59 pm April 7, 2024 (Monday)

Instructions
• Institute Plagiarism Policy Applicable. Both programming and theoretical questions will be subject
to a strict plagiarism check.
• This assignment should be attempted individually. All questions are compulsory.
• Theory [T]: For theory questions, only hand-written solutions are acceptable. Attempt each question on a
different sheet & staple them together (for ease of checking). Do not start a new question on the back of
the previous one. Do not forget to write the page number (bottom centre) and your credentials (bottom right)
on each sheet. Solutions must be submitted in the assignment submission box kept in class during class time.
• Programming [P]: For programming questions, only the Python programming language is allowed.
You must submit a single .py file named A4 [Link]. Make sure the submission is self-contained &
replicable, i.e., you are able to reproduce your results with the submitted files only. Use a random seed wherever
applicable to retain reproducibility.
• [Link]: Create a .pdf report of the programming questions that contains your applied approach,
preprocessing, assumptions, analysis, visualizations, etc. Anything not in the report will not be evaluated.
Alternatively, a well-documented .ipynb file (in addition to the single .py file mentioned in the previous bullet) with
answers to all the questions may be submitted as a report. The report must be named A4 RollNo [Link]
or A4 RollNo [Link].
• File Submission: Submit a .zip file named A4 [Link] (e.g., A4 [Link]) containing the report and
code files.
• Submission Policy: Turn in your submission as early as possible to avoid late submissions. In case of
multiple submissions, the latest submission will be evaluated. Late submissions will not be evaluated
and hence will be awarded zero marks.
• Clarifications: Symbols have their usual meaning. Assume any missing information & mention it in the
report. You are allowed to use any machine learning library unless the question explicitly states that
it is supposed to be done from scratch. You can always use basic Python libraries such as numpy, pandas,
and matplotlib, unless specified otherwise. Use Google Classroom for any queries. In order to keep it fair for
all, no email queries will be entertained. You may attend office/TA hours for personal resolutions. Start your
assignment early. No queries will be answered in Google Classroom comments during the last time.
• Compliance: The questions in this assignment are structured to meet the Course Outcomes CO1, CO2,
CO3, and CO4, as described in the course directory.

• There could be multiple ways to approach a question. Please justify your answers mathematically in the
theory questions, and via commented text in the programming questions appropriately. Questions without
justification will get zero marks.
1. Decision Trees and Random Forest (10 points)
[P ∥ CO1, CO2, CO3, CO4] Bank Marketing Data
(a) Preprocess the "Bank Marketing" dataset: handle missing values, encode categorical
variables, and split the dataset into training and testing sets. (1 point)
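The three preprocessing steps in part (a) can be sketched with the standard library alone. This is only an illustrative sketch: the function names (`fill_missing_with_mode`, `one_hot`, `train_test_split`), the use of `None` as the missing-value marker, mode imputation, and the 80/20 split are all assumptions, not requirements of the assignment. Rows are assumed to be dictionaries keyed by column name.

```python
import random
from collections import Counter

def fill_missing_with_mode(rows, column, missing=None):
    """Replace entries equal to `missing` with the most frequent observed value."""
    observed = [r[column] for r in rows if r[column] != missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [{**r, column: mode if r[column] == missing else r[column]} for r in rows]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy with a fixed seed; the first n_test shuffled rows form the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Fixing the seed makes the split reproducible. Fitting imputation values and category lists on the training split only is one way to avoid leaking test information.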
(b) Decision Tree Classifier:
i. Implement a Decision Tree classifier from scratch (without using high-level libraries like scikit-learn).
Your implementation should include:
• Building the decision tree using appropriate splitting criteria (e.g., information gain, Gini
impurity).
• Pruning techniques to prevent overfitting.
• Handling continuous and categorical features.
(3 points)
ii. Evaluate the performance of your Decision Tree classifier on the test set using appropriate metrics.
You should report accuracy, precision, recall, F1-score, area under the ROC curve. (1 point)
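As an illustration of the splitting criterion in part (b), the sketch below computes Gini impurity and searches one continuous feature for the threshold that minimises the weighted impurity of the two children. The names `gini` and `best_threshold` are hypothetical, and candidate thresholds are assumed to be midpoints between consecutive distinct values. Categorical features, recursion, and pruning are not shown.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    The threshold is None when no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full tree would call `best_threshold` on every feature at every node and recurse on the two resulting subsets until a stopping rule (for example a maximum depth) is met.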
(c) Random Forest Classifier:
i. Implement a Random Forest classifier from scratch (without using high-level libraries). Your
implementation should include:
• Building individual decision trees on bootstrapped samples of the training data.
• Combining the predictions of multiple decision trees using techniques like majority voting (for
classification) or averaging (for regression).
• Tuning hyperparameters like the number of trees, maximum depth, and others.
(3 points)
ii. Evaluate the performance of your Random Forest classifier on the test set using appropriate metrics.
You should report accuracy, precision, recall, F1-score, area under the ROC curve. (1 point)
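The two ensemble ingredients named in part (c), bootstrapped training samples and majority voting, can be sketched as follows. The helper names are hypothetical. The per-tree random feature subset of size roughly the square root of the feature count is a common Random Forest convention that is assumed here; the assignment text does not specify it.

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement, keeping features and labels aligned."""
    n = len(X)
    indices = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in indices], [y[i] for i in indices]

def random_feature_subset(n_features, rng):
    """Pick about sqrt(n_features) distinct feature indices (assumed convention)."""
    k = max(1, int(n_features ** 0.5))
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """predictions holds one list of labels per tree; returns one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]
```

Passing a single `random.Random(seed)` instance as `rng` keeps the whole forest reproducible from one seed.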
(d) Compare the performance of your Decision Tree and Random Forest classifiers, and discuss the strengths
and weaknesses of each approach. (1 point)
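A comparison like the one in part (d) rests on the metrics listed in parts (b) and (c). The sketch below computes them from scratch under the assumption of a binary problem with the positive class encoded as 1. The function names are hypothetical, and the AUC uses the pairwise-ranking definition, which is quadratic in the number of samples and therefore only suited to small test sets.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Note that `roc_auc` needs a continuous score (for example the fraction of trees voting for the positive class), not the hard 0/1 prediction.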

2. Ensemble Methods for Credit Card Default Prediction (13 points)

[P ∥ CO1, CO2, CO3, CO4] Credit Card Default Data
(a) Preprocess the "Credit Card Default" dataset: handle missing values, encode categorical
variables, and split the data into training and testing sets. (1 mark)
(b) Individual Classifiers:
i. Implement the following individual classifiers from scratch (without using high-level libraries like
scikit-learn):
• Logistic Regression
• Decision Tree Classifier
• K-Nearest Neighbors Classifier
(3 marks)
ii. Evaluate the performance of each individual classifier on the test set using appropriate metrics (e.g.,
accuracy, precision, recall, F1-score, area under the ROC curve). (1 mark)
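Of the three individual classifiers in part (b), K-Nearest Neighbors is the shortest to sketch. The version below assumes numeric feature vectors, Euclidean distance, and an unweighted majority vote among the k nearest training points. The name `knn_predict` and the default `k=3` are illustrative choices only.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query point.
    neighbours = sorted(
        ((math.dist(row, x), label) for row, label in zip(X_train, y_train)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in neighbours[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

Because the vote depends on raw distances, features on very different scales would normally be standardised first; that step is left to the preprocessing in part (a).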
(c) Ensemble Methods:
i. Implement the following ensemble methods from scratch:
• Bagging (Bootstrap Aggregating) with Decision Trees
• Boosting (AdaBoost) with Decision Trees
(3 marks)
ii. Evaluate the performance of your ensemble models on the test set using the same metrics as in part
(b)(ii). You should report accuracy, precision, recall, F1-score, area under the ROC curve. (1.5 marks)
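For the boosting half of part (c), the core of discrete AdaBoost is the per-round weight update. The sketch below assumes labels encoded as -1 and +1 and a weak learner (for example a depth-1 decision tree) whose predictions are already available. The function names and the clipping constant `eps` are assumptions added for illustration.

```python
import math

def adaboost_round(weights, y_true, y_pred, eps=1e-10):
    """One AdaBoost round for labels in {-1, +1}; returns (alpha, new_weights)."""
    # Weighted error of the weak learner under the current sample weights.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, eps), 1 - eps)  # clip to avoid log(0) and division by zero
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p == -1) are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

def adaboost_predict(alphas, learner_predictions):
    """Sign of the alpha-weighted sum of each weak learner's -1/+1 predictions."""
    return [
        1 if sum(a * p for a, p in zip(alphas, votes)) >= 0 else -1
        for votes in zip(*learner_predictions)
    ]
```

After each round the misclassified samples carry half of the total weight, which forces the next weak learner to concentrate on them.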
(d) Compare the performance of your ensemble models with the individual classifiers, and discuss the
strengths and weaknesses of the ensemble approaches. (1 mark)
(e) Experiment with five different sets of hyperparameters (e.g., number of estimators, maximum depth of
trees, splitting criteria) for your ensemble models and analyze their impact
on performance. (2.5 marks)
3. K-Means and Gaussian Mixture Models on the Iris Dataset (7 points)
[P ∥ CO1, CO2, CO3, CO4] Iris Dataset Analysis
(a) Load and preprocess the Iris dataset, handling any missing values and scaling the features as necessary.
(1 mark)
(b) K-Means Clustering Algorithm:
i. Implement the K-Means clustering algorithm from scratch:
• Implement the algorithm using random initialization and the k-means++ initialization method.
• Experiment with different values of K (number of clusters) and determine the optimal value
using evaluation metrics like the Silhouette score or the Elbow method.
• Visualize the clusters using scatter plots and t-SNE plots.
(5 marks)
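The two initialization schemes in part (b) differ only in how the starting centres are chosen. A minimal sketch follows, assuming points are tuples of numbers and that the data contain at least k distinct points. The function names, the `init` flag values, and the choice to keep the old centre when a cluster becomes empty are illustrative assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, rng):
    """k-means++: first centre uniform at random, each later centre drawn with
    probability proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j])) for p in points]

def kmeans(points, k, init="k-means++", max_iter=100, seed=0):
    """Lloyd's algorithm; returns (centres, labels, inertia)."""
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng) if init == "k-means++" else rng.sample(points, k)
    for _ in range(max_iter):
        labels = assign(points, centres)
        new_centres = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the previous centre if a cluster ends up empty.
            new_centres.append(
                tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centres[j]
            )
        if new_centres == centres:
            break
        centres = new_centres
    labels = assign(points, centres)
    inertia = sum(sq_dist(p, centres[label]) for p, label in zip(points, labels))
    return centres, labels, inertia
```

The returned inertia (within-cluster sum of squared distances) is the quantity plotted against K in the Elbow method.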
(c) Compare the performance of K-Means with random initialization and k-means++ initialization on the
Iris dataset using appropriate evaluation metrics. (1 mark)
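One candidate metric for the comparison in part (c) is the Silhouette score mentioned in part (b). The sketch below assumes Euclidean distance and the common convention that a point in a singleton cluster scores 0. The function name is hypothetical, and the computation is quadratic in the number of points, which is acceptable for a dataset the size of Iris.

```python
import math

def silhouette_score(points, labels):
    """Mean over all points of (b - a) / max(a, b), where a is the mean distance
    to the point's own cluster and b the mean distance to the nearest other cluster."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton cluster: contributes 0 by convention
        # The sum includes the zero distance from p to itself, hence len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)
```

Running both initializations over several seeds and reporting the mean and spread of this score, alongside the inertia and the number of iterations to converge, is one way to make the comparison less dependent on a single lucky start.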
