Machine Learning - Question

Machine Learning QuestionBank

Uploaded by

Kaustubh Desale

Machine Learning

Module 1:
1. Explain any five business applications of Machine Learning. (5/20)
2. Write a short note on issues in Machine Learning. (10/24)
3. Consider the use case of email spam detection. Identify and explain the suitable
machine learning technique for this task. (10/24)
4. How do you choose the right ML algorithm? (5/23)
5. Explain any five applications of Machine Learning. (10/23)
6. Explain how to choose the right algorithm for a machine learning application. (10/23)
7. Explain the terms overfitting, underfitting, and the bias-variance tradeoff w.r.t. Machine
Learning. (10/23)
8. What are the issues in Machine Learning? (5/22)
9. Explain the steps of developing Machine Learning applications. (10/22)
10. Define Machine Learning and explain, with an example, the importance of Machine
Learning. (5/19)

Module 2:
1. Explain performance evaluation metrics for binary classification with a suitable
example. (5/24)
2. Explain the Gini index along with an example. (5/24)
3. Consider the example below where the mass, 𝑦 (grams), of a chemical is related to the
time, 𝑥 (seconds), for which the chemical reaction has been taking place according to
the table. Find the equation of the regression line. Also explain performance
evaluation measures for regression.

(10/24)
4. Explain clustering with minimal spanning tree along with an example. Consider the
dataset given below with 3 features (Color, Wig, Num. Ears) and one output variable,
Emotion.

● Find the root node of the decision tree using the Gini index.

● Explain techniques that can be used to handle overfitting in decision trees. (10/24)
5. Explain Regression line, Scatter plot, Error in prediction and Best fitting line. (5/23)
6. Explain the concept of Logistic Regression (5/23)
7. Explain Multivariate Linear regression method. (10/23)
8. Create a decision tree using the Gini index to classify the following dataset for profit.
Find SVD for A = (10/23)

9. Linear Regression (10/23)

10. Explain any five performance measures along with an example. (5/23)
11. Differentiate between Logistic Regression and Support Vector Machine. (5/23)
12. Explain the following: Receiver Operating Characteristic (ROC) curve and Area Under
the Curve (AUC). (10/23)
13. Explain the concept of regression and list its types. A clinical trial gave the data for
BMI and cholesterol level for 10 patients as shown in the table below. Identify the
machine learning method used to solve this problem and predict the likely value
of cholesterol level for someone who has a BMI of 27. (10/23)

14. Explain the concept of a decision tree. Consider the dataset given in the table below. The
dataset has 3 features (Past Trend, Open Interest, Trading Volume) and one class
label, Return. Compute the Gini index for all features and specify which node will
be chosen as the root node of the decision tree. (10/23)

15. Explain Regression line, Scatter plot, Error in prediction and Best fitting line. (5/22)
16. Explain Logistic Regression. (5/22)
17. Explain Linear regression along with an example. (10/22)

18. (10/22)
19. Performance Metrics for Classification (10/22)
20. List some advantages of derivative-based optimization techniques. Explain the Steepest
Descent method for optimization. (10/19)
21. Explain various basic evaluation measures of supervised learning algorithms for
classification. (10/19)
22. Consider the following table for binary classification. Calculate the root of the decision
tree using the Gini index. (10/19)

23. Logistic Regression (5/19)

Module 3:
1. Explain the concept of k-fold cross validation. (5/24)
2. Compare Bagging and Boosting with reference to ensemble learning. Explain how
these methods help to improve the performance of the machine learning model. (10/24)
3. Explain the ensemble learning algorithm Random Forest and its use cases in real-world
applications. (10/24)
4. Explain the Random Forest algorithm in detail. (10/23)
5. Explain the concept of bagging and boosting. (10/23)
6. Explain the necessity of cross validation in Machine Learning applications and k-fold
cross validation in detail. (10/23)
7. Explain different ways to combine classifiers. (10/23)
8. Explain the Random Forest algorithm in detail. (10/22)
9. Explain the different ways to combine the classifiers. (10/22)

Module 4:
1. Define the following terminologies with reference to Support Vector Machine:
Hyperplane, Support Vectors, Hard Margin, Soft Margin, Kernel. (10/24)
2. Describe Multiclass classification. (10/23)
3. Explain support vector machine as a constrained optimization problem. (10/23)
4. Explain the kernel trick in support vector machine. (10/24)
5. Explain multiclass classification techniques. (10/23)
6. Explain the concept of margin and support vector. (5/22)
7. Describe Multiclass classification. (10/22)
8. Why is SVM more accurate than logistic regression? (5/19)
9. Explain Radial Basis Function with an example. (5/19)
10. Define Support Vector Machine. Explain how the margin is computed and the optimal
hyperplane is decided. (10/19)

Module 5:
1. What is density-based clustering? Explain the steps used for a clustering task using the
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
algorithm. (10/24)
2. Explain the K-means algorithm. (5/23)
3. DBSCAN (10/23)
4. Explain clustering with minimal spanning tree with reference to graph-based
clustering. (10/23)
5. Explain the concept of the Expectation Maximization algorithm. (10/23)
6. Explain the DBSCAN algorithm along with an example. (10/23)
7. Explain the distance metrics used in clustering. (5/22)
8. Explain the EM algorithm. (10/22)
9. DBSCAN (10/22)
10. EM Algorithm (5/19)

Module 6:
1. What is dimensionality reduction? Explain how it can be utilized for classification and
clustering tasks in Machine Learning. (5/24)
2. Explain the dimensionality reduction technique Linear Discriminant Analysis and its
real-world applications. (10/24)
3. Explain the concept of feature selection and extraction. (5/23)
4. Linear Discriminant Analysis for Dimension Reduction (10/23)
5. Explain Linear Discriminant Analysis. (5/23)
6. Explain in detail Principal Component Analysis for dimensionality reduction. (10/23)
7. Compute the Linear Discriminant projection for the following two-dimensional
dataset: X1 = (x1, x2) = {(4,1), (2,4), (2,3), (3,6), (4,4)} and X2 = (x1, x2) = {(9,10),
(6,8), (9,5), (8,7), (10,8)}. (10/22)
8. Principal Component Analysis for Dimension Reduction (10/22)
9. What is dimensionality reduction? Describe how Principal Component Analysis is
carried out to reduce the dimensionality of data sets. (10/19)

10. Find the singular value decomposition of (10/19)

Common questions

Dimensionality reduction improves model performance by reducing the number of input variables, mitigating the curse of dimensionality, and improving interpretability without significant loss of information. Techniques like Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) help uncover latent structure, leading to more manageable and computationally efficient models. This often results in faster training, less overfitting, and better generalization to new data.
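As a rough illustration of PCA, the sketch below finds the first principal component of a small dataset by power iteration on the sample covariance matrix, then projects the centered data onto it. The function name `first_principal_component`, the toy data, and the choice of power iteration (instead of a full eigendecomposition) are assumptions made for this example only.

```python
def first_principal_component(data, iterations=200):
    """Return (direction, projected) for the leading principal component.

    Uses power iteration on the sample covariance matrix. Assumes the
    starting vector of all ones is not orthogonal to the leading direction.
    """
    n, d = len(data), len(data[0])
    # Center each feature on its mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # all variance is zero; no direction to find
        v = [x / norm for x in w]
    # One-dimensional representation of every sample.
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projected


# Hypothetical toy data lying exactly on the line y = 2x.
toy_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projected = first_principal_component(toy_data)
print(direction)   # unit vector along (1, 2)
print(projected)   # 1-D coordinates of the four points
```

Because the toy points are perfectly collinear, a single component captures all of the variance, so the two input features reduce to one without losing information.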

Logistic Regression is preferable when interpretability is key, since its coefficients give direct insight into each feature's contribution. It works well for binary classification with approximately linear decision boundaries and is computationally cheaper on large datasets. Conversely, SVM is better suited to non-linear and high-dimensional data because of its kernel functions. Logistic Regression is also preferred when outliers are not a major concern for the problem.

The ROC curve plots the true positive rate against the false positive rate at various threshold settings, visualizing classifier performance across thresholds. AUC is the area under the ROC curve and summarizes the model's ability to distinguish between the classes. A higher AUC indicates better performance: 1.0 corresponds to perfect separation and 0.5 to no discriminative power (random guessing). Both help in model selection by comparing the diagnostic ability of different classifiers.
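The threshold sweep described above can be sketched in plain Python. The helper names `roc_points` and `auc`, and the toy labels and scores, are assumptions for illustration; the sketch also assumes both classes are present so that neither rate divides by zero.

```python
def roc_points(y_true, scores):
    """Return ROC points (FPR, TPR), starting at (0, 0) and ending at (1, 1).

    y_true holds 0/1 labels; scores are classifier outputs where a higher
    value means "more likely positive".
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Lower the threshold one distinct score value at a time.
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical labels and scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(area)  # 8 of the 9 positive/negative pairs are ranked correctly
```

The AUC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, which is why one mis-ranked pair out of nine gives 8/9 here.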

Binary classification performance metrics include precision, recall, F1-score, and accuracy, all of which measure how well a model distinguishes between two classes. The Receiver Operating Characteristic (ROC) curve and Area Under the Curve (AUC) show how the trade-off between false positives and false negatives changes with the decision threshold. In regression tasks, metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared measure how well a model predicts continuous values by quantifying prediction error and the proportion of variance explained by the model.
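A minimal sketch of both metric families, computed from scratch so the formulas are visible. The function names and the toy labels and values are hypothetical; the zero-denominator fallbacks to 0.0 are a convention chosen for this example.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MAE, MSE and R-squared; assumes y_true is not constant."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"mae": mae, "mse": mse, "r2": r2}


# Hypothetical predictions for illustration.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0],
                             [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(clf)
print(reg)
```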

SVM is formulated as a constrained optimization problem whose goal is to find the hyperplane that maximizes the margin between the classes in feature space. The hyperplane must separate the data points with the widest possible margin, subject to the constraint that the points are classified correctly (exactly in the hard-margin case, or with penalized violations in the soft-margin case). The problem is typically solved through Lagrange multipliers, and kernel functions extend it from linear to non-linear decision boundaries.
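For reference, the standard textbook form of this optimization problem is written out below. The notation is an assumption of this sketch and does not come from the question bank: training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, slack variables $\xi_i$, penalty parameter $C$, multipliers $\alpha_i$, and kernel $K$.

```latex
% Hard-margin primal: maximizing the margin 2/||w|| is equivalent to
\min_{w,\,b} \ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{s.t.} \quad y_i \,(w^{\top} x_i + b) \ge 1, \qquad i = 1,\dots,n

% Soft-margin primal: slack variables allow penalized violations
\min_{w,\,b,\,\xi} \ \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i \,(w^{\top} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Lagrangian dual, where the kernel K replaces the inner product
\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

Training points with $\alpha_i > 0$ in the dual solution are the support vectors; with the linear kernel $K(x_i, x_j) = x_i^{\top} x_j$ the dual corresponds to the soft-margin primal above.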

Overfitting in decision trees occurs when the model becomes too complex and captures noise along with the actual data patterns. Techniques to address this include pruning, which removes sections of the tree that provide little predictive power, setting a maximum depth for the tree, and using ensemble methods like Random Forests to average out individual tree overfitting tendencies.

The Gini Index measures the impurity of a dataset split, that is, how mixed the class labels are within each resulting subset. The feature with the lowest weighted Gini Index after a split is selected as the root node because it best separates the data into the desired classes. This helps optimize the tree structure, making decisions more efficient and improving prediction accuracy.
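The root-node selection described above can be sketched as follows. The toy dataset and the helper names `gini_impurity`, `weighted_gini`, and `best_root` are hypothetical; the sketch assumes categorical features and a multiway split on every feature value.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, feature, target):
    """Size-weighted Gini impurity after splitting rows on each feature value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    n = len(rows)
    return sum(len(g) / n * gini_impurity(g) for g in groups.values())


def best_root(rows, features, target):
    """Pick the feature with the lowest weighted Gini impurity."""
    scores = {f: weighted_gini(rows, f, target) for f in features}
    return min(scores, key=scores.get), scores


# Hypothetical toy data: "outlook" separates the classes perfectly.
toy_rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
root, scores = best_root(toy_rows, ["outlook", "windy"], "play")
print(root, scores)
```

Here splitting on `outlook` yields two pure subsets (weighted Gini 0.0), while splitting on `windy` leaves both subsets evenly mixed (weighted Gini 0.5), so `outlook` becomes the root.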

Bagging, or Bootstrap Aggregating, trains multiple versions of a model on subsets of the data sampled with replacement and averages (or votes on) their outputs to improve stability and accuracy. Boosting builds a sequence of models in which each one corrects the errors of its predecessors, placing more emphasis on hard-to-learn instances at every iteration. Bagging mainly reduces variance, while boosting mainly reduces bias, and both thereby improve generalization.
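A minimal sketch of the bagging half of this comparison, using one-dimensional threshold rules ("decision stumps") as base learners and majority voting. The base learner, the function names, the fixed polarity (predict 1 above the threshold), and the toy data are all assumptions for illustration; boosting is not shown.

```python
import random


def fit_stump(xs, ys):
    """Fit a 1-D rule 'predict 1 when x > threshold' with the fewest errors."""
    # Include a threshold below every sample so "always 1" is possible.
    candidates = [min(xs) - 1.0] + sorted(set(xs))
    best_threshold, best_errors = None, None
    for t in candidates:
        errors = sum(1 for x, y in zip(xs, ys) if (1 if x > t else 0) != y)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def fit_bagging(xs, ys, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx],
                                    [ys[i] for i in idx]))
    return thresholds


def predict_bagging(thresholds, x):
    """Majority vote over all stumps (n_models is odd, so no ties)."""
    votes = sum(1 for t in thresholds if x > t)
    return 1 if votes * 2 > len(thresholds) else 0


# Hypothetical toy data: class 0 for small x, class 1 for large x.
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = fit_bagging(xs, ys)
print([predict_bagging(models, x) for x in (0.5, 2.5, 7.5, 9.5)])
```

Each stump sees a slightly different sample, so individual thresholds vary; voting smooths out that variation, which is the variance reduction the paragraph above refers to.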

Selecting a suitable machine learning algorithm involves understanding the problem type (classification, regression, clustering, etc.), the data characteristics, and computational efficiency. One must consider the size and nature of the dataset, the desired interpretability of the model, potential overfitting concerns, and computational resources. Comparing performance metrics across different models using validation techniques like cross-validation can also help in choosing the right algorithm.
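The cross-validation step mentioned above relies on splitting the data into k folds, each used once as the held-out set. The sketch below generates those index splits; the function name `k_fold_indices` is hypothetical, and the sketch assumes the samples have already been shuffled, since it uses contiguous folds.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    The first n_samples % k folds get one extra sample so that every
    sample appears in exactly one test fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, n_samples)))
        yield train_idx, test_idx
        start += size


# Example: 10 samples, 3 folds.
folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

To compare candidate algorithms, each one would be trained on `train_idx` and scored on `test_idx` for every fold, and the average scores compared.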

Machine learning models face several issues that impact their deployment, such as overfitting, where a model captures noise along with the underlying pattern in the data; underfitting, where the model is too simple to capture the underlying pattern; the bias-variance tradeoff, which involves balancing model complexity and prediction accuracy; data quality and quantity issues, as insufficient or poor-quality data can lead to inaccurate models; and computational complexity, which affects model scalability and real-time processing.


Common questions

Discuss the importance of dimensionality reduction in machine learning and its impact on model performance.

In what scenarios is Logistic Regression preferable over Support Vector Machines (SVM), and why?

Explain the role of Receiver Operating Characteristics (ROC) and Area Under Curve (AUC) in assessing the performance of a binary classifier.

How do performance metrics differ for binary classification tasks compared to regression tasks?

What is the concept of a Support Vector Machine (SVM) as a constrained optimization problem?

Describe the concept of overfitting and the techniques used to address it in decision trees.

How is the Gini Index used to determine the root node in a decision tree, and why is it important?

In what ways do bagging and boosting differ, and how do these ensemble methods improve model accuracy?

How would you select an appropriate machine learning algorithm for a specific business use case?

What are the potential issues faced in machine learning applications and how do they impact the deployment of these models?
