Machine Learning Exam Paper GR20D5129

This document is a machine learning exam question paper in two parts. Part A consists of 10 short-answer questions worth 2 marks each, on topics such as reinforcement learning, overfitting, entropy, linear discriminant analysis, K-means clustering, and active learning. Part B consists of 5 long-answer questions worth 10 marks each, on topics including computer vision applications, logistic regression, decision trees, clustering, dimensionality reduction, and sequential data modeling. Students must answer all questions within the 3-hour duration, for a total of 70 marks.

Uploaded by

SH Gaming


CODE: GR20D5129 GR 20 SET 4

M.Tech Year I Semester Regular Examinations, October/November 2021
Machine Learning
(Data Science)
Max Marks: 70
Time: 3 hours

Instructions:
1. Question paper comprises of Part-A and Part-B.
2. Part-A (for 20 marks) must be answered at one place in the answer book.
3. Part-B (for 50 marks) consists of five questions with internal choice, answer all questions.

PART A
(Answer ALL questions. All questions carry equal marks)
10 * 2 = 20 Marks

1. a. How Reinforcement learning is different from Unsupervised Learning. [2]
b. Define the term Overfitting. Give one solution for it. [2]
c. Give the formula for Entropy. What is the role of Information gain in constructing the decision trees? [2]
d. What is role of Linear Discriminant Analysis in Machine learning? [2]
e. Compare Mixture and Latent factor models. [2]
f. Write the importance of K-Means Clustering. [2]
g. Write about R²-Score and its significance. [2]
h. Give the significance of Hyperparameter Optimization. [2]
i. How does Machine Learning deal with sparse data? [2]
j. What is the advantage of Active learning? [2]

PART B
(Answer ALL questions. All questions carry equal marks)
5 * 10 = 50 Marks

2. (a) Discuss on any three applications related to Computer Vision. [10]
(b) Define the terms bias and variance. Elaborate on bias-variance trade off.

3. (a) Explain in detail about learning curves. [10]
(b) How Machine Learning and Deep Learning are related? Explain regularization in deep learning.

4. (a) What is Logistic Regression? Explain the mathematical modelling involved in Logistic Regression classifier with an example. [10]
(b) Elaborate on Distance based Methods.

5. Consider the following data and construct the decision tree by using ID3 algorithm. [10]

Age   Competition   Type       Profit (Class)
Old   Yes           Software   Down
Old   No            Software   Down
Old   No            Hardware   Down
Mid   Yes           Software   Down
Mid   Yes           Hardware   Down
Mid   No            Hardware   Up
Mid   No            Software   Up
New   Yes           Software   Up
New   No            Hardware   Up
New   No            Software   Up

6. (a) Write about Density based clustering algorithms. [10]
(b) Does K-means come under Agglomerative or Hierarchical clustering? Justify your answer with a numerical example.
OR
7. Describe PCA and Kernel PCA Matrix Factorization approach for Dimensionality Reduction. [10]

8. (a) Explain need of Ensemble models. Write about Boosting and Bagging. [10]
(b) Explain Mean Square Error and R² metrics effects Regression analysis.
OR
9. (a) Discuss on evaluating machine learning algorithms. [10]
(b) What is Pipeline? How this mechanism will increase accuracy of the models? Explain with relevant example.

10. (a) Explain RNN algorithm for sequential data. [10]
(b) Explain the need of Active Learning in Machine Learning.
OR
11. (a) Elaborate the procedure for Time Series analysis for Stock market data. [10]
(b) What distinguishes reinforcement learning from deep learning and machine learning?

****


Common questions

The entropy of a dataset \( S \) is \( H(S) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i) \), where \( P(x_i) \) is the proportion of examples in \( S \) that belong to class \( x_i \) and \( n \) is the number of classes. Information gain is the key metric used to construct decision trees. It measures the reduction in entropy (impurity) obtained by splitting the dataset on an attribute: \( IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} H(S_v) \), where \( S_v \) is the subset of \( S \) in which attribute \( A \) takes the value \( v \). At each node, the algorithm chooses the attribute with the highest information gain, that is, the one whose branches give the largest reduction in weighted entropy. The choice is greedy, so the resulting tree is not guaranteed to be globally optimal, but attributes chosen this way tend to separate the classes quickly and produce compact trees.
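As a minimal sketch of these two formulas, the Python below computes entropy and information gain for the ten-row Age / Competition / Type / Profit table from the ID3 question in this paper. The rows are a reconstruction of the OCR-damaged table, so treat them as an assumption; the function names `entropy` and `information_gain` are illustrative only.

```python
from collections import Counter
from math import log2

# Rows are (Age, Competition, Type, Profit). ASSUMPTION: reconstructed from the
# garbled exam table, so individual values may differ from the printed paper.
ROWS = [
    ("Old", "Yes", "Software", "Down"),
    ("Old", "No", "Software", "Down"),
    ("Old", "No", "Hardware", "Down"),
    ("Mid", "Yes", "Software", "Down"),
    ("Mid", "Yes", "Hardware", "Down"),
    ("Mid", "No", "Hardware", "Up"),
    ("Mid", "No", "Software", "Up"),
    ("New", "Yes", "Software", "Up"),
    ("New", "No", "Hardware", "Up"),
    ("New", "No", "Software", "Up"),
]
ATTRS = {"Age": 0, "Competition": 1, "Type": 2}


def entropy(labels):
    """H(S) = -sum p_i * log2(p_i) over the class proportions p_i."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the class column minus the weighted entropy after splitting on `col`."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, col) for name, col in ATTRS.items()}
best = max(gains, key=gains.get)  # attribute chosen for the root node
print(gains, best)
```

On this reconstructed data the class entropy is 1 bit (five Down, five Up), Age has the largest gain (0.6) because its Old and New branches are pure, and Type has zero gain, so Age would be the root split.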

Machine Learning handles sparse data, that is, data in which most entries are zero or missing (as in bag-of-words text features or user-item matrices in collaborative filtering), with techniques that either accommodate or reduce the sparsity. One common approach is regularization: L1 regularization (Lasso) drives many model coefficients to exactly zero, so the model itself becomes sparse. Another is matrix factorization, which is well suited to recommendation systems: a sparse matrix is approximated by the product of two lower-dimensional matrices that capture the latent patterns. Feature selection reduces dimensionality by retaining only the most informative features. Techniques such as sparse coding and compressive sensing explicitly represent data as a sparse combination of basis elements, which suits high-dimensional, sparse settings. Together, these strategies keep sparse data from degrading the performance, efficiency, and scalability of Machine Learning models.
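One way to "accommodate" sparsity, sketched here under simple assumptions, is to store only the non-zero entries and to compute on those alone. The helper names `to_sparse` and `sparse_dot` are hypothetical, and the two vectors stand in for toy bag-of-words counts.

```python
def to_sparse(dense):
    """Keep only the non-zero entries as {index: value}."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    """Dot product that visits only indices stored in the smaller vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)


# Toy word-count vectors over an 8-word vocabulary (illustrative data).
doc1 = to_sparse([0, 3, 0, 0, 1, 0, 0, 0])
doc2 = to_sparse([0, 2, 0, 5, 0, 0, 0, 0])
similarity = sparse_dot(doc1, doc2)
print(doc1, doc2, similarity)
```

The cost of `sparse_dot` depends on the number of stored non-zeros rather than on the vocabulary size, which is the property that makes sparse representations practical for very high-dimensional features.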

Mixture Models and Latent Factor Models are both probabilistic, but they differ in purpose and method. Mixture Models, such as Gaussian Mixture Models, assume that each data point is generated by one of several component distributions, each representing a different cluster or group. The hidden variable is discrete (which component produced the point), so these models capture population heterogeneity and are mainly used for clustering. Latent Factor Models, such as those used in collaborative filtering, assume that the observed data is driven by a small number of continuous unobserved (latent) factors. They aim to uncover the factors responsible for the observed correlations and are commonly used in recommendation systems to model interactions between entities such as users and items. In short, Mixture Models assign data points to groups, while Latent Factor Models describe each observation by hidden continuous factors.
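The contrast can be sketched in a few lines of Python, assuming a one-dimensional mixture of two Gaussians and a two-factor rating model. All parameter values below are made up for illustration, and `responsibilities` and `predict_rating` are hypothetical helper names.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, std):
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))


def responsibilities(x, weights, means, stds):
    """Mixture view: posterior probability that each component generated x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def predict_rating(user_factors, item_factors):
    """Latent factor view: an observation is a dot product of hidden factor vectors."""
    return sum(u * v for u, v in zip(user_factors, item_factors))


# A point halfway between two equally weighted components belongs equally to both.
resp = responsibilities(0.0, weights=[0.5, 0.5], means=[-1.0, 1.0], stds=[1.0, 1.0])
# A user with factors (1.0, 0.5) and an item with factors (2.0, 4.0).
rating = predict_rating([1.0, 0.5], [2.0, 4.0])
print(resp, rating)
```

The mixture side answers "which group did this point come from?", while the latent factor side answers "what value do these hidden traits predict?".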

Linear Discriminant Analysis (LDA) is used in Machine Learning primarily for supervised dimensionality reduction and classification. It projects data from a higher-dimensional space onto a lower-dimensional one (at most \( C - 1 \) dimensions for \( C \) classes) while preserving the separability of the classes. LDA chooses the projection that maximizes the ratio of between-class variance to within-class variance, so that the classes remain as distinct as possible in the smaller subspace. Unlike PCA, which maximizes variance without regard to class labels, LDA uses the label information explicitly, which makes it better suited to classification tasks where class separability matters. LDA is therefore most useful when the goal is to find the feature space that best discriminates between known classes.
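For two classes in two dimensions, the direction that maximizes this ratio has the closed form \( w = S_W^{-1}(m_1 - m_0) \), where \( S_W \) is the within-class scatter matrix and \( m_0, m_1 \) are the class means. The sketch below computes it in plain Python on made-up points; the function names are illustrative, and a real project would normally rely on a linear algebra library instead of a hand-written 2x2 inverse.

```python
def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix sum((x - m)(x - m)^T)."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class0, class1):
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter Sw = [[a, b], [b, d]] is the sum of both class scatters.
    a, b, d = (s0[i] + s1[i] for i in range(3))
    det = a * d - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = Sw^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    return [(d * dx - b * dy) / det, (a * dy - b * dx) / det]


def project(points, w):
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Illustrative data: the classes overlap on each single axis.
class0 = [(1, 1), (2, 2), (3, 2), (4, 3)]
class1 = [(1, 3), (2, 4), (3, 4), (4, 5)]
w = fisher_direction(class0, class1)
proj0, proj1 = project(class0, w), project(class1, w)
print(w, proj0, proj1)
```

Neither coordinate separates these two classes on its own (both contain points with y = 3), yet the one-dimensional projections onto `w` do not overlap, which is the effect the ratio criterion is designed to achieve.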

Active Learning is beneficial in Machine Learning because the learner selectively queries the most informative data points for labeling, which reduces the overall labeling cost and lets the model reach good performance with fewer labeled instances. This is particularly advantageous where labeling is expensive, time-consuming, or requires expert input, such as medical diagnosis or fine-tuning language models, where large labeled datasets are scarce. Compared with random sampling, Active Learning concentrates on data points that are likely to refine the decision boundary or fill gaps in the model's current knowledge. This selective querying gives the model the most value per label, which makes it an effective strategy when resources are constrained.
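Uncertainty sampling is one common query strategy and is the one sketched here. The sketch assumes a binary classifier has already produced a predicted probability for each unlabeled item; the item ids, probabilities, and function names are all hypothetical.

```python
def uncertainty(p_positive):
    """Distance from a coin flip: values near 0.0 mean the model is very unsure."""
    return abs(p_positive - 0.5)


def select_queries(pool_probs, budget):
    """Return ids of the `budget` unlabeled items the model is least sure about."""
    ranked = sorted(pool_probs, key=lambda item_id: uncertainty(pool_probs[item_id]))
    return ranked[:budget]


# Predicted probability of the positive class for each unlabeled item (made up).
pool_probs = {"x1": 0.97, "x2": 0.52, "x3": 0.10, "x4": 0.45, "x5": 0.80}
to_label = select_queries(pool_probs, budget=2)
print(to_label)
```

Here the two items closest to probability 0.5 (`x2` and `x4`) are sent to the annotator, while confidently classified items such as `x1` are left unlabeled. In a full loop the model would be retrained after each batch of new labels and the pool re-scored.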

The R-Square, or R² Score, is a statistical measure of the proportion of the variance in the dependent variable that is explained by the independent variable or variables in a regression model. It is computed as \( R^2 = 1 - \frac{SS_{res}}{SS_{tot}} \), where \( SS_{res} \) is the sum of squared residuals and \( SS_{tot} \) is the total sum of squares around the mean of the observed values. The R² Score is important because it gives a single number for the goodness-of-fit of the model, with values closer to 1 indicating that the model reproduces the observed outcomes more closely. However, a high R² does not necessarily mean the model is good: for a least-squares fit, R² on the training data never decreases when more predictors are added, even if they do not improve prediction, so a high value can be a symptom of overfitting. R² should therefore be interpreted in the context of model complexity and the characteristics of the data, for example alongside adjusted R² or a score computed on held-out data.
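The definition can be checked by hand with a short Python sketch. The function `r2_score` below is a hand-written helper for illustration (not a library call), and the observed and predicted values are made up.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

r2 = r2_score(y_true, y_pred)             # SS_res = 0.75, SS_tot = 20
mse = mean_squared_error(y_true, y_pred)  # 0.75 / 4
baseline = r2_score(y_true, [6.0] * 4)    # always predicting the mean
print(r2, mse, baseline)
```

Predicting the mean for every example gives R² = 0, which is why R² is often read as "improvement over the mean-only baseline"; a model that does worse than that baseline gets a negative score.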

K-Means clustering is significant in Machine Learning because it provides a simple and efficient way to partition data into distinct groups, which supports data analysis and summarization. The algorithm partitions the dataset into \( k \) clusters: each data point is assigned to the cluster with the nearest mean (centroid), which serves as the prototype of that cluster, and the centroids are then recomputed until the assignments stop changing. Its key advantages are efficiency on large datasets and a straightforward implementation. K-Means also has notable limitations. It is sensitive to the initial placement of the centroids, so different runs can give different results. It implicitly assumes that clusters are roughly spherical and similar in size, which may not match the real cluster structure, so it struggles with clusters of varying size and density. It is also not robust to outliers and noise, and \( k \) must be chosen in advance, which is why careful preprocessing and parameter selection matter.
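The assign-then-update loop can be sketched in plain Python for one-dimensional data. This is a minimal illustration of the standard iteration (often called Lloyd's algorithm) with hand-picked initial centroids, not a production implementation; the function name and data are assumptions.

```python
def kmeans_1d(points, centroids, max_iters=100):
    """Alternate nearest-centroid assignment and mean update until nothing changes."""
    for _ in range(max_iters):
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)
        new_centroids = [
            sum(c) / len(c) if c else centroids[j]  # an empty cluster keeps its centroid
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters


points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 2.0])
print(centroids, clusters)
```

Starting from the poor initial centroids 1.0 and 2.0, the first pass puts almost everything in the second cluster, the second pass corrects this, and the third pass confirms convergence at centroids 2.0 and 11.0. On less well-separated data, different starting centroids can converge to different partitions, which is the initialization sensitivity described above.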

Hyperparameter Optimization is the process of finding the best combination of hyperparameters for a Machine Learning model. Hyperparameters are configuration values, such as the learning rate, the regularization strength, or the depth of a tree, that are not learned from the training data but are set before the learning process starts. The optimization is crucial because hyperparameters can significantly affect the model's predictive power, convergence, and computational cost. Poorly chosen values can lead to underfitting, overfitting, or inefficient learning. Techniques such as grid search, random search, and Bayesian optimization explore the hyperparameter space systematically, scoring each candidate setting on validation data. Well-chosen values help the model extract meaningful patterns from the data, which improves generalization and predictive performance on unseen data.
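A grid search can be sketched with a deliberately tiny model: a one-parameter ridge regression \( y \approx w x \) whose only hyperparameter is the penalty strength. The data, the candidate grid, and the helper names below are assumptions chosen so that the arithmetic is easy to follow.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form slope minimizing sum((y - w*x)^2) + lam * w^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(xs, ys, w):
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, val, lambdas):
    """Fit once per candidate value and score each fit on the validation set."""
    scores = {}
    for lam in lambdas:                # hyperparameter: fixed before fitting
        w = fit_ridge_1d(*train, lam)  # parameter: learned from the training data
        scores[lam] = mse(*val, w)
    best = min(scores, key=scores.get)
    return best, scores


train = ([1.0, 2.0], [3.0, 6.0])  # noisy training targets suggest slope 3
val = ([1.0, 2.0], [2.0, 4.0])    # validation targets follow slope 2
best_lam, scores = grid_search(train, val, lambdas=[0.0, 2.5, 10.0])
print(best_lam, scores)
```

With no penalty the model copies the training slope and validates poorly, with too much penalty it is shrunk too far, and the middle value wins. Selecting on a separate validation set, rather than on the training data, is what keeps the search from simply rewarding the least-regularized model.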

Overfitting occurs in Machine Learning when a model learns not only the underlying pattern in the training data but also its noise and outliers, so that it performs well on the training data but poorly on unseen data. It indicates that the model has become too complex and too specific to the training dataset. One common solution is regularization, which adds a penalty on large coefficients to the training objective (L1 or L2 regularization in linear models). Regularization keeps the model complexity in check and helps the model generalize to new data by discouraging it from fitting the noise.
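How an L2 penalty enters training can be sketched with gradient descent on a one-weight linear model. The objective assumed here is mean squared error plus `l2 * w**2`; the data, learning rate, and function name are illustrative choices.

```python
def train_linear(xs, ys, l2=0.0, lr=0.01, epochs=2000):
    """Gradient descent on mean squared error plus an L2 penalty l2 * w**2."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_mse = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        grad_penalty = 2 * l2 * w  # the regularization term pulls w toward 0
        w -= lr * (grad_mse + grad_penalty)
    return w


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w_plain = train_linear(xs, ys, l2=0.0)   # converges to the unpenalized slope 2
w_ridge = train_linear(xs, ys, l2=1.0)   # converges to 28/17, a smaller slope
print(w_plain, w_ridge)
```

The penalized weight is smaller in magnitude than the unpenalized one. On this noise-free toy data the shrinkage only adds bias, but on noisy data the same mechanism trades a little bias for lower variance, which is how regularization counters overfitting.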

Reinforcement Learning (RL) differs from Unsupervised Learning in how the learner obtains its signal. In RL, an agent learns to make decisions by taking actions in an environment so as to maximize cumulative reward, without being told the correct action. Learning is driven by the feedback the agent receives for its actions in the form of rewards or penalties. Unsupervised Learning, in contrast, finds hidden patterns or intrinsic structure in input data that has no labels: the algorithm receives no reward or other explicit signal of success and must learn the structure from the data alone. RL therefore focuses on sequential decision-making guided by rewards, while Unsupervised Learning focuses on organizing data and discovering its inherent patterns.
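The reward-driven side of this comparison can be sketched with tabular Q-learning on a hypothetical four-state chain, where the agent is rewarded only on reaching the rightmost state. To keep the example deterministic, it sweeps over every state-action pair instead of exploring at random, which is a simplification of how an agent would normally gather experience; the environment and constants are assumptions.

```python
GOAL, GAMMA, ALPHA = 3, 0.9, 0.5
ACTIONS = (-1, +1)  # move left or right along the chain of states 0..3


def step(state, action):
    """Environment: reward 1.0 only when the agent reaches the goal state."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}
for _ in range(500):  # deterministic sweeps instead of random exploration
    for s, a in list(q):
        nxt, reward = step(s, a)
        future = 0.0 if nxt == GOAL else max(q[(nxt, b)] for b in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted best future value.
        q[(s, a)] += ALPHA * (reward + GAMMA * future - q[(s, a)])

policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy, q[(0, +1)])
```

The learned policy moves right from every state, and the value of moving right from state 0 approaches 0.9² = 0.81, reflecting the two discounted steps before the reward. An unsupervised method given the same state data would receive no reward signal and could only describe the structure of the states, not decide which action is better.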

Common questions

What is the formula for entropy, and how does information gain utilize entropy in decision tree construction?

How does Machine Learning handle sparse data, and what techniques are used to address sparsity issues?

What are the differences between Mixture Models and Latent Factor Models in Machine Learning?

What role does Linear Discriminant Analysis (LDA) play in Machine Learning, and how does it differ from other dimensionality reduction techniques?

What are the benefits of Active Learning in Machine Learning, and in what scenarios is it particularly advantageous?

Explain the significance of the R-Square (or R² Score) in regression analysis.

Why is K-Means clustering significant in Machine Learning, and what are its primary limitations?

What is Hyperparameter Optimization, and why is it crucial in Machine Learning model performance?

What is overfitting in Machine Learning, and how can it be mitigated?

How do Reinforcement Learning and Unsupervised Learning differ in their approach to learning from data?
