Machine Learning Concepts and Techniques

Uploaded by

Manshi Singh

GURU TEGH BAHADUR 4TH CENTENARY ENGINEERING COLLEGE

Session: 2022-2026

Assignment-1
Machine Learning (CIE-421T)
Part A – Short Answer
Q-1(a) What do you understand by noise in data? What could be implications on the result, if noise is not
treated properly?

Ans- Noise refers to irrelevant, random, or meaningless data that does not represent the true characteristics of the
underlying pattern.

Implications:

• Leads to poor model accuracy and unreliable predictions.

• Increases the risk of overfitting, as the model tries to learn random fluctuations.

Q-1(b) What do you understand by overfitting of data? Give any two methods to avoid overfitting.

Ans- Overfitting occurs when a model learns not only the underlying pattern but also the noise in the training data,
performing well on training data but poorly on unseen data.

Methods to avoid overfitting:

• Regularization (L1/L2 penalties, Dropout)

• Cross-validation (to detect overfitting and tune model complexity) or using more training data

Q-1(c) When should we use classification over regression? Explain using example.

Ans- Classification is used when the output variable is categorical (discrete labels). Regression is used when the
output variable is continuous numeric.

Example:

• Classification: Predicting if an email is Spam or Not Spam.

• Regression: Predicting the price of a house based on area, location, etc.

Q-1(d) Define the terms – Precision, Recall, F1-score and Accuracy.

Ans- Here TP, FP, TN and FN are the numbers of true positives, false positives, true negatives and false negatives.

• Precision: Fraction of correctly predicted positive cases out of all predicted positives. Precision = TP / (TP + FP)
• Recall: Fraction of correctly predicted positive cases out of all actual positives. Recall = TP / (TP + FN)
• F1-score: Harmonic mean of Precision and Recall. F1 = 2 × (Precision × Recall) / (Precision + Recall)
• Accuracy: Proportion of correctly classified instances. Accuracy = (TP + TN) / (TP + FP + TN + FN)
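
The four definitions can be checked with a short calculation. The sketch below is only illustrative: the function name and the confusion-matrix counts are hypothetical and not part of the assignment.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, recall, f1, accuracy) from confusion-matrix counts."""
    # Guard against empty denominators (e.g. no predicted positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, f1, accuracy


# Hypothetical counts, chosen only for illustration.
precision, recall, f1, accuracy = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(round(precision, 3), round(recall, 3), round(f1, 3), round(accuracy, 3))
```

With these counts precision is 40/50, recall is 40/45, and accuracy is 85/100, so precision and recall differ even though both are computed from the same true positives.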

Q-1(e) Define LDA and mention any two limitations.

Ans- Linear Discriminant Analysis (LDA) is a supervised dimensionality reduction and classification technique that
projects data onto a lower-dimensional space by maximizing the separation between classes while minimizing within-class variance.

Limitations:

• Assumes classes are normally distributed with equal covariance matrices (may not hold in real data).
• Performs poorly if classes are not linearly separable.

Part B – Descriptive / Analytical

Q-2(a) Differentiate between Supervised Learning and Unsupervised Learning.

Ans-

| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Definition | Learning with labeled data (input–output pairs given). | Learning with unlabeled data (only input, no output labels). |
| Goal | Predict outputs for new inputs (classification/regression). | Find hidden patterns, structure, or groupings. |
| Examples | Predicting house prices, spam detection, disease diagnosis. | Customer segmentation, market basket analysis, anomaly detection. |
| Algorithms | Linear Regression, Logistic Regression, Decision Trees, SVM. | k-Means, Hierarchical Clustering, PCA. |
| Output | Predict a class label or continuous value. | Discover clusters or reduce dimensionality. |

Q-3(b) Explain Generative Probabilistic Classification.

Ans- A classification approach where we model the joint probability distribution P(x, y) and then use Bayes'
theorem to compute the posterior probability P(y | x).

Working:

• Estimate the likelihood P(x | y).

• Estimate the prior probability P(y).

• Use Bayes' theorem: P(y | x) = P(x | y) × P(y) / P(x), where P(x) = Σ_y P(x | y) × P(y).

• Assign the class with the maximum posterior probability (MAP decision rule).

Example:

• The Naïve Bayes classifier assumes features are conditionally independent given the class.
• With classes {Spam, Not Spam}, we compute which class has the higher posterior probability given the words in the email.

Advantage: Works well even with small data, simple and fast.

Limitation: Strong independence assumptions may reduce accuracy.
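
As a minimal sketch of the working steps above, the following word-count Naïve Bayes classifier estimates P(y) and P(word | y) from a tiny training set and picks the class with the larger posterior. The four training emails, the function names, and the use of add-one (Laplace) smoothing are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy training set: (words in the email, class label).
train = [
    (["win", "money", "now"], "Spam"),
    (["free", "money", "offer"], "Spam"),
    (["meeting", "schedule", "now"], "Not Spam"),
    (["project", "meeting", "notes"], "Not Spam"),
]

classes = sorted({label for _, label in train})
vocab = {word for words, _ in train for word in words}

# Prior P(y): fraction of training emails in each class.
prior = {c: sum(1 for _, y in train if y == c) / len(train) for c in classes}

# Word counts per class, used to estimate the likelihood P(word | y).
counts = {c: Counter() for c in classes}
for words, y in train:
    counts[y].update(words)


def log_likelihood(word, c):
    # Add-one smoothing (an assumption here) avoids zero probabilities.
    return math.log((counts[c][word] + 1) / (sum(counts[c].values()) + len(vocab)))


def predict(words):
    # Score = log P(y) + sum of log P(word | y). P(x) is identical for every
    # class, so it can be dropped when choosing the maximum posterior.
    scores = {
        c: math.log(prior[c]) + sum(log_likelihood(w, c) for w in words if w in vocab)
        for c in classes
    }
    return max(scores, key=scores.get)


print(predict(["free", "money"]))
print(predict(["project", "meeting"]))
```

Working in log space turns the product of per-word likelihoods (the independence assumption) into a sum, which avoids numerical underflow on long emails.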

Q-4(c) Discuss Bagging and Boosting.

Ans- Bagging (Bootstrap Aggregating):

• Train multiple models on different bootstrapped subsets of the data (samples drawn with replacement).

• Aggregate results (majority vote for classification, average for regression).
• Reduces variance, which helps limit overfitting.
• Example: Random Forest, which bags decision trees and also considers a random subset of features at each split.

Boosting:
• Sequentially train models where each new model focuses on correcting errors made by the previous ones.
• Combine weak learners into a strong learner (weighted voting).
• Primarily reduces bias; it can also reduce variance.
• Examples: AdaBoost, Gradient Boosting, XGBoost.

Difference:

• Bagging = models trained independently (can run in parallel); mainly reduces variance.

• Boosting = models trained sequentially; mainly reduces bias and can also reduce variance.
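
A minimal sketch of bagging, assuming one-dimensional inputs and decision stumps as the base model (both are simplifications chosen for brevity; the function names and the toy dataset are hypothetical):

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a 1-D stump that predicts 1 if x > threshold, else 0."""
    xs = sorted({x for x, _ in sample})
    candidates = [xs[0] - 1] + xs  # xs[0] - 1 means "predict 1 everywhere"

    def errors(t):
        return sum((x > t) != bool(y) for x, y in sample)

    return min(candidates, key=errors)


def bagging_fit(data, n_models, rng):
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))
    return models


def bagging_predict(models, x):
    # Aggregate by majority vote over the individual stumps.
    votes = Counter(int(x > t) for t in models)
    return votes.most_common(1)[0][0]


# Hypothetical dataset: label is 1 when x >= 5.
data = [(x, int(x >= 5)) for x in range(10)]
models = bagging_fit(data, n_models=25, rng=random.Random(0))
print([bagging_predict(models, x) for x in (0, 3, 6, 9)])
```

Each stump sees a slightly different sample, so the learned thresholds vary; the vote averages out that variation, which is the variance reduction described above. Boosting would instead train the stumps one after another on reweighted data.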

Q-5(a). Explain Bayesian Estimation and Maximum Likelihood Estimation in generative learning.
Ans- Maximum Likelihood Estimation (MLE):

• Finds the parameter values that maximize the likelihood of the observed data. It does not use prior information.
• Example: For a Gaussian distribution, estimate the mean μ and variance σ² that maximize the likelihood of the data.
• Limitation: Can overfit when data is scarce, since it ignores prior knowledge.

Bayesian Estimation:

• Uses Bayes' theorem to combine the likelihood with a prior distribution over the parameters, giving a posterior
distribution for the parameters.
• Provides a distribution over parameters instead of a single point estimate.
• Advantage: Represents uncertainty explicitly, and the prior helps reduce overfitting on small datasets.
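
The contrast between the two estimators can be sketched numerically. The Gaussian part follows the example above; the coin-toss part is a swapped-in illustration (not from the assignment) that assumes a Beta(2, 2) prior, chosen because it gives a closed-form posterior. All data values are hypothetical.

```python
def gaussian_mle(data):
    """MLE of a Gaussian: sample mean and variance (dividing by n, not n - 1)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, var


def bernoulli_mle(heads, tosses):
    # MLE of the heads probability: the observed fraction of heads.
    return heads / tosses


def beta_posterior(heads, tosses, alpha=2.0, beta=2.0):
    # Beta(alpha, beta) prior combined with the observed tosses gives a
    # Beta(alpha + heads, beta + tails) posterior.
    return alpha + heads, beta + (tosses - heads)


def beta_mean(a, b):
    return a / (a + b)


print(gaussian_mle([2, 4, 4, 4, 5, 5, 7, 9]))  # (5.0, 4.0)

heads, tosses = 3, 3  # hypothetical: three heads in three tosses
print(bernoulli_mle(heads, tosses))  # 1.0
a, b = beta_posterior(heads, tosses)
print(beta_mean(a, b))  # 5/7, about 0.714
```

After three heads in three tosses, MLE concludes the coin always lands heads, while the posterior mean stays closer to 0.5 because of the prior. This is the overfitting behaviour the two lists above describe.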

Q-6(b). Explain the Decision Tree Algorithm with example.

Ans- A decision tree is a supervised learning algorithm that splits data into branches based on attribute values.
Each internal node represents an attribute, branches represent decisions, and leaves represent outcomes.

• Select the best feature to split on, using metrics like Information Gain (Entropy) or Gini Index.
• Create a decision node for the chosen feature.
• Split the dataset into subsets, one per value of that feature.
• Repeat recursively on each subset until a stopping criterion is met (pure nodes or a depth limit).

Example: Predicting "Play Tennis" from weather attributes such as Outlook (Sunny, Overcast, Rain).

Dataset (Weather → Play Tennis):

• Features: Outlook (Sunny, Overcast, Rain), Temperature, Humidity, Wind.

• Target: Play (Yes/No).

Advantages: Easy to interpret and visualize.

Disadvantage: Prone to overfitting.
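
To illustrate how the best feature is selected, the sketch below computes entropy and the information gain of splitting on Outlook. The Yes/No counts are an assumption: they are taken from the commonly used 14-row textbook version of the Play Tennis data, which the answer above does not list.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class-count tuple such as (yes, no)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def information_gain(parent_counts, child_counts):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - weighted


# Assumed (Yes, No) counts for the whole dataset and for each Outlook value.
play = (9, 5)
outlook = {"Sunny": (2, 3), "Overcast": (4, 0), "Rain": (3, 2)}

gain = information_gain(play, list(outlook.values()))
print(round(entropy(play), 3), round(gain, 3))
```

The Overcast branch is already pure (all Yes), so it contributes zero entropy. The same calculation would be repeated for Temperature, Humidity and Wind, and the feature with the largest gain becomes the root node.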

Part C – Long Answer

Q4(a) Write a short note on Support Vector Machine (SVM).

Ans: SVM is a supervised learning algorithm that classifies data by finding an optimal hyperplane.

Working Principle:

• It finds an optimal hyperplane that best separates data points of different classes.
• For linearly separable data, the hyperplane maximizes the margin, i.e. the distance between the hyperplane and
the nearest data points; those nearest points are called support vectors.

Types:

• Linear SVM – works when data is (approximately) linearly separable.

• Non-linear SVM – uses kernel functions (RBF, polynomial) to implicitly map data into a higher-dimensional space where a linear separator can be found.

Advantages: Effective in high-dimensional spaces, including cases where the number of dimensions exceeds the number of samples.

Applications: Text classification, image recognition, bioinformatics.
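
The geometry of the working principle can be sketched with a fixed hyperplane. Nothing is trained here: the weight vector w, the bias b, and the two sample points are hypothetical values chosen so that the points lie exactly on the margin.

```python
import math


def decision_value(w, b, x):
    """Signed score w . x + b; its sign gives the predicted side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    # Distance between the margin hyperplanes w . x + b = +1 and w . x + b = -1.
    return 2 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane x1 + x2 - 3 = 0 (assumed, not learned from data).
w, b = (1.0, 1.0), -3.0

print(decision_value(w, b, (2, 2)), predict(w, b, (2, 2)))
print(decision_value(w, b, (1, 1)), predict(w, b, (1, 1)))
print(margin_width(w))
```

The points (2, 2) and (1, 1) score exactly +1 and -1, so they would play the role of support vectors for this hyperplane. An SVM solver searches for the w and b that make the margin width as large as possible.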

Q-5(b) Explain Logistic Regression.

Ans: Logistic regression is a classification algorithm that outputs a probability between 0 and 1, used when the
dependent variable is categorical.

Concept:

• Instead of predicting values directly (like linear regression), it predicts the probability of a data point
belonging to a class.
• Uses the sigmoid function: P(y=1 | x) = 1 / (1 + e^(−(β0 + β1x)))

Decision Rule: If probability > 0.5 → Class 1, else Class 0.

Advantages: Simple, interpretable, works well for binary classification.

Applications: Spam detection, disease prediction, customer churn analysis.
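
A minimal sketch of the sigmoid and the decision rule for a single feature x. The coefficients β0 = -4 and β1 = 2 are hypothetical; in practice they would be fitted to training data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, beta0, beta1):
    """P(y = 1 | x) for a single-feature logistic regression model."""
    return sigmoid(beta0 + beta1 * x)


def predict_class(x, beta0, beta1, threshold=0.5):
    # Decision rule: Class 1 if the probability exceeds the threshold, else Class 0.
    return 1 if predict_proba(x, beta0, beta1) > threshold else 0


beta0, beta1 = -4.0, 2.0  # hypothetical coefficients, not fitted
for x in (1.0, 2.0, 3.0):
    print(x, round(predict_proba(x, beta0, beta1), 3), predict_class(x, beta0, beta1))
```

At x = 2 the linear term is exactly 0, so the probability is 0.5 and the point sits on the decision boundary; with the strict "> 0.5" rule it is assigned to Class 0.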

Q6(c) Write the AdaBoost Algorithm.

Ans: AdaBoost (Adaptive Boosting) is an ensemble method that combines multiple weak classifiers to form a
strong classifier.

Algorithm Steps:

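• Start by assigning equal weights to all N training samples: w_i = 1/N.

• Train a weak classifier h_t (e.g., a decision stump) on the weighted samples.
• Compute its weighted error ε_t and its vote weight α_t = ½ ln((1 − ε_t) / ε_t).
• Increase the weights of misclassified samples, decrease the weights of correctly classified ones, and normalize.
• Train the next classifier, which therefore focuses more on the difficult cases.
• The final model is formed by a weighted vote of all classifiers: H(x) = sign(Σ_t α_t h_t(x)).

Advantages: Often improves accuracy substantially over a single weak learner, mainly by reducing bias.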

Limitation: Sensitive to noisy data.

Example: Used in face detection in computer vision.
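
The steps can be sketched for one-dimensional data with decision stumps as weak classifiers. This is a simplified illustration under stated assumptions, not a reference implementation: labels are assumed to be +1/-1, and the function names and toy dataset are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: returns polarity for x < threshold, else -polarity."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, w):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for t in [x + 0.5 for x in [min(xs) - 1] + xs]:
        for polarity in (1, -1):
            err = sum(wi for wi, x, y in zip(w, xs, ys)
                      if stump_predict(x, t, polarity) != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best


def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n  # equal initial weights
    ensemble = []
    for _ in range(rounds):
        err, t, pol = best_stump(xs, ys, w)  # train the weak classifier
        err = max(err, 1e-10)  # guard against log of infinity
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight
        ensemble.append((alpha, t, pol))
        # Misclassified samples (y * h(x) = -1) get larger weights.
        w = [wi * math.exp(-alpha * y * stump_predict(x, t, pol))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]  # normalize
    return ensemble


def predict(ensemble, x):
    # Final model: sign of the alpha-weighted vote.
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data that no single stump can classify correctly.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)  # True
```

On this data the best single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all of them correctly.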

Q-7(a) What are the Goals of Machine Learning?

Ans:

• Automation of tasks – reduce human effort by making machines learn from data.
• Prediction – forecast future trends (e.g., stock prices, disease risk).
• Classification – assign data into categories (e.g., spam filtering).
• Clustering/Pattern discovery – find hidden structures in data (e.g., customer segmentation).
• Decision making – assist in making intelligent decisions based on data.
• Adaptation – improve system performance automatically with experience.

Q-8(b) Explain Overfitting.

Ans: Overfitting occurs when a model learns the training data too closely, even memorizing noise and irrelevant
details.

Symptoms: High accuracy on training data and low accuracy on test data (poor generalization).

Causes: Model is too complex or training data is insufficient.

Solutions:

• Use cross-validation.
• Apply regularization (L1, L2).
• Prune decision trees.

Example: Memorizing exam questions leads to a high score in practice (training) but failure in a new exam (test).

Q-9(c) What is Nearest Neighbor?

Ans: k-Nearest Neighbor (k-NN) is an instance-based (lazy) learning algorithm used for classification and regression;
it predicts from the k closest training points in feature space.

Algorithm (KNN):

• Choose a value of k.
• Calculate the distance (e.g., Euclidean or Manhattan) between the test point and all training points.
• Select the k nearest neighbors.
• For classification: assign the most frequent class among the neighbors.
• For regression: take the average of the neighbors' values.

Advantages: Simple, no explicit training phase, and works for both classification and regression.

Limitations: Computationally expensive for large datasets, sensitive to noise.

Example: Handwritten digit recognition (MNIST dataset).
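
A minimal sketch of the algorithm using Euclidean distance (`math.dist`, available from Python 3.8). The function names and the small training sets are hypothetical.

```python
import math
from collections import Counter


def nearest(train, query, k):
    """Return the k training items closest to query (Euclidean distance)."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]


def knn_classify(train, query, k):
    # Majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k):
    # Average of the k nearest neighbors' target values.
    return sum(value for _, value in nearest(train, query, k)) / k


# Hypothetical labeled points forming two clusters.
train_cls = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
             ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]
print(knn_classify(train_cls, (2, 2), k=3))  # A
print(knn_classify(train_cls, (6, 5), k=3))  # B

# Hypothetical 1-D regression data.
train_reg = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]
print(knn_regress(train_reg, (2,), k=3))  # 20.0
```

Every prediction sorts the whole training set by distance, which is why the method becomes computationally expensive for large datasets.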

Q-10(d) Describe the Limitations of Perceptron Model.

Ans:

• Linear separability: The major limitation of the perceptron is that it can only classify linearly separable data.
If the classes cannot be separated by a straight line (more generally, a hyperplane), as in the XOR problem,
the perceptron cannot learn a correct classifier, because it cannot represent non-linear decision boundaries.
• No probabilistic interpretation: It produces a hard decision (0 or 1) rather than probabilities, which
makes it unsuitable for tasks requiring uncertainty estimation.
• Fixed learning rate: The learning rate is fixed, and choosing an inappropriate value can cause slow
convergence or oscillations during training.
• Single-layer structure: It has a single-layer architecture, meaning it lacks the hidden layers required to
model complex relationships between inputs and outputs.
• Sensitivity to noise: The perceptron is sensitive to noisy and overlapping data. When such data are not
linearly separable, training does not converge, which can lead to incorrect classifications.
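
The linear-separability limitation can be demonstrated by training the same perceptron on AND (linearly separable) and XOR (not linearly separable). This is a sketch under stated assumptions: a step activation, zero initial weights, a learning rate of 1, and a 20-epoch limit are all choices made for illustration.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Train a two-input perceptron; return (weights, bias, converged)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output
            if error:
                # Perceptron learning rule: move the boundary toward the sample.
                w[0] += lr * error * x1
                w[1] += lr * error * x2
                b += lr * error
                mistakes += 1
        if mistakes == 0:
            return w, b, True  # a full pass with no errors: converged
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(train_perceptron(AND)[2])  # True
print(train_perceptron(XOR)[2])  # False
```

For XOR no setting of the two weights and the bias classifies all four points, so an error-free pass never occurs however many epochs are allowed. Adding a hidden layer removes this restriction.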
