Bayesian Learning in Machine Learning

The document discusses Bayesian learning in machine learning, focusing on the principles of probability-based learning and the Naïve Bayes model. It explains key concepts such as prior, likelihood, and posterior probabilities, along with various classification algorithms derived from Bayes' theorem. Additionally, it covers techniques for handling continuous attributes and different types of Naïve Bayes classifiers, including Gaussian, Bernoulli, and Multinomial Naïve Bayes.

Uploaded by

emanikanta535

MACHINE LEARNING(BCS602)

MODULE 4
CHAPTER 8
BAYESIAN LEARNING
8.1 INTRODUCTION TO PROBABILITY-BASED LEARNING
Probability-based learning is a valuable paradigm in machine learning and AI, particularly
when dealing with real-world problems that involve uncertainty and incomplete information.
It allows for more principled and robust decision-making in situations where deterministic
models may fall short.

8.2 FUNDAMENTALS OF BAYES THEOREM

The Naïve Bayes model relies on Bayes' theorem, which works with three kinds of
probability: prior probability, likelihood probability, and posterior
probability.
Prior Probability
It is the general probability of an uncertain event before an observation is seen or
some evidence is collected. It is the initial probability that is believed before any
new information is collected.
Likelihood Probability
Likelihood probability is the relative probability of the observation occurring
for each class, that is, the sampling density of the evidence given the
hypothesis. It is stated as P(Evidence | Hypothesis), which denotes how likely
the evidence is to occur if the hypothesis is true.
Posterior Probability
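It is the updated or revised probability of an event, taking into account the
observations from the training data. P(Hypothesis | Evidence) is the posterior
distribution representing the belief about the hypothesis, given the evidence
from the training data. Informally,
Posterior probability = prior probability + new evidence
Here "+" means that the prior is updated by the evidence, not arithmetic
addition: by Bayes' theorem, the posterior is proportional to the likelihood
multiplied by the prior.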

Deepa S, [Link] CSE,RNSIT 1

MACHINE LEARNING(BCS602)

8.3 CLASSIFICATION USING BAYES MODEL

Naive Bayes classifiers are a collection of classification algorithms based on Bayes'
theorem. Naive Bayes is not a single algorithm but a family of algorithms that share a
common principle: every pair of features is assumed to be independent of each other, given
the class. Bayes' theorem gives the probability of an event occurring given that another
event has already occurred. It is stated mathematically as the following equation:

\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]

P(A|B) is the posterior probability: the probability of hypothesis A given the observed event B.

P(B|A) is the likelihood probability: the probability of the evidence B given that
hypothesis A is true.

P(A) is the prior probability: the probability of the hypothesis before observing the evidence.

P(B) is the marginal probability: the probability of the evidence.
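
As a rough illustration of how the four quantities fit together, the sketch below computes a posterior from a prior and two likelihoods. The function name and the numbers are hypothetical and are not taken from the notes; the marginal P(B) is obtained from the law of total probability over A and not-A.

```python
def posterior(prior_a: float, likelihood_b_given_a: float,
              likelihood_b_given_not_a: float) -> float:
    """Return P(A|B) using Bayes' theorem."""
    # Marginal P(B) by the law of total probability over A and not-A.
    marginal_b = (likelihood_b_given_a * prior_a
                  + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / marginal_b


# Hypothetical numbers: P(A) = 0.01, P(B|A) = 0.90, P(B|not A) = 0.05.
p = posterior(0.01, 0.90, 0.05)
print(round(p, 4))  # 0.1538
```

Even with a high likelihood, the small prior keeps the posterior low, which is the kind of update the prior/likelihood/posterior vocabulary describes.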


Maximum A Posteriori (MAP) Hypothesis, hMAP

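Given a set of candidate hypotheses, the hypothesis with the maximum posterior probability
is considered the most probable hypothesis. This most probable hypothesis is called the
Maximum A Posteriori hypothesis, hMAP. Bayes' theorem, Eq. (8.1), can be used to find hMAP:

\[ h_{MAP} = \arg\max_{h} P(h \mid E) = \arg\max_{h} \frac{P(E \mid h)\, P(h)}{P(E)} = \arg\max_{h} P(E \mid h)\, P(h) \]

P(E) does not depend on h, so it can be dropped from the maximization.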

Maximum Likelihood (ML) Hypothesis, hML

Given a set of candidate hypotheses, if every hypothesis is equally probable a priori, only
P(E | h) is needed to find the most probable hypothesis. The hypothesis that maximizes the
likelihood P(E | h) is called the Maximum Likelihood (ML) hypothesis, hML:

\[ h_{ML} = \arg\max_{h} P(E \mid h) \]
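
The difference between hMAP and hML can be seen in a small sketch. The three hypotheses, their priors P(h), and their likelihoods P(E | h) below are made-up values chosen only to make the two criteria disagree.

```python
# Hypothetical priors P(h) and likelihoods P(E|h) for three candidate hypotheses.
priors = {"h1": 0.7, "h2": 0.2, "h3": 0.1}
likelihoods = {"h1": 0.2, "h2": 0.5, "h3": 0.9}

# MAP: maximise P(E|h) * P(h); P(E) is the same for every h and is dropped.
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])

# ML: maximise P(E|h) only, which equals MAP under a uniform prior.
h_ml = max(likelihoods, key=lambda h: likelihoods[h])

print(h_map, h_ml)  # h1 h3
```

Here the strong prior on h1 outweighs its low likelihood, so MAP and ML select different hypotheses.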

8.3.1 NAÏVE BAYES ALGORITHM

The Naïve Bayes algorithm is a probabilistic classification algorithm based on Bayes'
theorem. It is a supervised learning algorithm.
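
The following is a minimal sketch of a Naïve Bayes classifier for categorical features, assuming simple frequency counts as probability estimates. The function names (`train`, `predict`) and the toy dataset are hypothetical; this is one common way to organise the computation, not necessarily the exact procedure intended in these notes.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count classes and per-class feature values from categorical data."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, class)][value] = number of training rows
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict(row, class_counts, value_counts):
    """Return the class with the largest prior * product of likelihoods."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            score *= value_counts[(i, label)][value] / count  # P(value | class)
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no")]
labels = ["yes", "no", "no", "yes", "yes"]

model = train(rows, labels)
label, scores = predict(("sunny", "no"), *model)
print(label, scores)
```

In this toy data the class "no" never occurs with windy = "no", so its score collapses to exactly zero regardless of the other factors.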
Zero Probability Error

If a feature value never occurs with a class in the training dataset, its estimated
conditional probability is zero, and the whole product of probabilities for that class
becomes zero. This zero-probability error can be solved by applying a smoothing technique
called Laplace correction. For example, given 1000 data instances in the training dataset,
if there are zero instances for a particular value of a feature, we can add 1 instance for
each attribute-value pair of that feature. This makes little difference across 1000 data
instances, and the overall probability does not become zero.
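
A small sketch of the Laplace correction, assuming the usual add-one form in which the smoothing constant is added to every value's count and the denominator grows by the number of possible values. The function name, the `alpha` parameter, and the counts are illustrative assumptions.

```python
def laplace_probability(value_count: int, class_count: int,
                        num_values: int, alpha: float = 1.0) -> float:
    """Smoothed estimate of P(value | class): add alpha to every value's count."""
    return (value_count + alpha) / (class_count + alpha * num_values)


# Hypothetical counts: a value never occurs among 1000 instances of the class,
# and the feature has 3 possible values.
unsmoothed = 0 / 1000
smoothed = laplace_probability(0, 1000, 3)
print(unsmoothed, smoothed)
```

The smoothed estimates for all values of the feature still sum to 1, and the estimate for frequently seen values changes only slightly.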


8.3.2 Brute Force Bayes Algorithm

Applying Bayes' theorem, the Brute Force Bayes algorithm relies on the idea of concept
learning: given a hypothesis space H for the training dataset T, the algorithm computes the
posterior probability of every hypothesis hi ∈ H. It then outputs the Maximum A Posteriori
(MAP) hypothesis, hMAP, that is, the hypothesis with the maximum posterior probability. The
algorithm is expensive because it requires a computation for every hypothesis in H. Although
computing all the posterior probabilities is inefficient, the idea is applied in various
other algorithms.
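
The brute-force idea can be sketched as follows. The hypothesis space (threshold rules), the uniform prior, and the noise-free likelihood (1 if a hypothesis labels every training example correctly, 0 otherwise) are all assumptions made for this illustration.

```python
def brute_force_map(hypotheses, prior, likelihood, data):
    """Compute P(h|T) for every h in H; return the MAP hypothesis and all posteriors."""
    unnormalised = {name: likelihood(h, data) * prior[name]
                    for name, h in hypotheses.items()}
    evidence = sum(unnormalised.values())  # P(T), the normalising constant
    posteriors = {name: u / evidence for name, u in unnormalised.items()}
    return max(posteriors, key=posteriors.get), posteriors


def noise_free_likelihood(h, data):
    """P(T|h) = 1 if h agrees with every example, else 0 (assumes noise-free data)."""
    return 1.0 if all(h(x) == y for x, y in data) else 0.0


# Hypothetical hypothesis space: rules "x >= t" for t = 1..5, with a uniform prior.
hypotheses = {f"x>={t}": (lambda x, t=t: x >= t) for t in range(1, 6)}
prior = {name: 1 / len(hypotheses) for name in hypotheses}
data = [(1, False), (2, False), (3, True), (5, True)]

h_map, posteriors = brute_force_map(hypotheses, prior, noise_free_likelihood, data)
print(h_map, posteriors)
```

Every hypothesis is scored even though only one is consistent with the data, which is why the cost grows with the size of H. The sketch assumes at least one hypothesis is consistent; otherwise the normalising constant would be zero.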

8.3.3 Bayes Optimal Classifier

The Bayes optimal classifier is a probabilistic model that uses Bayes' theorem to find the
most probable classification for a new instance, given the training data, by combining the
predictions of all hypotheses weighted by their posterior probabilities. This is different
from the Maximum A Posteriori (MAP) hypothesis, hMAP, which uses only the single most
probable hypothesis.
Here, a new instance can be classified to a possible classification value Ci by the
following Eq. (8.4):

\[ \arg\max_{C_i} \sum_{h_j \in H} P(C_i \mid h_j)\, P(h_j \mid T) \]

where the sum runs over every hypothesis h_j in the hypothesis space H and T is the training data.
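
A short sketch of how the Bayes optimal classification can differ from the MAP choice. The three hypotheses, their posteriors, and their predictions are hypothetical, and each hypothesis is assumed to predict its class with probability 1.

```python
# Hypothetical posteriors P(h|T) and the class each hypothesis predicts
# for one new instance.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "+", "h2": "-", "h3": "-"}


def bayes_optimal(posteriors, predictions):
    """Sum the posterior mass behind each class and return the heaviest class."""
    votes = {}
    for h, p in posteriors.items():
        votes[predictions[h]] = votes.get(predictions[h], 0.0) + p
    return max(votes, key=votes.get), votes


optimal_class, votes = bayes_optimal(posteriors, predictions)
map_class = predictions[max(posteriors, key=posteriors.get)]
print(optimal_class, map_class)  # - +
```

The MAP hypothesis h1 predicts "+", but the combined posterior mass of h2 and h3 behind "-" is larger, so the Bayes optimal classifier outputs "-".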

8.3.4 Gibbs Algorithm

The main drawback of the Bayes optimal classifier is that it computes the posterior
probability for all hypotheses in the hypothesis space and then combines the predictions to
classify a new instance, which is computationally expensive.
The Gibbs algorithm is a sampling technique that randomly selects one hypothesis from the
hypothesis space according to the posterior probability distribution and uses it to classify
a new instance. Under certain conditions, the expected prediction error of the Gibbs
algorithm is at most twice that of the Bayes optimal classifier.
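
The sampling step can be sketched with the standard library. The posteriors and predictions are hypothetical values, and the fixed seed is used only to make the illustration repeatable.

```python
import random


def gibbs_classify(posteriors, predictions, rng):
    """Draw one hypothesis with probability P(h|T) and return its prediction."""
    names = list(posteriors)
    chosen = rng.choices(names, weights=[posteriors[n] for n in names], k=1)[0]
    return predictions[chosen]


# Hypothetical posteriors and per-hypothesis predictions for one new instance.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "+", "h2": "-", "h3": "-"}

rng = random.Random(0)  # fixed seed only for repeatability
draws = [gibbs_classify(posteriors, predictions, rng) for _ in range(10_000)]
fraction_negative = draws.count("-") / len(draws)
print(fraction_negative)  # should be near 0.6, the posterior mass behind "-"
```

Each call consults a single sampled hypothesis rather than summing over all of H, so an individual prediction is cheap but can disagree with the Bayes optimal answer.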

8.4 NAÏVE BAYES ALGORITHM FOR CONTINUOUS ATTRIBUTES

There are two ways to predict with the Naive Bayes algorithm for continuous attributes:
1. Discretize the continuous feature into a discrete feature.
2. Apply a Normal (Gaussian) distribution to the continuous feature.
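
For the first option, one simple possibility is equal-width binning, sketched below. The choice of equal-width bins, the function name, and the sample values are assumptions; other discretization schemes could be used instead.

```python
def equal_width_bins(values, num_bins):
    """Map each continuous value to a bin index in 0..num_bins-1 (equal-width bins)."""
    low, high = min(values), max(values)
    width = (high - low) / num_bins  # assumes the values are not all identical
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - low) / width), num_bins - 1) for v in values]


# Hypothetical continuous feature, e.g. temperature readings.
temperatures = [15.0, 18.5, 21.0, 24.5, 30.0]
bins = equal_width_bins(temperatures, 3)
print(bins)  # [0, 0, 1, 1, 2]
```

The resulting bin indices can then be treated as categorical feature values when counting frequencies.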

Gaussian Naive Bayes Algorithm

In Gaussian Naive Bayes, the values of continuous features are assumed to be sampled from a
Gaussian distribution.
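
A minimal sketch of the Gaussian approach for a single feature, assuming the mean and standard deviation are estimated separately for each class and the Gaussian density is used as the likelihood. The data, class names, and function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, used as the likelihood P(x | class)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical one-feature training set: measurements per class.
samples = {"A": [150.0, 155.0, 160.0], "B": [170.0, 175.0, 180.0]}
total = sum(len(xs) for xs in samples.values())


def classify(x):
    """Pick the class maximising prior * Gaussian likelihood."""
    scores = {}
    for label, xs in samples.items():
        prior = len(xs) / total
        scores[label] = prior * gaussian_pdf(x, mean(xs), pstdev(xs))
    return max(scores, key=scores.get)


print(classify(158.0), classify(172.0))  # A B
```

With several continuous features, the per-feature densities would be multiplied together under the independence assumption.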


8.5 OTHER POPULAR TYPES OF NAIVE BAYES CLASSIFIERS

Bernoulli Naive Bayes Classifier

Bernoulli Naive Bayes works with discrete features. In this algorithm, the features used for
making predictions are Boolean variables that take only two values, 'yes' or 'no'. This is
particularly useful for text classification, where each feature is binary and records
whether a word occurs in the document or not.
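
The Bernoulli likelihood for a document can be sketched as a product over the vocabulary, with absent words contributing as well as present ones. The word probabilities and class names below are hypothetical.

```python
def bernoulli_likelihood(x, p):
    """P(x | class) for binary features: p_i if the word is present, 1 - p_i if absent."""
    result = 1.0
    for xi, pi in zip(x, p):
        result *= pi if xi == 1 else (1.0 - pi)
    return result


# Hypothetical per-class probabilities that each of three words occurs in a document.
p_spam = [0.8, 0.1, 0.5]
p_ham = [0.2, 0.6, 0.5]
doc = [1, 0, 1]  # word 0 present, word 1 absent, word 2 present

like_spam = bernoulli_likelihood(doc, p_spam)
like_ham = bernoulli_likelihood(doc, p_ham)
print(like_spam, like_ham)
```

Multiplying each likelihood by the class prior and taking the larger product gives the predicted class.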
Multinomial Naive Bayes Classifier
This algorithm is a generalization of the Bernoulli Naive Bayes model that works for
categorical data, particularly integer-valued features. It is useful for text
classification, where each feature has an integer value representing the frequency of
occurrence of a word.
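
A sketch of multinomial scoring on word counts, working in log space to avoid underflow. The vocabulary, word probabilities, priors, and class names are made up for the example; the multinomial coefficient is omitted because it is identical for every class.

```python
import math


def multinomial_log_score(counts, word_probs, prior):
    """log P(class) + sum_i count_i * log P(word_i | class)."""
    return math.log(prior) + sum(c * math.log(p) for c, p in zip(counts, word_probs))


# Hypothetical word distributions per class over a three-word vocabulary.
probs = {"sports": [0.6, 0.3, 0.1], "politics": [0.1, 0.3, 0.6]}
priors = {"sports": 0.5, "politics": 0.5}
doc_counts = [3, 1, 0]  # word frequencies in the document

scores = {c: multinomial_log_score(doc_counts, probs[c], priors[c]) for c in probs}
predicted = max(scores, key=scores.get)
print(predicted)  # sports
```

In practice the word probabilities would be estimated from training counts, typically with a smoothing step so that no probability is zero.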
Multi-class Naïve Bayes Classifier
This algorithm is useful for classification problems with more than two classes: the target
feature contains multiple classes, and the class to which a test instance belongs has to be
predicted.


