Understanding Bayesian Classification Techniques

Bayesian classifiers use Bayes' theorem to predict class membership probabilities based on training data. A naïve Bayesian classifier assumes conditional independence between attributes given the class. This simplifies computations but may not always hold. Bayesian belief networks relax this assumption by allowing dependencies between attributes and representing them in a directed acyclic graph with conditional probability tables. This more flexible approach models class conditional probabilities better for problems where attributes are somewhat correlated.

Uploaded by

Janani Aec

Bayesian Classification

Dr. Navneet Goyal BITS, Pilani

Bayesian Classification

What are Bayesian classifiers?
Statistical classifiers
Predict class membership probabilities
Based on Bayes' theorem
Naïve Bayesian classifier: computationally simple, with performance comparable to decision tree (DT) and neural network (NN) classifiers

Bayesian Classification

Probabilistic learning: calculates explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems.
Incremental: each training example can incrementally increase or decrease the probability that a hypothesis is correct.
Prior knowledge can be combined with observed data.

Bayes Theorem

Let X be a data sample whose class label is unknown.
Let H be some hypothesis that X belongs to a class C.
For classification, determine P(H|X).
P(H|X) is the probability that H holds given the observed data sample X.
P(H|X) is the posterior probability of H conditioned on X.

Bayes Theorem
Example: Sample space: all fruits
X is round and red
H = hypothesis that X is an apple
P(H|X) is our confidence that X is an apple given that X is round and red
P(H) is the prior probability of H, i.e., the probability that any given data sample is an apple regardless of how it looks
P(H|X) is based on more information than P(H)
Note that P(H) is independent of X

Bayes Theorem
Example: Sample space: all fruits
P(X|H)? It is the probability that X is round and red given that we know that X is an apple
P(X) is the prior probability of X, i.e., the probability that a data sample from our set of fruits is red and round

Estimating Probabilities

P(X), P(H), and P(X|H) may be estimated from the given data.
Bayes' theorem:

$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$$
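As a small illustration of the theorem, the sketch below plugs numbers into the fruit example. The three probabilities are hypothetical values chosen for illustration; they do not come from the slides, and the function name `posterior` is likewise an assumption.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood * prior / evidence


# Hypothetical fruit-basket numbers (assumptions, not from the slides).
p_apple = 0.20                   # P(H): a random fruit is an apple
p_round_red_given_apple = 0.90   # P(X|H): an apple is round and red
p_round_red = 0.30               # P(X): a random fruit is round and red

p_apple_given_round_red = posterior(p_round_red_given_apple, p_apple, p_round_red)
print(f"P(apple | round and red) = {p_apple_given_round_red:.3f}")
```

With these assumed numbers, observing "round and red" raises the confidence in "apple" from the prior 0.20 to a posterior of 0.60, which is the sense in which P(H|X) is based on more information than P(H).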

Use of Bayes' Theorem in the Naïve Bayesian Classifier!

Naïve Bayesian Classification

Also called Simple BC
Why Naïve/Simple?
Class conditional independence: the effect of an attribute value on a given class is independent of the values of the other attributes
This assumption simplifies computations

Naïve Bayesian Classification

Steps involved
1. Each data sample is of the type X = (x1, x2, ..., xn), where xi is the value of X for attribute Ai.
2. Suppose there are m classes Ci, i = 1, ..., m. X ∈ Ci iff P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m, j ≠ i, i.e., BC assigns X to the class Ci having the highest posterior probability.

Naïve Bayesian Classification

The class for which P(Ci|X) is maximized is called the maximum posterior hypothesis.
From Bayes' theorem:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

P(X) is constant for all classes, so only P(X|Ci)P(Ci) need be maximized.
If the class prior probabilities are not known, assume all classes to be equally likely and maximize P(X|Ci).
Otherwise maximize P(X|Ci)P(Ci), with the priors estimated as P(Ci) = Si/S, where Si is the number of training samples of class Ci and S is the total number of training samples.
Problem: computing P(X|Ci) directly is infeasible! (Find out how you would compute it and why it is infeasible.)

Naïve Bayesian Classification

Naïve assumption: attribute independence

$$P(X \mid C_i) = P(x_1, \ldots, x_n \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

In order to classify an unknown sample X, evaluate P(X|Ci)P(Ci) for each class Ci.
Sample X is assigned to the class Ci iff P(X|Ci)P(Ci) > P(X|Cj)P(Cj) for 1 ≤ j ≤ m, j ≠ i.

Naïve Bayesian Classification

EXAMPLE

Age      Income   Student  Credit_rating  Class:Buys_comp
<=30     HIGH     N        FAIR           N
<=30     HIGH     N        EXCELLENT      N
31..40   HIGH     N        FAIR           Y
>40      MEDIUM   N        FAIR           Y
>40      LOW      Y        FAIR           Y
>40      LOW      Y        EXCELLENT      N
31..40   LOW      Y        EXCELLENT      Y
<=30     MEDIUM   N        FAIR           N
<=30     LOW      Y        FAIR           Y
>40      MEDIUM   Y        FAIR           Y
<=30     MEDIUM   Y        EXCELLENT      Y
31..40   MEDIUM   N        EXCELLENT      Y
31..40   HIGH     Y        FAIR           Y
>40      MEDIUM   N        EXCELLENT      N

Naïve Bayesian Classification

EXAMPLE
X = (<=30, MEDIUM, Y, FAIR, ???)
We need to maximize P(X|Ci)P(Ci) for i = 1, 2, where C1 is buys_comp=Y and C2 is buys_comp=N.


P(X | buys_comp=Y) = 0.222 * 0.444 * 0.667 * 0.667 = 0.044
P(X | buys_comp=N) = 0.600 * 0.400 * 0.200 * 0.400 = 0.019
P(X | buys_comp=Y) P(buys_comp=Y) = 0.044 * 0.643 = 0.028
P(X | buys_comp=N) P(buys_comp=N) = 0.019 * 0.357 = 0.007
CONCLUSION: X buys a computer
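The figures above can be reproduced by counting over the 14 training tuples. The following is a minimal sketch, not the lecturer's code: the names `DATA` and `naive_bayes_scores` are illustrative, and probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter

# The 14 training tuples: (age, income, student, credit_rating, buys_comp).
DATA = [
    ("<=30", "HIGH", "N", "FAIR", "N"),
    ("<=30", "HIGH", "N", "EXCELLENT", "N"),
    ("31..40", "HIGH", "N", "FAIR", "Y"),
    (">40", "MEDIUM", "N", "FAIR", "Y"),
    (">40", "LOW", "Y", "FAIR", "Y"),
    (">40", "LOW", "Y", "EXCELLENT", "N"),
    ("31..40", "LOW", "Y", "EXCELLENT", "Y"),
    ("<=30", "MEDIUM", "N", "FAIR", "N"),
    ("<=30", "LOW", "Y", "FAIR", "Y"),
    (">40", "MEDIUM", "Y", "FAIR", "Y"),
    ("<=30", "MEDIUM", "Y", "EXCELLENT", "Y"),
    ("31..40", "MEDIUM", "N", "EXCELLENT", "Y"),
    ("31..40", "HIGH", "Y", "FAIR", "Y"),
    (">40", "MEDIUM", "N", "EXCELLENT", "N"),
]


def naive_bayes_scores(data, x):
    """Return {class: P(X|C) * P(C)} using relative frequencies."""
    class_counts = Counter(row[-1] for row in data)
    total = len(data)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(C)
        for k, value in enumerate(x):
            # P(x_k | C): fraction of class-C tuples with this attribute value
            matches = sum(1 for row in data if row[-1] == label and row[k] == value)
            score *= matches / n_label
        scores[label] = score
    return scores


x = ("<=30", "MEDIUM", "Y", "FAIR")
scores = naive_bayes_scores(DATA, x)
prediction = max(scores, key=scores.get)
print({label: round(s, 3) for label, s in scores.items()}, "->", prediction)
```

Rounded to three decimals the scores are 0.028 for buys_comp=Y and 0.007 for buys_comp=N, matching the hand calculation, so the sample is assigned to Y.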

Naïve Bayes Classifier: Issues

Probability values ZERO! Recall what you observed in WEKA!
Ak is continuous valued! Recall what you observed in WEKA!

If there are no tuples in the training set corresponding to students for the class buys_comp=N, then P(student=Y | buys_comp=N) = 0.
Implications? Solution?

Naïve Bayes Classifier: Issues

Laplacian Correction or Laplace Estimator
Philosophy: we assume that the training data set is so large that adding one to each count that we need would make only a negligible difference in the estimated probability values.
Example: D contains 1000 tuples of class buys_comp=Y
income=low: 0 tuples
income=medium: 990 tuples
income=high: 10 tuples
Without the Laplacian correction, the probabilities are 0, 0.990, and 0.010.
With the Laplacian correction: 1/1003 = 0.001, 991/1003 = 0.988, and 11/1003 = 0.011, respectively.
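The correction amounts to adding one to every count before normalising. A minimal sketch, using the income counts from the example (the function name `laplace_estimates` is an illustrative assumption):

```python
def laplace_estimates(counts):
    """Add one to each count, then divide by the enlarged total."""
    n_values = len(counts)          # number of distinct attribute values
    total = sum(counts.values())    # tuples of the class in question
    return {value: (c + 1) / (total + n_values) for value, c in counts.items()}


income_counts = {"low": 0, "medium": 990, "high": 10}
smoothed = laplace_estimates(income_counts)
for value, p in smoothed.items():
    print(f"P(income={value} | buys_comp=Y) = {p:.3f}")
```

The denominator grows by the number of distinct values (here 3, giving 1003), so the estimates still sum to one and no estimate is exactly zero.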

Naïve Bayes Classifier: Issues

Continuous variable: need to do more work than for categorical attributes!
It is typically assumed to have a Gaussian distribution with a mean μ and a standard deviation σ.
Do it yourself! And cross-check with WEKA!
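One way to "do it yourself" is sketched below: estimate μ and σ from the values of the attribute within one class, then use the Gaussian density in place of P(xk|Ci). The age values are hypothetical, and the sketch uses the sample standard deviation (n - 1 denominator), which is an assumption; a particular tool may estimate σ differently.

```python
import math


def fit_gaussian(values):
    """Sample mean and sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical ages of customers in class buys_comp=Y (not from the slides).
ages_yes = [25, 32, 38, 41, 45]
mean_yes, std_yes = fit_gaussian(ages_yes)

# Value used in place of P(age=35 | buys_comp=Y) in the naive Bayes product.
density = gaussian_pdf(35, mean_yes, std_yes)
print(f"mean={mean_yes:.1f}, std={std_yes:.2f}, density at 35 = {density:.4f}")
```

Note that the result is a density rather than a probability, so it can exceed 1 for small σ; only its relative size across classes matters when comparing P(X|Ci)P(Ci).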

Naïve Bayes (Summary)

Robust to isolated noise points
Handles missing values by ignoring the instance during probability estimate calculations
Robust to irrelevant attributes
Independence assumption may not hold for some attributes
Use other techniques such as Bayesian Belief Networks (BBN)

Probability Calculations

[Table: buys_comp training tuples with columns Age, Income, Student, Credit_rating, Class:Buys_comp; several cells were lost in extraction.]

No. of attributes = 4
Distinct values per attribute = 3, 3, 3, 3
No. of classes = 2
Total no. of probability calculations in NBC = 4 * 3 * 2 = 24!
What if conditional independence was not assumed?
O(k^p) for p k-valued attributes
Multiply by m classes.
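The two counts can be compared directly. The sketch below takes the slide's figures (4 attributes, 3 values each, 2 classes) as given; the function names are illustrative assumptions.

```python
from math import prod


def naive_bayes_table_size(values_per_attribute, n_classes):
    """One entry P(x_k | C_i) per attribute value per class."""
    return sum(values_per_attribute) * n_classes


def full_joint_table_size(values_per_attribute, n_classes):
    """One entry per combination of attribute values per class (k^p * m)."""
    return prod(values_per_attribute) * n_classes


values_per_attribute = [3, 3, 3, 3]
n_classes = 2

print(naive_bayes_table_size(values_per_attribute, n_classes))  # 4 * 3 * 2
print(full_joint_table_size(values_per_attribute, n_classes))   # 3**4 * 2
```

With these figures the naïve model needs 24 entries while the unrestricted model needs 162, and the gap widens exponentially as attributes are added, which is why estimating P(X|Ci) directly is infeasible.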

Bayesian Belief Networks

Naïve BC assumes class conditional independence
This assumption simplifies computations
When this assumption holds true, naïve BC is the most accurate in comparison with all other classifiers
In real problems, dependencies do exist between variables
2 methods to overcome this limitation of NBC:
Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
Decision trees, which reason on one attribute at a time, considering the most important attributes first

Conditional Independence
Let X, Y, and Z denote three sets of random variables. The variables in X are said to be conditionally independent of Y, given Z, if
P(X|Y,Z) = P(X|Z)

Relationship between a person's arm length and his/her reading skills!
One might observe that people with longer arms tend to have higher levels of reading skills
How do you explain this relationship?

Conditional Independence
Can be explained through a confounding factor, AGE
A young child tends to have short arms and lacks the reading skills of an adult
If the age of a person is fixed, then the observed relationship between arm length and reading skills disappears
We can thus conclude that arm length and reading skills are conditionally independent when the age variable is fixed
P(reading skills | long arms, age) = P(reading skills | age)
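The arm-length example can be checked numerically on a toy joint distribution. All numbers below are hypothetical, and the joint is constructed so that arm length and reading skill each depend only on age; the sketch therefore illustrates the definition rather than proving anything about real data.

```python
from itertools import product

# Hypothetical model: both attributes depend on age only.
P_AGE = {"child": 0.3, "adult": 0.7}
P_LONG_ARMS = {"child": 0.1, "adult": 0.8}     # P(long arms | age)
P_GOOD_READER = {"child": 0.2, "adult": 0.9}   # P(good reader | age)


def joint(age, long_arms, good_reader):
    """P(age, long_arms, good_reader) under the assumed factorisation."""
    p_arms = P_LONG_ARMS[age] if long_arms else 1 - P_LONG_ARMS[age]
    p_read = P_GOOD_READER[age] if good_reader else 1 - P_GOOD_READER[age]
    return P_AGE[age] * p_arms * p_read


def prob(predicate):
    """Total probability of all outcomes satisfying the predicate."""
    outcomes = product(P_AGE, (True, False), (True, False))
    return sum(joint(a, l, r) for a, l, r in outcomes if predicate(a, l, r))


# Ignoring age, long arms and good reading look related.
p_read = prob(lambda a, l, r: r)
p_read_given_long = prob(lambda a, l, r: r and l) / prob(lambda a, l, r: l)

# With age fixed, arm length adds no information.
p_read_given_adult = (prob(lambda a, l, r: r and a == "adult")
                      / prob(lambda a, l, r: a == "adult"))
p_read_given_long_adult = (prob(lambda a, l, r: r and l and a == "adult")
                           / prob(lambda a, l, r: l and a == "adult"))

print(f"P(read) = {p_read:.3f}, P(read | long arms) = {p_read_given_long:.3f}")
print(f"P(read | adult) = {p_read_given_adult:.3f}, "
      f"P(read | long arms, adult) = {p_read_given_long_adult:.3f}")
```

In this toy model the first pair of values differs, while the second pair coincides, which is exactly the statement P(reading skills | long arms, age) = P(reading skills | age).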

$$P(X \mid C_i) = P(x_1, x_2, x_3, \ldots, x_n \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

Bayesian Belief Networks

Also known as: Belief Networks, Bayesian Networks, Probabilistic Networks

Bayesian Belief Networks

The conditional independence (CI) assumption made by NBC may be too rigid

Especially for classification problems in which attributes are somewhat correlated

We need a more flexible approach for modeling the class conditional probabilities

P(X|Ci) = P(x1, x2, x3, ..., xn|Ci)

Instead of requiring that all the attributes be CI given the class, BBN allows us to specify which pairs of attributes are CI

Bayesian Belief Networks

A belief network has 2 components:
Directed Acyclic Graph (DAG)
Conditional Probability Table (CPT)

Bayesian Belief Networks

A node in BBN is CI of its non-descendants, if its parents are known

Bayesian Belief Networks

[Figure: belief network over the six variables Family History, Smoker, LungCancer, Emphysema, PositiveXRay, and Dyspnea]

        (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC      0.8       0.5        0.7        0.1
~LC     0.2       0.5        0.3        0.9

The conditional probability table for the variable LungCancer

Bayesian Belief Networks

6 boolean variables

Arcs allow representation of causal knowledge

Having lung cancer is influenced by family history and smoking
PositiveXRay is independent of whether the patient has a family history (FH) or is a smoker, given that we know that the patient has lung cancer

Once we know the outcome of LungCancer, FH and Smoker do not provide any additional information about PositiveXRay

Bayesian Belief Networks

LungCancer is CI of Emphysema, given its parents, FH and Smoker

A BBN has a Conditional Probability Table (CPT) for each variable in the DAG

The CPT for a variable Y specifies the conditional distribution P(Y|Parents(Y))

        (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC      0.8       0.5        0.7        0.1
~LC     0.2       0.5        0.3        0.9

CPT for LungCancer

P(LC=Y|FH=Y,S=Y) = 0.8
P(LC=N|FH=N,S=N) = 0.9

Bayesian Belief Networks

Let X = (x1, x2, ..., xn) be a tuple described by the variables or attributes Y1, Y2, ..., Yn, respectively
Each variable is CI of its non-descendants given its parents

This allows the DAG to provide a complete representation of the existing joint probability distribution by:

$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Parents}(Y_i))$$

where P(x1, x2, ..., xn) is the probability of a particular combination of values of X, and the values for P(xi|Parents(Yi)) correspond to the entries in the CPT for Yi
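The product formula can be sketched on a three-node fragment of the lung cancer network: FH and S as parents of LC. The LungCancer CPT is the one given on the slides; the prior probabilities for FH and S, and the assumption that these two nodes have no parents, are hypothetical and introduced only to make the example computable.

```python
from itertools import product

# Parents of each node in the fragment FH -> LC <- S.
PARENTS = {"FH": (), "S": (), "LC": ("FH", "S")}

# CPTS[node][parent values] = P(node = True | parents).
CPTS = {
    "FH": {(): 0.10},  # hypothetical prior, not from the slides
    "S": {(): 0.30},   # hypothetical prior, not from the slides
    "LC": {            # CPT for LungCancer from the slides
        (True, True): 0.8,
        (True, False): 0.5,
        (False, True): 0.7,
        (False, False): 0.1,
    },
}


def joint_probability(assignment):
    """P(x1, ..., xn) as the product of P(xi | Parents(Yi)) over all nodes."""
    p = 1.0
    for node, parents in PARENTS.items():
        parent_values = tuple(assignment[q] for q in parents)
        p_true = CPTS[node][parent_values]
        p *= p_true if assignment[node] else 1.0 - p_true
    return p


example = joint_probability({"FH": True, "S": True, "LC": True})
print(f"P(FH, S, LC) = {example:.3f}")

# Sanity check: the joint distribution sums to one over all assignments.
total = sum(
    joint_probability(dict(zip(PARENTS, values)))
    for values in product((True, False), repeat=len(PARENTS))
)
print(f"sum over all assignments = {total:.3f}")
```

Each factor is a single CPT lookup, so the joint probability of any full assignment costs one multiplication per node rather than a lookup in a table of size 2^n.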

Bayesian Belief Networks

A node within the network can be selected as an output node, representing a class label attribute

There may be more than one output node

Rather than returning a single class label, the classification process can return a probability distribution that gives the probability of each class

Training BBN!!

Training BBN
A number of scenarios are possible

Network topology may be given in advance or inferred from data

Variables may be observable or hidden (missing or incomplete data) in all or some of the training tuples

Many algorithms exist for learning the network topology from the training data given observable attributes

If the network topology is known and the variables are observable, training is straightforward (just compute the CPT entries)
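For the straightforward case (known topology, fully observed variables), computing the CPT entries reduces to counting. The sketch below estimates P(LC | FH, S) by relative frequency; the six training records and the function name `estimate_cpt` are hypothetical.

```python
from collections import defaultdict


def estimate_cpt(records, child, parents):
    """Relative-frequency estimate of P(child = True | parents).

    Assumes boolean variables that are observed in every record.
    """
    true_counts = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[p] for p in parents)
        totals[key] += 1
        if record[child]:
            true_counts[key] += 1
    return {key: true_counts[key] / totals[key] for key in totals}


# Hypothetical fully observed training tuples (not from the slides).
RECORDS = [
    {"FH": True, "S": True, "LC": True},
    {"FH": True, "S": True, "LC": True},
    {"FH": True, "S": True, "LC": False},
    {"FH": False, "S": True, "LC": True},
    {"FH": False, "S": True, "LC": False},
    {"FH": False, "S": False, "LC": False},
]

cpt = estimate_cpt(RECORDS, "LC", ("FH", "S"))
for parent_values, p in sorted(cpt.items(), reverse=True):
    print(parent_values, round(p, 3))
```

Parent combinations that never occur in the data get no entry at all, and observed combinations can still receive an estimate of exactly zero, so in practice a correction such as the Laplace estimator is typically applied here too.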

Training BBNs
Topology given, but some variables are hidden

Gradient descent (self study)

Falls under the class of algorithms called Adaptive Probabilistic Networks

BBNs are computationally expensive
BBNs provide an explicit representation of causal structure

Domain experts can provide prior knowledge to the training process in the form of topology and/or conditional probability values
This leads to significant improvement in the learning process
