
Understanding Bayesian Inference Basics

Uploaded by

junmokim123

Bayesian Inference

Math & Stat for Data Science


Graduate School of Data Science
Seoul National University
Bayesian Inference
• Up to now, the statistical methods we have discussed have been
frequentist (or classical) methods

• Frequentist methods are based on the following:


• Probability: limiting relative frequencies.
• Probabilities are objective properties of the real world.

• Parameters are fixed, unknown constants.


• Because they are not fluctuating, no useful probability statements
can be made about parameters.

• Statistical procedures should be designed to have well-defined
long-run frequency properties.
• For example, a 95 percent confidence interval should trap the true
value of the parameter with limiting frequency at least 95 percent.
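The long-run frequency property of a confidence interval can be checked by simulation. The sketch below is illustrative only: the normal model with known standard deviation, the parameter values, and the function name `ci_coverage` are assumptions made for this example, not part of the lecture.

```python
import random
from statistics import NormalDist, fmean

def ci_coverage(true_mean=2.0, sigma=1.0, n=20, reps=5000, level=0.95, seed=0):
    """Fraction of repeated experiments whose z-interval contains true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level = 0.95
    half_width = z * sigma / n ** 0.5           # sigma is treated as known
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The parameter is fixed; only the interval varies between repetitions.
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / reps

coverage = ci_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The empirical coverage should be close to 0.95. The probability statement here is about the random interval, not about the parameter.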
Bayesian Inference
• Probability describes degree of belief, not limiting
frequency
• Can make probability statements about lots of things, not just
data which are subject to random variation.

• Can make probability statements about parameters, even though
they are fixed constants.

• Can make inferences about a parameter 𝜃 by producing a
probability distribution for 𝜃.
• Inferences (e.g., point estimates and interval estimates) may
then be extracted from this distribution.
Bayesian Inference
• Bayesian inference is subjective
• This makes it controversial
• A non-informative prior can address this

• The Bayesian approach is a natural platform for updating
prior beliefs

• Computationally challenging
• Fast computational approaches (MCMC, variational
inference, etc.) have been developed
Bayesian Inference
• Bayes' theorem

• For continuous variables, use density functions

• For IID samples
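In standard notation, the three statements above can be written as follows. This is a sketch using the usual textbook forms; the symbols $A$, $B$, $f(\theta)$ for the prior density and $L_n(\theta)$ for the likelihood are assumed here.

```latex
% Bayes' theorem for events A and B with P(B) > 0
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Density version: prior f(\theta) and model f(x \mid \theta)
f(\theta \mid x) = \frac{f(x \mid \theta)\, f(\theta)}{\int f(x \mid \theta)\, f(\theta)\, d\theta}

% IID sample x^n = (x_1, \dots, x_n): the joint density factorizes into the likelihood
f(x^n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta) = L_n(\theta)
```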


Bayesian Inference
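• Notation: $X^n = (X_1, \dots, X_n)$ and $x^n = (x_1, \dots, x_n)$

• Posterior distribution: $f(\theta \mid x^n) = \dfrac{f(x^n \mid \theta)\, f(\theta)}{\int f(x^n \mid \theta)\, f(\theta)\, d\theta} = \dfrac{L_n(\theta)\, f(\theta)}{c_n}$, where $L_n(\theta) = f(x^n \mid \theta)$ is the likelihood
• Normalizing constant (partition function): $c_n = \int L_n(\theta)\, f(\theta)\, d\theta$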
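• The posterior can be used to estimate the parameter
• The posterior mean $\bar{\theta}_n = \int \theta\, f(\theta \mid x^n)\, d\theta$ is a point estimate of the parameter

• Posterior interval: find $a, b$ such that $\int_a^b f(\theta \mid x^n)\, d\theta = 1 - \alpha$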


Bayesian Inference
• Example: Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$ and
$p \sim \text{Uniform}(0,1)$
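A sketch of the posterior derivation for this example, writing $s = \sum_i x_i$ for the number of successes (the symbol $s$ is introduced here for convenience):

```latex
% Likelihood of n IID Bernoulli(p) observations with s successes
L_n(p) = p^{s} (1 - p)^{n - s}, \qquad s = \sum_{i=1}^{n} x_i

% The Uniform(0,1) prior has f(p) = 1 on (0, 1), so
f(p \mid x^n) \propto f(p)\, L_n(p) = p^{s} (1 - p)^{n - s}

% This is the kernel of a Beta density:
p \mid x^n \sim \mathrm{Beta}(s + 1,\; n - s + 1)
```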
Bayesian Inference
• Posterior Mean

• Posterior Interval
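For the Bernoulli model with a Uniform(0,1) prior, the posterior is Beta(s+1, n−s+1), so the posterior mean is (s+1)/(n+2). The sketch below computes it together with an equal-tailed 95% posterior interval by accumulating the density on a grid. The data values n = 10, s = 4, the equal-tailed choice of interval, and the helper names are assumptions made for illustration.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def equal_tail_interval(a, b, level=0.95, grid=20_000):
    """Approximate equal-tailed interval from a midpoint-rule CDF."""
    tail = (1 - level) / 2
    h = 1.0 / grid
    cdf, lower, upper = 0.0, None, None
    for i in range(grid):
        mid = (i + 0.5) * h                 # midpoints avoid log(0) at the ends
        cdf += beta_pdf(mid, a, b) * h
        if lower is None and cdf >= tail:
            lower = mid
        if upper is None and cdf >= 1 - tail:
            upper = mid
    return lower, upper

n, s = 10, 4                                # hypothetical data: 4 successes in 10 trials
a_post, b_post = s + 1, n - s + 1           # Uniform(0,1) prior gives Beta(s+1, n-s+1)
post_mean = a_post / (a_post + b_post)      # equals (s + 1) / (n + 2)
lo, hi = equal_tail_interval(a_post, b_post)
print(f"posterior mean {post_mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

Other choices of a and b also give 95% posterior probability; the equal-tailed interval is only one convention.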
Bayesian Inference
• Example: Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$ and
$p \sim \text{Beta}(\alpha, \beta)$
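A sketch of the conjugate update for this example, again writing $s = \sum_i x_i$ (a symbol introduced here):

```latex
% Beta(\alpha, \beta) prior kernel times the Bernoulli likelihood
f(p \mid x^n) \propto p^{\alpha - 1} (1 - p)^{\beta - 1} \cdot p^{s} (1 - p)^{n - s}
             = p^{\alpha + s - 1} (1 - p)^{\beta + n - s - 1}

% Hence the posterior is again a Beta distribution
p \mid x^n \sim \mathrm{Beta}(\alpha + s,\; \beta + n - s)

% Posterior mean as a weighted average of the MLE s/n and the prior mean
\bar{p} = \frac{\alpha + s}{\alpha + \beta + n}
        = \frac{n}{\alpha + \beta + n} \cdot \frac{s}{n}
        + \frac{\alpha + \beta}{\alpha + \beta + n} \cdot \frac{\alpha}{\alpha + \beta}
```

Setting $\alpha = \beta = 1$ recovers the Uniform(0,1) case.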
Bayesian Inference
Simulation
• In many situations, it is difficult to derive the posterior
distribution analytically
• Use simulation

• Generate $\theta_1, \dots, \theta_B \sim f(\theta \mid x^n)$

• Use the simulated $\theta$ values for point estimation and posterior
intervals
• MCMC is commonly used
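A minimal sketch of simulation-based inference. To keep it short, the draws come directly from a known Beta posterior (Bernoulli data with a Uniform(0,1) prior) rather than from MCMC; the data values, the number of draws B, and the seed are assumptions for illustration.

```python
import random
import statistics

random.seed(0)
n, s = 10, 4          # hypothetical data: 4 successes in 10 trials
B = 20_000            # number of posterior draws

# Direct sampling from Beta(s + 1, n - s + 1) stands in for an MCMC sampler here.
draws = sorted(random.betavariate(s + 1, n - s + 1) for _ in range(B))

post_mean = statistics.fmean(draws)      # Monte Carlo estimate of the posterior mean
lo = draws[int(0.025 * B)]               # empirical 2.5% quantile
hi = draws[int(0.975 * B) - 1]           # empirical 97.5% quantile
print(f"posterior mean {post_mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

With an MCMC sampler the summaries are computed the same way, although the draws are then correlated.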
Non-informative prior
• Prior has a great impact on the subsequent
estimation
• Example:
• Suppose 10 Bernoulli random variables are observed with S=4 successes
• Consider two priors
• Uniform(0,1) (= Beta(1,1))
• Beta(1, 20)
[Figure: Prior/Posterior, n=10, S=4. Panels: the Beta(1,1) prior and its posterior; the Beta(1,20) prior and its posterior. Horizontal axis: xrange (0 to 1); vertical axis: density.]
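The effect of the two priors can be checked numerically with the conjugate Beta update. This is a sketch; the function names are hypothetical.

```python
def beta_posterior(alpha, beta, n, s):
    """Posterior Beta parameters after s successes in n Bernoulli trials."""
    return alpha + s, beta + n - s

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

n, s = 10, 4
priors = {"Beta(1,1)": (1, 1), "Beta(1,20)": (1, 20)}

posterior_means = {}
for name, (alpha, beta) in priors.items():
    a_post, b_post = beta_posterior(alpha, beta, n, s)
    posterior_means[name] = beta_mean(a_post, b_post)
    print(f"{name}: posterior Beta({a_post},{b_post}), mean {posterior_means[name]:.3f}")
```

The uniform prior gives a posterior mean of 5/12, close to the sample proportion 0.4, while the Beta(1,20) prior pulls the posterior mean down to 5/31.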
Flat Prior
• One approach is using a constant as a prior
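• Ex. $f(\theta) = c$

• Since $\int_{-\infty}^{\infty} f(\theta)\, d\theta = \infty$, this is not a proper probability
distribution (it is an improper prior)

• However, we can still carry out Bayesian inference:

$f(\theta \mid x^n) \propto f(\theta)\, L_n(\theta) \propto L_n(\theta)$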
Flat Prior
Example: Suppose $X \sim N(\theta, \sigma^2)$ and $f(\theta) = c$. Posterior?
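A sketch of the answer, assuming a single observation $x$ and a known variance $\sigma^2$:

```latex
% With a flat prior the posterior is proportional to the likelihood
f(\theta \mid x) \propto f(\theta)\, L(\theta)
  \propto \exp\!\left( -\frac{(x - \theta)^2}{2 \sigma^2} \right)

% Viewed as a function of \theta, this is a normal kernel centered at x
\theta \mid x \sim N(x, \sigma^2)
```

The posterior is a proper distribution even though the prior is not.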
Flat Prior
• One drawback of the flat prior is that it is not
transformation invariant
• Ex. X ~ Bernoulli(p)
• Use the flat prior f(p) = 1
• Now let $\psi = \log\frac{p}{1-p}$; then the density of $\psi$ is $f_\Psi(\psi) = \dfrac{e^{\psi}}{(1 + e^{\psi})^2}$

• This is not a flat prior in terms of $\psi$
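A quick simulation check, under the assumption that p is drawn from Uniform(0,1) and transformed to the log-odds scale. The induced density is the standard logistic density, which is clearly not constant. The sample size, seed, and function name are illustrative choices.

```python
import math
import random

def induced_density(psi):
    """Density of psi = log(p / (1 - p)) when p ~ Uniform(0, 1)."""
    return math.exp(psi) / (1.0 + math.exp(psi)) ** 2

random.seed(1)
draws = []
while len(draws) < 50_000:
    p = random.random()
    if 0.0 < p < 1.0:                      # guard against log(0)
        draws.append(math.log(p / (1.0 - p)))

# Compare the simulated mass on [-1, 1] with the exact logistic probability.
frac_inside = sum(-1.0 <= d <= 1.0 for d in draws) / len(draws)
exact_inside = (math.e - 1.0) / (math.e + 1.0)

print(f"simulated {frac_inside:.3f} vs exact {exact_inside:.3f}")
print(f"density at 0: {induced_density(0.0):.3f}, at 3: {induced_density(3.0):.3f}")
```

The density is 0.25 at ψ = 0 and much smaller at ψ = 3, so a prior that is flat in p is informative on the log-odds scale.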


Jeffreys’ Prior
• Use $f(\theta) \propto I(\theta)^{1/2}$
• $I(\theta)$ is the Fisher information
• This prior is transformation invariant
• For more details, please see the posted lecture note (from
Duke)
Jeffreys’ Prior
• Example: X ~ Bernoulli (p)
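A sketch of the Jeffreys prior for this example, using the standard Fisher information of the Bernoulli model (the symbols $\ell$ and $s$ are introduced here):

```latex
% Log-likelihood of one Bernoulli(p) observation
\ell(p) = x \log p + (1 - x) \log(1 - p)

% Fisher information
I(p) = -\mathbb{E}\left[ \ell''(p) \right] = \frac{1}{p} + \frac{1}{1 - p} = \frac{1}{p (1 - p)}

% Jeffreys prior: a Beta(1/2, 1/2) kernel
f(p) \propto I(p)^{1/2} = p^{-1/2} (1 - p)^{-1/2}

% Posterior after s successes in n trials
p \mid x^n \sim \mathrm{Beta}\!\left( s + \tfrac{1}{2},\; n - s + \tfrac{1}{2} \right)
```

For $n = 10$ and $s = 4$ this gives a Beta(4.5, 6.5) posterior with mean $4.5 / 11 \approx 0.409$.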
[Figure: Prior/Posterior, n=10, S=4]
Bayesian Testing
• Put a prior on $H_0$ and on the parameter $\theta$, and
then compute $P(H_0 \mid X^n)$
• Consider testing $H_0: \theta = \theta_0$ versus $H_1: \theta \neq \theta_0$
Bayesian Testing
With P(H0) = P(H1)=1/2
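For a point null $H_0: \theta = \theta_0$ against $H_1: \theta \neq \theta_0$, the posterior probability of $H_0$ follows from Bayes' theorem. This is a sketch in which $f(\theta)$ denotes the prior density of $\theta$ under $H_1$, an assumption of this example:

```latex
% General prior probabilities
P(H_0 \mid X^n = x^n)
  = \frac{f(x^n \mid \theta_0)\, P(H_0)}
         {f(x^n \mid \theta_0)\, P(H_0) + P(H_1) \int f(x^n \mid \theta)\, f(\theta)\, d\theta}

% With P(H_0) = P(H_1) = 1/2 the prior probabilities cancel
P(H_0 \mid X^n = x^n)
  = \frac{L_n(\theta_0)}{L_n(\theta_0) + \int L_n(\theta)\, f(\theta)\, d\theta}
```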
Bayesian Testing
• Example: Suppose 10 Bernoulli random variables are observed with S=4. Let
H0: p=0.7. With the Jeffreys prior, what is $P(H_0 \mid X^n)$?
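A sketch of the computation, assuming equal prior probabilities P(H0) = P(H1) = 1/2 and a Beta(1/2, 1/2) (Jeffreys) prior on p under H1. The variable names `m0` and `m1` for the two marginal likelihoods are introduced here.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

n, s, p0 = 10, 4, 0.7

# Likelihood of the observed sequence under H0: p = 0.7
m0 = p0 ** s * (1 - p0) ** (n - s)

# Marginal likelihood under H1: integral of p^s (1-p)^(n-s) against Beta(1/2, 1/2)
m1 = math.exp(log_beta(s + 0.5, n - s + 0.5) - log_beta(0.5, 0.5))

# Posterior probability of H0 with P(H0) = P(H1) = 1/2
post_h0 = m0 / (m0 + m1)
print(f"P(H0 | data) is approximately {post_h0:.3f}")
```

The binomial coefficient is omitted because it appears in both `m0` and `m1` and cancels in the ratio.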
Bayes factors
• The most popular approach to Bayesian hypothesis testing is the
Bayes factor

$K = \dfrac{P(X^n \mid H_1)}{P(X^n \mid H_0)}$

• The Bayes factor equals the posterior odds when P(H0) = P(H1) = 1/2

$K = \dfrac{P(X^n \mid H_1)}{P(X^n \mid H_0)} = \dfrac{P(H_1 \mid X^n)}{P(H_0 \mid X^n)}$
Bayes factors
• Interpretation

Kass and Raftery (1995)


Wikipedia: [Link]
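One commonly quoted scale from Kass and Raftery (1995) groups the Bayes factor K into four bands. The helper below is a sketch of that scale; the function name is hypothetical, and the cut-offs (3, 20, 150) should be checked against the paper before being relied on.

```python
def kass_raftery_label(k):
    """Describe the strength of evidence for the numerator hypothesis of K."""
    if k <= 0:
        raise ValueError("a Bayes factor must be positive")
    if k < 1:
        # Evidence points the other way: interpret 1/K for the denominator hypothesis.
        return "favors the denominator hypothesis"
    if k <= 3:
        return "not worth more than a bare mention"
    if k <= 20:
        return "positive"
    if k <= 150:
        return "strong"
    return "very strong"

for k in (0.5, 1.7, 8.0, 60.0, 500.0):
    print(k, "->", kass_raftery_label(k))
```

These labels are conventions for communicating results, not decision rules.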
Bayes factors
• Example: Suppose 10 Bernoulli random variables are observed with S=4. Let
H0: p=0.7. With the Jeffreys prior, what is the Bayes factor K?
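A worked sketch of this example, taking $K$ as the marginal likelihood under $H_1$ divided by that under $H_0$ and using the Beta(1/2, 1/2) form of the Jeffreys prior under $H_1$ (both are assumptions stated here; the numerical values are rounded):

```latex
% Marginal likelihood under H_0: p = 0.7
P(X^n \mid H_0) = 0.7^{4} \cdot 0.3^{6} \approx 1.750 \times 10^{-4}

% Marginal likelihood under H_1 with the Jeffreys prior
P(X^n \mid H_1) = \int_0^1 p^{4} (1 - p)^{6}\, \frac{p^{-1/2} (1 - p)^{-1/2}}{B(1/2, 1/2)}\, dp
               = \frac{B(4.5,\, 6.5)}{B(0.5,\, 0.5)} \approx 2.937 \times 10^{-4}

% Bayes factor
K = \frac{P(X^n \mid H_1)}{P(X^n \mid H_0)} \approx 1.68
```

With equal prior probabilities this corresponds to $P(H_0 \mid X^n) = 1 / (1 + K) \approx 0.37$, so the data favor $H_1$ only weakly.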
Bayesian Testing
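• Bayes factors and posterior probability

$K = \dfrac{P(X^n \mid H_1)}{P(X^n \mid H_0)}$

• The Bayesian approach can give a different answer compared
to the frequentist approach

• An improper prior cannot be used: its arbitrary constant does not
cancel in the marginal likelihood, so K is not well defined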


Summary
• Frequentist Inference vs Bayesian Inference
• Bayesian Method
• Prior
• Posterior
• Estimation of Posterior Distribution
• Non-informative prior
• Flat
• Jeffreys’
• Bayesian Testing
