
Understanding Dirichlet Process Models

The Dirichlet process (DP) is a stochastic process used in Bayesian nonparametric modeling, particularly in infinite mixture models; it is a distribution over distributions. Latent variable models contain unobserved variables and have historical roots in factor analysis. Latent Dirichlet Allocation (LDA) is a generative topic model and an unsupervised learning method that explains similarities in the data through unobserved groups (topics).

Uploaded by

suryau
Copyright
© All Rights Reserved

Dirichlet Process

• The Dirichlet process (DP) is a stochastic process used in Bayesian
nonparametric models of data, particularly in Dirichlet process mixture
models (also known as infinite mixture models).
• It is a distribution over distributions; that is, each draw from a Dirichlet
process is itself a distribution.
• It is called a Dirichlet process because its finite-dimensional marginal
distributions are Dirichlet distributed, just as the Gaussian process, another
popular stochastic process used for Bayesian nonparametric regression, has
Gaussian-distributed finite-dimensional marginal distributions.
• Distributions drawn from a Dirichlet process are discrete, but cannot be
described using a finite number of parameters, hence the classification as a
nonparametric model.
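
The finite-dimensional marginal property can be written out explicitly. The following is a sketch in standard notation; the concentration parameter \alpha, the base distribution G_0, and the partition A_1, ..., A_k are symbols introduced here for illustration and do not appear in the notes above.

```latex
G \sim \mathrm{DP}(\alpha, G_0)
\quad \Longrightarrow \quad
\bigl(G(A_1), \dots, G(A_k)\bigr)
\sim \mathrm{Dirichlet}\bigl(\alpha\, G_0(A_1), \dots, \alpha\, G_0(A_k)\bigr)
```

This is required to hold for every finite measurable partition A_1, ..., A_k of the sample space, which is the sense in which the marginals are "Dirichlet distributed".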
Dirichlet mixture model
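
One way to make the infinite mixture idea concrete is to simulate from it. The sketch below uses the stick-breaking construction, a standard representation of the DP that is not covered in these notes, truncated at a finite number of components. The function name `sample_dp_mixture`, the Gaussian base distribution, the Gaussian likelihood, and all parameter values are assumptions chosen only for illustration.

```python
import random


def sample_dp_mixture(n, alpha, truncation=200, sigma=0.5, seed=0):
    """Draw n points from a (truncated) Dirichlet process mixture of Gaussians."""
    rng = random.Random(seed)

    # Stick-breaking: repeatedly break a Beta(1, alpha) fraction off the
    # remaining stick; each piece is the weight of one mixture component.
    weights, means = [], []
    remaining = 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        means.append(rng.gauss(0.0, 5.0))  # component mean drawn from G0 = N(0, 5^2) (assumed)
        remaining *= 1.0 - v
    weights[-1] += remaining  # fold leftover mass into the last component (truncation)

    # Each observation picks a component by weight, then adds Gaussian noise.
    labels = rng.choices(range(truncation), weights=weights, k=n)
    data = [rng.gauss(means[k], sigma) for k in labels]
    return data, labels


data, labels = sample_dp_mixture(n=500, alpha=1.0)
print("distinct components used:", len(set(labels)))
```

Because a draw from the DP is discrete, many observations share the same component, so the number of distinct components actually used is far smaller than the number of data points even though the model places no fixed limit on it.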
Latent variable modeling
• A latent variable model, as the name suggests, is a statistical model that
contains latent, that is, unobserved, variables.
• The roots of these models go back to Spearman's 1904 seminal work [1] on
factor analysis, which is arguably the first well-articulated latent variable
model to be widely used in psychology, mental health research, and
allied disciplines.
• A latent variable, defined in the broadest manner, is no more
mysterious than an error term in a normal theory linear regression
model or a random effect in a mixed model.
For example
• In a latent variable model for measuring level of depression (the latent
variable of interest), the full range of clinician ratings or self-reported
symptoms of mood disturbance, anhedonia, sleep disturbance, weight
problems, psychomotor problems, and worthlessness or guilt serve as the
observed indicators of the latent variable.
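
The depression example can be mimicked with a small simulation. The sketch below assumes a one-factor linear model in which each observed rating is a loading times the unobserved severity plus independent noise. The function names, loadings, and noise level are hypothetical and are not taken from any real instrument.

```python
import random
import statistics


def simulate_ratings(n_people, loadings, noise_sd=1.0, seed=0):
    """Simulate symptom ratings driven by a single unobserved severity score."""
    rng = random.Random(seed)
    ratings = []
    for _ in range(n_people):
        severity = rng.gauss(0.0, 1.0)  # latent variable: generated but never returned
        # Each observed rating = loading * severity + independent noise.
        ratings.append([lam * severity + rng.gauss(0.0, noise_sd) for lam in loadings])
    return ratings


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Hypothetical loadings for three indicators, e.g. mood, sleep, and guilt ratings.
ratings = simulate_ratings(2000, loadings=[0.9, 0.8, 0.7])
mood = [r[0] for r in ratings]
sleep = [r[1] for r in ratings]
print("correlation between two indicators:", round(pearson(mood, sleep), 2))
```

The indicators are correlated only because they share the latent severity score; the latent variable is what explains the association among the observed ratings.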
Latent Dirichlet Allocation
• Latent Dirichlet Allocation (LDA) is a generative statistical model that
explains a set of observations through unobserved groups, and each group
explains why some parts of the data are similar.
• LDA is an example of a topic model.
• In LDA, observations (e.g., words) are collected into documents, and each
word's presence is attributable to one of the document's topics.
• LDA is an unsupervised learning algorithm that attempts to describe a set
of observations as a mixture of different categories. Each category is
itself a probability distribution over the features.
• LDA is a generative probability model, which means it attempts to provide
a model for the distribution of outputs and inputs based on latent variables.
This is in contrast to discriminative models, which attempt to learn how
inputs map to outputs.
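
The generative story described above can be sketched in a few lines. This is a minimal illustration of the commonly stated LDA generative process, not a fitting algorithm. The vocabulary, the two topic distributions, the Dirichlet parameter, and the helper names are all made up for the example.

```python
import random


def dirichlet(alphas, rng):
    """Sample from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, topic_word, alpha, vocab, rng):
    """Generate one document: topic mixture first, then a topic and a word per position."""
    n_topics = len(topic_word)
    theta = dirichlet([alpha] * n_topics, rng)  # this document's mixture over topics
    topics, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(n_topics), weights=theta)[0]  # latent topic of this word
        w = rng.choices(vocab, weights=topic_word[z])[0]    # word drawn from that topic
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Hypothetical vocabulary and topics: each topic is a distribution over the vocabulary.
vocab = ["gene", "dna", "cell", "ball", "team", "goal"]
topic_word = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # a "biology" topic
    [0.0, 0.0, 0.0, 0.3, 0.4, 0.3],  # a "sports" topic
]

rng = random.Random(0)
theta, topics, words = generate_document(12, topic_word, alpha=0.5, vocab=vocab, rng=rng)
print(words)
```

Only `words` would be observed in practice; `theta` and `topics` are the latent variables that inference in LDA tries to recover from a collection of documents.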
