Deep generative models¶
The form of a directed latent-variable model is:
\(p(x, z) = p(x \mid z)\, p(z)\)
where
\(x \in \mathcal{X}\) are observed (possibly continuous or discrete)
\(z \in \mathbb{R}^d\) are latent
A deep generative model assumes many layers of latent variables:
\(p(x, z_1, \ldots, z_m) = p(x \mid z_1)\, p(z_1 \mid z_2) \cdots p(z_{m-1} \mid z_m)\, p(z_m)\)
This allows us to learn hierarchies of latent representations.
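As a concrete illustration, the sketch below draws samples from a toy two-layer hierarchy by ancestral sampling: first \(z_2\), then \(z_1 \mid z_2\), then \(x \mid z_1\). The linear-Gaussian conditionals, the dimensions, and all variable names are illustrative assumptions, not taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) dimensions and parameters of a 2-layer hierarchy.
d2, d1, dx = 2, 4, 8
W1 = rng.normal(size=(d1, d2))   # parameters of p(z1 | z2)
Wx = rng.normal(size=(dx, d1))   # parameters of p(x | z1)

def sample(n):
    """Ancestral sampling: z2 ~ p(z2), z1 ~ p(z1 | z2), x ~ p(x | z1)."""
    z2 = rng.normal(size=(n, d2))                     # p(z2) = N(0, I)
    z1 = z2 @ W1.T + 0.1 * rng.normal(size=(n, d1))   # p(z1 | z2) = N(W1 z2, 0.1^2 I)
    x = z1 @ Wx.T + 0.1 * rng.normal(size=(n, dx))    # p(x | z1) = N(Wx z1, 0.1^2 I)
    return x

print(sample(5).shape)  # (5, 8)
```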
Learning deep generative models¶
Given a dataset \(D=\{x^1, x^2, \cdots, x^n \}\), we are interested in:
learning the parameters \(\theta\) of \(p\)
approximate posterior inference over \(z\) given \(x\) (i.e., finding the latent factors)
approximate marginal inference over \(x\) when parts of \(x\) are missing (i.e., filling in the missing values)
We make the following assumptions:
Intractability: computing the posterior \(p(z|x)\) is intractable (see the expressions after this list)
Big data: the dataset \(D\) is too large to fit in memory, so we can only work with small sub-samples (mini-batches) of \(D\)
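To make the intractability assumption concrete: both maximum-likelihood learning of \(\theta\) and posterior inference require the marginal likelihood, which integrates over the latent variables,
\[
p(x) = \int p(x \mid z)\, p(z)\, dz, \qquad
p(z \mid x) = \frac{p(x \mid z)\, p(z)}{p(x)}.
\]
When \(p(x \mid z)\) is parameterized by a deep network, this high-dimensional integral has no closed form, so both quantities can only be approximated.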
Traditional approach¶
EM¶
We cannot perform EM because:
the E step requires computing the posterior \(p(z|x)\), which is intractable
in the M step we learn \(\theta\) by looking at the entire dataset, which is too large to fit in memory (both steps are written out after this list)
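For reference, the two EM steps for this model would be
\[
\text{E step:} \quad q(z^i) \gets p(z \mid x^i; \theta_t) \quad \text{for each } x^i \in D,
\]
\[
\text{M step:} \quad \theta_{t+1} \gets \arg\max_{\theta} \sum_{i=1}^{n} \mathbb{E}_{q(z^i)}\!\left[ \log p(x^i, z^i; \theta) \right].
\]
The E step needs the intractable posterior for every data point, and the M step sums over all \(n\) examples, which is exactly what the two assumptions above rule out.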
Mean-field¶
The time complexity of mean-field is exponential in the size of the Markov blanket of the target variable. The Markov blanket of \(z\) is complicated because it contains V-structures, so we would potentially have to condition on all of the \(z\) variables, which is intractable.
Auto-encoding variational Bayes¶
Auto-encoding variational Bayes (AEVB) allows us to perform inference and learning efficiently, as sketched below.
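A minimal sketch of the idea, assuming a standard Gaussian prior, a Gaussian inference network, and a Bernoulli decoder on binarized data (these modelling choices, the layer sizes, and all names are illustrative assumptions, not prescribed by the text above): an inference network approximates the intractable posterior \(p(z|x)\), the reparameterization trick gives low-variance gradients of the ELBO, and training uses mini-batches, addressing both assumptions listed earlier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Toy auto-encoding variational Bayes model (illustrative sizes)."""
    def __init__(self, x_dim=784, z_dim=20, h_dim=400):
        super().__init__()
        # Inference network q(z | x): outputs mean and log-variance of a Gaussian.
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        # Generative network p(x | z): Bernoulli logits over pixels.
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        logits = self.dec(z)
        # Negative ELBO = reconstruction term + KL(q(z|x) || p(z)), with p(z) = N(0, I).
        recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return (recon + kl) / x.shape[0]

# One stochastic gradient step on a random mini-batch (a sub-sample of D).
model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_batch = torch.rand(32, 784).round()   # stand-in for binarized data
loss = model(x_batch)
loss.backward()
opt.step()
```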