Energy based models¶

  • many theoretical results about undirected graphical models depend on the assumption that $\( \forall x, \space \tilde{p}(x) > 0 \)\( this can be enforced by using an energy based model \)\( \tilde {p}(x) = \exp \{ -E(x) \} \)$

    • \(E(x)\) is the energy function, the \(-\) sign optional and it can be absorbed inside \(E(x)\) but it is common to have it, to have notational parity with statistical physics

    • \(\exp(z)\) is always positive thus no energy function will result in zero probability for any state

  • they are simpler to learn than general undirected graphical models, we do not need to constrain clique potentials, because of this we can use unconstrained optimization

  • for most models it is enough to compute the unnormalized \(\tilde{p}_{\text{model}}\)

  • in most cases we have latent variables \(h\) in the model, if we take the negative log we get the free energy form $\( F(x) = - \log \sum_h \exp \{-E(x,h) \} \)$