Bayesian view of the exponential family
Bayesian analysis is considerably simplified if the prior is conjugate to the likelihood. Informally, this means that the prior \(p(\theta| \tau)\) has the same form as the likelihood \(p(D|\theta)\). For this to make sense, we require that the likelihood have finite sufficient statistics, so that we can write \(p(D|\theta ) = p(s(D)| \theta)\). This suggests that the only family of distributions for which conjugate priors exist is the exponential family.
Likelihood
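The likelihood of the exponential family is given by:
\[p(D|\theta) \propto g(\theta)^N \exp\left(\eta(\theta)^T s_N\right)\]
where \(s_N = \sum_{i=1}^N s(x_i)\) is the vector of sufficient statistics, \(\eta(\theta)\) is the natural parameter, and \(g(\theta)\) is the reciprocal of the normalizer.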
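In terms of the canonical parameters \(\eta\) this becomes:
\[p(D|\eta) \propto \exp\left(N \eta^T \bar{s} - N A(\eta)\right)\]
where \(\bar{s} = \frac{1}{N}s_N\) is the average of the sufficient statistics and \(A(\eta)\) is the log partition function.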
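To make the role of the sufficient statistics concrete, here is a minimal sketch for the Bernoulli model, chosen purely for illustration. It assumes \(s(x) = x\), \(g(\theta) = 1 - \theta\) and \(\eta(\theta) = \log\frac{\theta}{1-\theta}\); the function names are hypothetical. Two datasets with the same \(N\) and \(s_N\) should give the same likelihood.

```python
import math


def bernoulli_log_lik(data, theta):
    """Log-likelihood of i.i.d. Bernoulli(theta) data, summed point by point."""
    return sum(math.log(theta) if x == 1 else math.log(1 - theta) for x in data)


def bernoulli_log_lik_from_stats(s_N, N, theta):
    """Same log-likelihood written as N*log g(theta) + eta(theta)*s_N.

    Assumes g(theta) = 1 - theta and eta(theta) = logit(theta).
    """
    eta = math.log(theta / (1 - theta))
    return N * math.log(1 - theta) + eta * s_N


# Two different orderings with the same N = 5 and s_N = 3.
data_a = [1, 0, 1, 1, 0]
data_b = [0, 1, 1, 0, 1]

theta = 0.3
ll_a = bernoulli_log_lik(data_a, theta)
ll_b = bernoulli_log_lik(data_b, theta)
ll_stats = bernoulli_log_lik_from_stats(sum(data_a), len(data_a), theta)
```

All three values agree up to floating-point error, since the data enter the likelihood only through \(N\) and \(s_N\).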
Prior
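The natural conjugate prior has the form:
\[p(\theta|v_0, \tau_0) \propto g(\theta)^{v_0} \exp\left(\eta(\theta)^T \tau_0\right)\]
where \(g(\theta)\) is the likelihood's normalizing factor and \(\eta(\theta)\) its natural parameter. We write \(\tau_0 = v_0 \bar{\tau_0}\) to separate the size of the prior pseudo-data, \(v_0\), from the mean of the sufficient statistics on this pseudo-data, \(\bar{\tau_0}\). The canonical form of the prior then becomes:
\[p(\eta|v_0, \bar{\tau_0}) \propto \exp\left(v_0 \eta^T \bar{\tau_0} - v_0 A(\eta)\right)\]
with \(A(\eta)\) the log partition function.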
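As an illustration of the pseudo-data reading, the sketch below evaluates the natural conjugate prior for a Bernoulli likelihood. It assumes \(g(\theta) = 1 - \theta\) and \(\eta(\theta) = \log\frac{\theta}{1-\theta}\), in which case the prior should reduce to an unnormalized Beta density with parameters \(\tau_0 + 1\) and \(v_0 - \tau_0 + 1\). The function names and the numbers are hypothetical.

```python
import math


def natural_prior_unnorm(theta, v0, tau0):
    """Unnormalized prior g(theta)^v0 * exp(eta(theta) * tau0) for the Bernoulli case."""
    eta = math.log(theta / (1 - theta))
    return (1 - theta) ** v0 * math.exp(eta * tau0)


def beta_unnorm(theta, a, b):
    """Unnormalized Beta(a, b) density."""
    return theta ** (a - 1) * (1 - theta) ** (b - 1)


# Prior worth v0 = 10 pseudo-observations whose mean sufficient statistic is 0.3.
v0 = 10.0
tau_bar0 = 0.3
tau0 = v0 * tau_bar0

# Compare the two unnormalized densities on a small grid of theta values.
grid = [0.1, 0.25, 0.5, 0.75, 0.9]
natural_values = [natural_prior_unnorm(t, v0, tau0) for t in grid]
beta_values = [beta_unnorm(t, tau0 + 1, v0 - tau0 + 1) for t in grid]
```

Under these assumptions the two lists match, so \(v_0\) plays the role of a number of imagined coin flips and \(\bar{\tau_0}\) the fraction of them that came up heads.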
Posterior
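Multiplying the likelihood by the prior gives a posterior of the same form as the prior:
\[p(\theta|D) = p(\theta|v_N, \tau_N) = p(\theta|v_0 + N, \tau_0 + s_N)\]
So we see that we just update the hyper-parameters by adding. In canonical form this becomes:
\[p(\eta|D) \propto \exp\left(\eta^T (v_0 \bar{\tau_0} + N \bar{s}) - (v_0 + N) A(\eta)\right)\]
which, with the same log partition function \(A(\eta)\), is a prior-form distribution with strength \(v_N = v_0 + N\) and mean hyper-parameter
\[\bar{\tau_N} = \frac{v_0 \bar{\tau_0} + N \bar{s}}{v_0 + N}\]
Here we see that the posterior mean hyper-parameter is a convex combination of the prior mean hyper-parameter \(\bar{\tau_0}\) and the average of the sufficient statistics \(\bar{s}\), with weights \(\frac{v_0}{v_0 + N}\) and \(\frac{N}{v_0 + N}\).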
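The additive update and its convex-combination reading can be checked numerically. The following is a minimal sketch for scalar sufficient statistics with \(s(x) = x\) (for instance Bernoulli data); `update_hyperparameters` and the example numbers are hypothetical, not part of any library.

```python
import math


def update_hyperparameters(v0, tau0, data):
    """Conjugate update: add N to the strength and s_N to the statistic."""
    N = len(data)
    s_N = sum(data)  # assumes the sufficient statistic is s(x) = x
    return v0 + N, tau0 + s_N


# Prior: 4 pseudo-observations with mean sufficient statistic 0.5.
v0 = 4.0
tau_bar0 = 0.5
tau0 = v0 * tau_bar0

# Observed data: N = 6 observations, 5 of them equal to 1.
data = [1, 1, 1, 0, 1, 1]
N = len(data)
s_bar = sum(data) / N

vN, tauN = update_hyperparameters(v0, tau0, data)
tau_barN = tauN / vN  # posterior mean hyper-parameter

# Same quantity as a convex combination of the prior mean and the data mean.
weight_prior = v0 / (v0 + N)
convex = weight_prior * tau_bar0 + (1 - weight_prior) * s_bar
```

Both routes give the same posterior mean hyper-parameter. As \(N\) grows, the weight on the prior shrinks and the posterior mean approaches \(\bar{s}\).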