Multinoulli exponential family
\[Cat(x| \theta) = \exp{(\theta^T\phi(x) - A(\theta))}\]
\(\theta = [\log \frac{\mu_1}{\mu_K}, \cdots, \log \frac{\mu_{K-1}}{\mu_K} ]\)
\(\phi(x) = [I(x=1), \cdots, I(x = K -1)]\)
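To see where these choices of \(\theta\) and \(\phi(x)\) come from, here is a short sketch of the derivation. It assumes the usual mean parameterisation \(Cat(x|\mu) = \prod_{k=1}^K \mu_k^{I(x=k)}\) with \(\sum_{k=1}^K \mu_k = 1\), and uses \(I(x=K) = 1 - \sum_{k=1}^{K-1} I(x=k)\):
\[\begin{aligned}
Cat(x|\mu) &= \exp\Big(\sum_{k=1}^{K} I(x=k) \log \mu_k\Big) \\
&= \exp\Big(\sum_{k=1}^{K-1} I(x=k) \log \mu_k + \Big(1 - \sum_{k=1}^{K-1} I(x=k)\Big) \log \mu_K\Big) \\
&= \exp\Big(\sum_{k=1}^{K-1} I(x=k) \log \frac{\mu_k}{\mu_K} + \log \mu_K\Big)
\end{aligned}\]
Matching terms with \(\exp{(\theta^T\phi(x) - A(\theta))}\) gives the \(\theta\) and \(\phi(x)\) above, and leaves \(\log \mu_K\) in the role of \(-A(\theta)\).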
We can recover the mean from the canonical parameters. For \(k = 1, \cdots, K-1\):
\(\mu_k = \frac{e^{\theta_k}}{ 1 + \sum_{j=1}^{K-1} e^{\theta_j}}\)
Since the means sum to one, the remaining component is:
\(\mu_K = 1 - \sum_{k=1}^{K-1} \mu_k = \frac{1}{1 + \sum_{j=1}^{K-1}e^{\theta_j}}\)
Hence, since \(A(\theta) = -\log \mu_K\):
\(A(\theta) = \log(1 + \sum_{k=1}^{K-1} e^{\theta_k})\)
If we define \(\theta_K = 0\), we can write \(\mu = S(\theta)\) and \(A(\theta) = \log \sum_{k=1}^K e^{\theta_k}\), where \(S\) is the softmax function.
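The relations above can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the function names `mean_to_canonical`, `canonical_to_mean` and `log_partition` are hypothetical, and the example vector `mu` is made up for illustration.

```python
import numpy as np

def mean_to_canonical(mu):
    """Map mean parameters mu (length K, summing to 1) to theta (length K-1)."""
    mu = np.asarray(mu, dtype=float)
    # theta_k = log(mu_k / mu_K) for k = 1, ..., K-1
    return np.log(mu[:-1] / mu[-1])

def log_partition(theta):
    """A(theta) = log(1 + sum_k exp(theta_k)), via log-sum-exp with theta_K = 0."""
    theta_full = np.append(np.asarray(theta, dtype=float), 0.0)
    m = theta_full.max()  # subtract the max for numerical stability
    return m + np.log(np.exp(theta_full - m).sum())

def canonical_to_mean(theta):
    """mu = S([theta, 0]), the softmax of theta extended with theta_K = 0."""
    theta_full = np.append(np.asarray(theta, dtype=float), 0.0)
    return np.exp(theta_full - log_partition(theta))

# Illustrative values (an assumption, not taken from the text)
mu = np.array([0.2, 0.3, 0.5])
theta = mean_to_canonical(mu)
mu_rec = canonical_to_mean(theta)   # round trip back to the mean parameters
A = log_partition(theta)            # should equal -log(mu_K)
```

Here `mu_rec` should agree with `mu` up to floating-point error, and `A` should agree with `-np.log(mu[-1])`, which matches the identity \(A(\theta) = -\log \mu_K\).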