Dirichlet Multinomial model
Here we generalize the coin flip, which has only two possible outcomes, to the roll of a \(K\)-sided die.
Likelihood
If we observe \(N\) die rolls \(D = \{ x_1, x_2, \cdots, x_N\}\), where \(x_i \in \{1,2, \cdots, K \}\), and assume the data are i.i.d., the likelihood has the form:
\[ p(D \mid \theta) = \prod_{k=1}^K \theta_k^{N_k} \]
where:
\(N_k = \sum_{i=1}^N I(x_i = k)\) is the number of times event \(k\) occurred (these counts are the sufficient statistics). We ignore the multinomial coefficient because it is a constant factor that does not depend on \(\theta\).
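As a minimal sketch of the quantities above, the following Python computes the counts \(N_k\) from a list of rolls and evaluates the log of \(\prod_k \theta_k^{N_k}\). The function names `face_counts` and `log_likelihood` and the sample data are illustrative assumptions, not part of the original notes.

```python
import math
from collections import Counter

def face_counts(rolls, K):
    """Return [N_1, ..., N_K], the number of times each face 1..K appears."""
    tally = Counter(rolls)
    return [tally.get(k, 0) for k in range(1, K + 1)]

def log_likelihood(counts, theta):
    """log p(D | theta) = sum_k N_k * log(theta_k), without the multinomial coefficient."""
    # Faces that never occurred contribute nothing, so they are skipped.
    return sum(n * math.log(t) for n, t in zip(counts, theta) if n > 0)

rolls = [1, 3, 3, 2, 6, 3, 1]       # hypothetical rolls of a 6-sided die
counts = face_counts(rolls, K=6)    # [2, 1, 3, 0, 0, 1]
print(counts, log_likelihood(counts, [1 / 6] * 6))
```

Working in log space avoids underflow when \(N\) is large, since the raw product of many probabilities quickly becomes too small to represent.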
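Prior
We need to put a prior on the parameter \(\theta\), which is a \(K\)-dimensional probability vector. We use a Dirichlet distribution, whose support is the probability simplex of all such vectors and which is the conjugate prior to the multinomial distribution:
\[ \text{Dir}(\theta \mid \alpha) = \frac{1}{B(\alpha)} \prod_{k=1}^K \theta_k^{\alpha_k - 1} \]
where \(\alpha_k > 0\) and \(B(\alpha)\) is the normalizing constant (the multivariate beta function).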
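To make the support of the prior concrete, here is a small sketch that draws a sample from a Dirichlet distribution using only the Python standard library. It relies on the standard construction of normalizing independent Gamma draws; the helper name `sample_dirichlet` and the parameter values are assumptions chosen for illustration.

```python
import random

def sample_dirichlet(alpha, rng=random):
    """Draw one theta ~ Dir(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)                           # seeded for reproducibility
theta = sample_dirichlet([1.0, 1.0, 1.0], rng)   # alpha_k = 1 is uniform on the simplex
print(theta, sum(theta))
```

Every sample has non-negative entries that sum to one, so each draw is itself a valid set of face probabilities for the die.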
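Posterior
Multiplying the likelihood by the prior and dropping constant factors shows that the posterior is again a Dirichlet:
\[ p(\theta \mid D) \propto \prod_{k=1}^K \theta_k^{N_k + \alpha_k - 1} \propto \text{Dir}(\theta \mid \alpha_1 + N_1, \ldots, \alpha_K + N_K) \]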
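A minimal Python sketch of this conjugate update: the posterior Dirichlet parameters are the prior parameters plus the observed counts. The name `posterior_params` and the numbers are hypothetical.

```python
def posterior_params(alpha, counts):
    """Dirichlet posterior parameters: alpha_k + N_k for each face k."""
    return [a + n for a, n in zip(alpha, counts)]

alpha = [2.0, 2.0, 2.0]        # hypothetical prior pseudo-counts
batch = posterior_params(alpha, [5, 1, 0])

# Updating on the data in two pieces gives the same posterior as one batch update.
step = posterior_params(posterior_params(alpha, [3, 0, 0]), [2, 1, 0])
print(batch, step)
```

This is why the \(\alpha_k\) are often read as pseudo-counts: they enter the posterior exactly as previously observed rolls would.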
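MAP
The MAP estimate is:
\[ \hat{\theta}_k = \frac{N_k + \alpha_k - 1}{N + \alpha_0 - K}, \qquad \alpha_0 = \sum_{k=1}^K \alpha_k \]
which is just the mode of the Dirichlet posterior.
If we set \(\alpha_k = 1\), then we recover the MLE \(\hat{\theta}_k = N_k / N\), the empirical fraction of times face \(k\) shows up.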
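The MAP formula \((N_k + \alpha_k - 1) / (N + \alpha_0 - K)\), with \(\alpha_0 = \sum_k \alpha_k\), can be sketched in a few lines of Python. The function name `map_estimate` and the counts are illustrative assumptions, and the sketch assumes the denominator is positive.

```python
def map_estimate(alpha, counts):
    """Mode of Dir(alpha + counts): (N_k + alpha_k - 1) / (N + alpha_0 - K)."""
    K = len(alpha)
    N = sum(counts)
    alpha0 = sum(alpha)
    denom = N + alpha0 - K      # assumed > 0 in this sketch
    return [(n + a - 1) / denom for n, a in zip(counts, alpha)]

counts = [2, 1, 3, 0, 0, 1]                  # hypothetical counts, N = 7
mle = map_estimate([1] * 6, counts)          # alpha_k = 1 gives N_k / N
smoothed = map_estimate([2] * 6, counts)     # alpha_k = 2 gives (N_k + 1) / (N + K)
print(mle)
print(smoothed)
```

With \(\alpha_k = 1\) the unseen faces get probability zero, whereas \(\alpha_k = 2\) amounts to add-one smoothing and keeps every face at a non-zero probability.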