Mixture density networks
A mixture density network is a neural network whose outputs parameterize a mixture of Gaussians over the target y, conditioned on the input x.
$$p(y \mid x) = \sum_{i=1}^{n} p(c = i \mid x)\, \mathcal{N}\left(y;\ \mu^{(i)}(x),\ \Sigma^{(i)}(x)\right)$$
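$p(c = i \mid x)$ is the mixture prior over the $n$ components (the mixing weights). It can be obtained by applying a softmax to $n$ network outputs, which makes the weights nonnegative and sum to 1.
$\mu^{(i)}(x)$ is the center (mean) of the $i$-th component. If $y$ is $d$-dimensional, the network must produce $n$ vectors of dimension $d$, i.e. $n \times d$ values.
$\Sigma^{(i)}(x)$ is the covariance matrix of the $i$-th component. In general it is chosen to be diagonal, so only $d$ positive variances per component are needed.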
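
To make the three sets of outputs concrete, here is a minimal sketch in plain Python of how a flat vector of raw network outputs could be turned into mixture parameters and used to evaluate $p(y \mid x)$ with diagonal covariances. The function names, the output layout (logits, then means, then log-variances) and the use of `exp` to keep variances positive are illustrative assumptions, not something prescribed by the notes above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_mdn_outputs(raw, n, d):
    """Split a flat output vector of length n + 2*n*d into mixture parameters.

    Assumed (hypothetical) layout: n logits, n*d means, n*d log-variances.
    """
    assert len(raw) == n + 2 * n * d
    # Mixing weights p(c=i|x): softmax over the first n outputs.
    weights = softmax(raw[:n])
    # Means mu^(i)(x): n vectors of dimension d, used as-is.
    means = [raw[n + i * d: n + (i + 1) * d] for i in range(n)]
    # Diagonal of Sigma^(i)(x): exponentiate so every variance is positive.
    offset = n + n * d
    variances = [
        [math.exp(v) for v in raw[offset + i * d: offset + (i + 1) * d]]
        for i in range(n)
    ]
    return weights, means, variances


def diag_gaussian_pdf(y, mean, var):
    """Density of N(y; mean, diag(var)), accumulated in log space."""
    log_p = 0.0
    for y_k, m_k, v_k in zip(y, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * v_k) + (y_k - m_k) ** 2 / v_k)
    return math.exp(log_p)


def mdn_density(y, weights, means, variances):
    """p(y|x) as the weighted sum of the component densities."""
    return sum(
        w * diag_gaussian_pdf(y, m, v)
        for w, m, v in zip(weights, means, variances)
    )


# Two components (n=2) over a scalar target (d=1).
n, d = 2, 1
raw = [0.0, 0.0, -1.0, 1.0, 0.0, 0.0]  # stand-in for the network's raw outputs
weights, means, variances = split_mdn_outputs(raw, n, d)
p = mdn_density([0.0], weights, means, variances)
print(weights, means, variances, round(p, 4))
```

In this toy case the two components have equal weight, means -1 and 1, and unit variance. Training such a network would typically minimize the negative log of `mdn_density` over the data, which this sketch does not cover.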