Higher-order derivatives in neural networks

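In deep learning, the Hessian of the loss is usually infeasible to represent explicitly: a model with n parameters has an n × n Hessian, so storage grows quadratically with the parameter count. Products of the Hessian with a vector, by contrast, can be computed without ever forming the matrix.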
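To make the point concrete, here is a minimal sketch of a Hessian-vector product that never builds the Hessian. It uses a central finite difference of gradients, which is only one possible approach (automatic differentiation frameworks offer exact alternatives). The toy loss, the hand-written `grad` function, and the helper name `hvp` are assumptions made for illustration; they do not come from the original notes.

```python
# Sketch only: approximate H(w) @ v from two gradient evaluations.
# The toy loss f(w) = sum(w_i**4) / 4 + w[0] * w[1] is hypothetical.

def grad(w):
    """Hand-written gradient of the toy loss."""
    g = [wi ** 3 for wi in w]
    g[0] += w[1]
    g[1] += w[0]
    return g

def hvp(grad_fn, w, v, eps=1e-5):
    """Central-difference approximation of the Hessian-vector product.

    H(w) v ~= (grad(w + eps*v) - grad(w - eps*v)) / (2*eps)
    Cost: two gradient evaluations, O(n) memory, no n x n matrix.
    """
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus = grad_fn(w_plus)
    g_minus = grad_fn(w_minus)
    return [(gp - gm) / (2 * eps) for gp, gm in zip(g_plus, g_minus)]

w = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0]
hv = hvp(grad, w, v)
# For this toy loss the exact Hessian is [[3, 1, 0], [1, 12, 0], [0, 0, 27]],
# so the exact product H v is [3, 1, -27]; hv matches it up to a small
# finite-difference error.
```

The step size `eps` trades truncation error against floating-point round-off, so the value used here is a reasonable default for this toy problem rather than a general recommendation.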

Krylov methods

Krylov methods are iterative techniques for performing operations such as approximating a matrix inverse (by solving linear systems) and approximating eigenvectors and eigenvalues, using only matrix-vector products.
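As one illustration of a Krylov method, the sketch below uses conjugate gradient to solve A x = b, which amounts to applying an approximate inverse of A to b. The solver touches A only through a `matvec` callable, so A is never stored. Conjugate gradient assumes A is symmetric positive definite; the small tridiagonal matrix and all function names here are hypothetical choices for the example.

```python
# Sketch only: conjugate gradient driven purely by matrix-vector products.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Approximately solve A x = b for symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)     # the only place A is used
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def matvec(v):
    """Apply A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]] without storing it."""
    return [4 * v[0] + v[1], v[0] + 3 * v[1] + v[2], v[1] + 2 * v[2]]

b = [1.0, 2.0, 3.0]
x = conjugate_gradient(matvec, b)
residual = [bi - ai for bi, ai in zip(b, matvec(x))]
```

In a neural-network setting, `matvec` could be replaced by a Hessian-vector product, provided the curvature matrix involved is positive definite (for example after adding damping); that substitution is a suggestion here, not something the notes specify.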

Beta function Metropolis-Hastings proposal distribution