Neural network as a universal approximator

Given enough hidden units and at least one layer with a nonlinear activation, a neural network can approximate any continuous function on a bounded domain with arbitrarily low error (the universal approximation theorem). However, it may still fail in practice if

  • the learning algorithm won't find the correct weights

  • the learning algorithm overfits the training data

A small numerical sketch of the approximation claim follows this list.
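
To make the claim concrete, here is a minimal sketch assuming only NumPy: a single hidden layer of tanh units trained with plain full-batch gradient descent to approximate sin(x). The hidden-unit count, learning rate, and step count are illustrative, untuned choices; the point is only that a one-hidden-layer nonlinear network can drive the error down, and that more hidden units generally allow a closer fit.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)   # inputs
y = np.sin(x)                                        # target function to approximate

n_hidden = 32                          # illustrative choice, not a tuned value
W1 = rng.normal(0.0, 1.0, (1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 1.0, (n_hidden, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # forward pass: one nonlinear (tanh) hidden layer, linear output
    h = np.tanh(x @ W1 + b1)
    y_hat = h @ W2 + b2
    err = y_hat - y                    # gradient of 0.5 * MSE w.r.t. y_hat

    # backward pass: manual gradients for this tiny two-layer network
    grad_W2 = h.T @ err / len(x)
    grad_b2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    grad_W1 = x.T @ dh / len(x)
    grad_b1 = dh.mean(axis=0)

    # plain gradient descent update
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

# the mean squared error should shrink toward zero as units and steps increase
print("final MSE:", float(((y_hat - y) ** 2).mean()))
```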

In general, it is advisable to make a network deeper rather than wider; depth can reduce the number of units needed to reach a given accuracy. A rough parameter-count comparison is sketched below.
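
The sketch below only compares parameter budgets, not accuracy, and the layer sizes are hypothetical examples: a single very wide hidden layer versus a stack of narrower ones. It illustrates how a deep, narrow architecture can use far fewer weights than a shallow, wide one; whether it also reaches better accuracy depends on the task.

```python
def mlp_params(layer_sizes):
    """Total number of weights and biases in a fully connected network
    with the given sequence of layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

wide_shallow = [784, 4096, 10]           # one very wide hidden layer
deep_narrow = [784, 256, 256, 256, 10]   # several narrower hidden layers

print("wide-shallow params:", mlp_params(wide_shallow))  # ~3.3M parameters
print("deep-narrow params: ", mlp_params(deep_narrow))   # ~0.34M parameters
```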

By choosing a deep model, we express the belief that the function to be learned is a composition of simpler functions: learning consists of discovering a set of underlying factors of variation that can themselves be expressed in terms of other, simpler underlying factors of variation.