Teacher Forcing
During training, the model receives the ground-truth output \(y^{(t)}\) as its input at time step \(t+1\).

At test time the ground truth is unavailable, so only the model's own predictions, computed from the hidden units, are propagated to the next step.
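
The difference between the two modes can be illustrated with a minimal sketch. It assumes a toy scalar recurrent model; the function names (`rnn_step`, `run_teacher_forced`, `run_free_running`) and the weights `w_h`, `w_x`, `w_o` are hypothetical and chosen only for illustration, not taken from any particular library.

```python
import math


def rnn_step(h_prev, x, w_h=0.5, w_x=1.0, w_o=2.0):
    """One step of a toy scalar RNN (hypothetical weights).

    Returns the new hidden state and the output computed from it.
    """
    h = math.tanh(w_h * h_prev + w_x * x)
    return h, w_o * h


def run_teacher_forced(targets, x0=1.0, h0=0.0):
    """Training-style unroll: the input at step t+1 is the ground truth y^(t)."""
    h, x, outputs = h0, x0, []
    for y_true in targets:
        h, y_pred = rnn_step(h, x)
        outputs.append(y_pred)
        x = y_true  # feed the ground truth, not the prediction
    return outputs


def run_free_running(n_steps, x0=1.0, h0=0.0):
    """Test-style unroll: the input at step t+1 is the model's own prediction."""
    h, x, outputs = h0, x0, []
    for _ in range(n_steps):
        h, y_pred = rnn_step(h, x)
        outputs.append(y_pred)
        x = y_pred  # no ground truth available, so feed the prediction back
    return outputs


targets = [1.0, 0.5, -0.5]  # made-up ground-truth sequence
forced = run_teacher_forced(targets)
free = run_free_running(len(targets))
```

Both runs start from the same input, so `forced[0]` and `free[0]` agree; from the second step on they diverge, because one run is conditioned on the ground truth and the other on its own earlier output.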