Sum of squares

RSS(w) = \sum_{i=1}^N (y_i - \hat{y}_i)^2
  • \hat{y}_i is the predicted value for the i-th example

It can be viewed as the squared l_2 norm of the error vector \epsilon:

RSS(w) = ||\epsilon||_2^2 = \sum_{i=1}^N \epsilon_i^2

  • \epsilon_i = y_i - \hat{y}_i is the residual (error) for the i-th example
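
The definition above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name rss and the sample values are assumptions made for the example, not part of the original notes.

```python
def rss(y, y_hat):
    """Residual sum of squares between targets y and predictions y_hat.

    Hypothetical helper for illustration; both arguments are
    equal-length sequences of numbers.
    """
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    # epsilon_i = y_i - y_hat_i
    residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]
    # Squared l_2 norm of the residual vector: sum of epsilon_i^2
    return sum(e * e for e in residuals)


# Made-up example values: the residuals are 1, 0 and -2.
y = [3.0, 5.0, 7.0]
y_hat = [2.0, 5.0, 9.0]
print(rss(y, y_hat))  # 1 + 0 + 4 = 5.0
```

Because each residual is squared, positive and negative errors contribute equally, and larger errors are penalized more heavily than smaller ones.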