Regularization

Regularization:
  1. To avoid overfitting the data, we add a regularization parameter lambda, which penalizes large model weights.
  2. If lambda is very high, the penalty term dominates the training data, causing the model to underfit with high bias.
  3. If lambda is very low or equal to 0, there is effectively no regularization and the model overfits the training data, causing high variance.
  4. To select a good value of lambda, we want the training error to be low while the validation error stays reasonable. Plotting both errors against lambda, we pick the optimum at the point where the validation error is lowest, just before it starts rising well above the training error.
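The selection procedure in step 4 can be sketched with ridge (L2-regularized) regression, which has a closed-form solution. This is a minimal illustration, not a prescribed implementation: the synthetic data, the candidate lambda grid, and the helper names (`ridge_fit`, `mse`) are all assumptions made for the example. We fit once per lambda, record training and validation error, and keep the lambda with the lowest validation error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: many features but only a few informative ones,
# so an unregularized fit tends to overfit the training set.
n_train, n_val, n_features = 30, 30, 20
w_true = np.zeros(n_features)
w_true[:3] = [2.0, -1.0, 0.5]          # only 3 informative features
X_train = rng.normal(size=(n_train, n_features))
X_val = rng.normal(size=(n_val, n_features))
y_train = X_train @ w_true + rng.normal(scale=0.5, size=n_train)
y_val = X_val @ w_true + rng.normal(scale=0.5, size=n_val)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam*I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def mse(X, y, w):
    """Mean squared error of predictions X @ w against targets y."""
    return np.mean((X @ w - y) ** 2)

# Candidate lambdas spanning "no regularization" to "very heavy".
lambdas = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
results = []
for lam in lambdas:
    w = ridge_fit(X_train, y_train, lam)
    results.append((lam, mse(X_train, y_train, w), mse(X_val, y_val, w)))

for lam, tr, va in results:
    print(f"lambda={lam:8.2f}  train MSE={tr:.3f}  val MSE={va:.3f}")

# Choose the lambda that minimizes validation error.
best_lam = min(results, key=lambda r: r[2])[0]
print("best lambda by validation error:", best_lam)
```

Printing the table makes the curve from step 4 visible in numbers: training error grows as lambda increases, while validation error first falls (less overfitting) and then rises again once the model starts underfitting.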