The dangers of overfitting

If you've ever trained a machine learning model, you have faced the infamous problem of overfitting. It seems so unfair: after some point, no matter how hard you try to familiarize your model with your training data, the accuracy on the test data stalls or, even worse, decreases.

This phenomenon is known as overfitting. It happens when your model has extracted as much as it can from the patterns in the data and has instead started learning the noise. To illustrate this, try adding some new data points to the regression model below by clicking within the canvas. If you simulate data that forms a noisy linear relation, you will see how the red curve (which has enough parameters to fit your points perfectly) adapts to the noise in your data, while the blue line captures the true, more general, linear relation.

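The same effect can be reproduced outside the canvas. The sketch below (an illustrative example, not the code behind the demo; the data, degrees, and noise level are my own choices) fits a noisy linear relation with both a straight line and a degree-9 polynomial, which has enough coefficients to pass through all ten training points exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a noisy linear relation: y = 2x + 1 + Gaussian noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + 1 + rng.normal(scale=0.3, size=x_train.size)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + 1 + rng.normal(scale=0.3, size=x_test.size)

def fit_and_eval(degree):
    # Least-squares polynomial fit of the given degree.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# Degree 9 interpolates the 10 training points; degree 1 is the "blue line".
for degree in (1, 9):
    train_mse, test_mse = fit_and_eval(degree)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Running this, the degree-9 fit drives the training error to essentially zero while its test error is worse than the humble straight line's: it has memorized the noise.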

The moral of the story is simple: be careful with the number of parameters in your model! You need a model with enough learnable parameters to capture the patterns in your data (preventing underfitting), but you must not overshoot by too much, or your model will generalize poorly to unseen data. Finding this balance is usually difficult, and several other factors are at play: for instance, it's crucial to stop training when the loss on the validation data starts increasing steadily.
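That last piece of advice is usually implemented as patience-based early stopping. Here is a minimal sketch (the function name, signature, and patience value are hypothetical, not from any particular library): track the epoch with the best validation loss so far, and stop once a fixed number of epochs pass without improvement.

```python
def early_stopping(val_losses, patience=3):
    """Return the index of the best epoch once `patience` epochs have
    passed without the validation loss improving, or None if training
    should continue."""
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < val_losses[best_epoch]:
            best_epoch = epoch  # new best validation loss
        elif epoch - best_epoch >= patience:
            return best_epoch   # no improvement for `patience` epochs
    return None

# The loss bottoms out at epoch 2, then rises steadily: stop and
# keep the epoch-2 weights.
print(early_stopping([1.0, 0.8, 0.7, 0.75, 0.8, 0.9]))  # 2
```

In practice you would also checkpoint the model's weights at each new best epoch, so that stopping means restoring the best checkpoint rather than keeping the final, overfit one.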


Created by AdrianDoM / @cakewalkingdot