Hacker News
by perforator 790 days ago
If the model is properly regularised, it can be trained indefinitely without overfitting. For example, you can add adversarial perturbations to the images and keep training a vision model for a very long time.
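To make the adversarial-perturbation idea concrete, here is a minimal sketch using the fast gradient sign method (FGSM) on a toy logistic-regression "model". FGSM is just one common way to build such perturbations and may not be what the comment had in mind; the weights, input values, and `eps` are made-up illustrative numbers, and `fgsm_perturb` and `log_loss` are hypothetical helper names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm_perturb(x, y, w, b, eps):
    """Shift each input feature by eps in the direction that increases the loss.

    For logistic regression with label y in {0, 1}, the gradient of the
    log loss with respect to the input x is (p - y) * w.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical fixed weights and a two-"pixel" input with label 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
print("clean loss:      ", log_loss(x, y, w, b))
print("adversarial loss:", log_loss(x_adv, y, w, b))
```

In adversarial training, examples like `x_adv` would be regenerated from the current weights at every step and fed back in as training data, so the model keeps seeing fresh, harder inputs.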

I don't know whether current LLM architectures use any explicit regularisation, or whether it is an intrinsic part of them.

1 comment

Several forms of regularization are used: L1/L2 penalties, obviously, and also dropout. They are not as effective as scaling or perturbing patches in image space.
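For reference, a rough sketch of what the L1/L2 penalties and dropout mentioned above compute, in plain Python. The function names, the coefficient `lam`, and the sample values are all illustrative assumptions, and the dropout shown is the "inverted" variant (survivors are rescaled at training time), which is one common formulation rather than the only one.

```python
import random


def l1_penalty(weights, lam):
    """L1 term added to the loss: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)


def dropout(activations, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop and
    scale the survivors by 1 / (1 - p_drop) to keep the expected value."""
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)          # fixed seed so runs are repeatable
weights = [0.5, -1.0, 2.0]      # made-up weights

print("L1 penalty:", l1_penalty(weights, lam=0.01))
print("L2 penalty:", l2_penalty(weights, lam=0.01))
print("dropout:   ", dropout([1.0] * 8, p_drop=0.5, rng=rng))
```

The penalties are added to the training loss, while dropout is applied to activations during training only and switched off at inference.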