| HN Mirror


	by baobabKoodaa 1070 days ago
	I have personally never seen a situation where more training data (of similar quality) causes the model to perform worse. Have you seen such a situation? Please provide an example. Your suggestion of running 1000 training runs with different subsets of data sounds excessive and unnecessary to me.

2 comments

nightski 1070 days ago

You have to know when to stop training. How are you going to do that without a test set? How do you know when you have achieved generalization without over-fitting?

wedesoft 1070 days ago

Early stopping is just one form of regularization. You can use L2 regularization or dropout instead, and then train until your model converges.
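
To make the "train until convergence" idea concrete, here is a minimal sketch on a toy linear model, assuming plain gradient descent with an L2 penalty. The function name `ridge_gd`, the synthetic data, and all hyperparameters are made up for illustration; dropout is not shown. The stopping rule looks only at the training objective, so no held-out set is needed to decide when to stop.

```python
import numpy as np

def ridge_gd(X, y, l2=0.1, lr=0.05, tol=1e-10, max_steps=100_000):
    """Gradient descent on mean squared error + l2 * ||w||^2.

    Stops when the weights stop moving (convergence), not when a
    validation score starts to degrade (early stopping).
    """
    w = np.zeros(X.shape[1])
    for step in range(max_steps):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * l2 * w
        w_new = w - lr * grad
        if np.linalg.norm(w_new - w) < tol:
            return w_new, step + 1
        w = w_new
    return w, max_steps

# Hypothetical synthetic regression problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.5 * rng.normal(size=100)

w_gd, steps = ridge_gd(X, y)

# Closed-form minimizer of the same penalized objective, for comparison.
n = len(y)
w_closed = np.linalg.solve(X.T @ X / n + 0.1 * np.eye(5), X.T @ y / n)
```

On this toy problem the penalized objective has a unique minimum, so running to convergence lands on the same weights as the closed-form solution. Whether that carries over to a deep network depends on the model and the strength of the regularization.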

baobabKoodaa 1070 days ago

Usually I develop models with a train/validation/test split, where I'm measuring results on the validation set to decide the appropriate number of epochs to use. Then I burn the test set to evaluate performance. Then I train from scratch on the entire dataset (no split) and I use the same number of epochs to train here. Is this number of epochs optimal when the dataset is different? Of course not. But when you use regularization and other methods to combat overfitting appropriately, your training is not going to be overly sensitive to changes in epoch number anyway.
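
The workflow described above can be sketched roughly as follows. This is only an illustration on a toy linear model with synthetic data; the helper names (`make_data`, `train`, `mse`), the split sizes, and the epoch candidates are all assumptions, not the commenter's actual code.

```python
import numpy as np

def make_data(n, rng, noise=0.5):
    # Hypothetical synthetic regression data.
    X = rng.normal(size=(n, 5))
    w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    return X, X @ w_true + noise * rng.normal(size=n)

def train(X, y, epochs, lr=0.05, l2=1e-3):
    # Full-batch gradient descent with a small L2 penalty; always starts from scratch.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * l2 * w
        w -= lr * grad
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

rng = np.random.default_rng(0)
X, y = make_data(300, rng)
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:250], y[200:250]
X_test, y_test = X[250:], y[250:]

# 1. Pick the number of epochs using the validation set only.
candidates = [1, 5, 10, 20, 50, 100]
val_scores = {e: mse(X_val, y_val, train(X_train, y_train, e)) for e in candidates}
best_epochs = min(val_scores, key=val_scores.get)

# 2. "Burn" the test set once to estimate performance.
test_mse = mse(X_test, y_test, train(X_train, y_train, best_epochs))

# 3. Retrain from scratch on the entire dataset with the same epoch count.
final_w = train(X, y, best_epochs)
```

The final model in step 3 has no held-out score of its own; `test_mse` from step 2 serves as the estimate, on the assumption that extra data of similar quality does not hurt.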

peterlk 1070 days ago

In the case of fine-tuning, you can end up with catastrophic forgetting. Architecture can influence how well a model scales with data, and adding data doesn’t always improve performance.
