HN Mirror


	by zeruh 919 days ago
	Wouldn't it result in overfitting?

2 comments

triyambakam 919 days ago

The details of how exactly they may have used it to train their model are vague. I believe transfer learning and knowledge distillation are both valid techniques that build on inference output from other models.
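
For readers unfamiliar with the term, here is a minimal sketch of the classic soft-label form of knowledge distillation, where the student is pushed toward the teacher's output distribution. The function names, the logits, and the temperature value are made up for illustration, and this is not a claim about what was actually done in the case discussed here. It also assumes access to the teacher's logits, which a text-only API may not expose; in that setting the teacher's generated text would be used as ordinary training targets instead.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]
diverging_student = [0.1, 1.0, 2.0]

loss_same = distillation_loss(matching_student, teacher)
loss_diff = distillation_loss(diverging_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge; a real training loop would minimize it with gradient descent.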

kevsim 919 days ago

I would also think it'd be an incredibly expensive way to train a model.

ReptileMan 919 days ago

Depends. I wonder what the minimum reasonable number of distinct tokens needed to lift the weights is.

Jensson 919 days ago

You store the output from ChatGPT; you don't run it again every time you do a training step. Generating millions of examples to add to your own training data won't cost much at all, relatively speaking.
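
A rough sketch of the store-once, reuse-many-times idea, assuming a hypothetical `teacher` callable that stands in for the expensive model call. `fake_teacher`, the file name, and the JSONL layout are illustrative choices, not anything described in the thread. The point is only that the number of teacher calls equals the number of prompts, no matter how many epochs read the cache.

```python
import json
import os
import tempfile


def build_cache(prompts, teacher, path):
    """Query the teacher once per prompt and store the pairs as JSON lines."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            record = {"prompt": prompt, "completion": teacher(prompt)}
            f.write(json.dumps(record) + "\n")


def load_cache(path):
    """Read the stored prompt/completion pairs back from disk."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


calls = 0


def fake_teacher(prompt):
    """Hypothetical stand-in for a paid model call; counts invocations."""
    global calls
    calls += 1
    return prompt.upper()


prompts = ["a", "b", "c"]
cache_path = os.path.join(tempfile.mkdtemp(), "teacher_outputs.jsonl")
build_cache(prompts, fake_teacher, cache_path)  # the only teacher calls

epochs = 5
steps = 0
for _ in range(epochs):
    for example in load_cache(cache_path):
        steps += 1  # a real training step on `example` would go here
```

Here the teacher is called three times while fifteen training steps consume its output, so the generation cost is paid once up front rather than per step.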