Hacker News
by fnordpiglet 1121 days ago
It does make me wonder what the converged fixed point of this technique is. If I fine-tune with GPT-4 to make model A, which then performs better than GPT-4, and then fine-tune model B with A, at what point do artifacting or diminishing returns set in?
2 comments

GPT-4 is powerful across a diverse set of tasks. They use it to build a model that is better at a narrow sub-task. Pretty sure that model is worse than GPT-4 at everything else.
Yeah, but there have also been papers on general-purpose LLMs fine-tuned on GPT-4 output instead of human-written data. Even in the narrow space the question remains: if I build a model for task X using GPT-4 and it’s superior at X, can I then use my new model to train another model on X and continue to see benefits?
For another possibility, see https://arxiv.org/abs/2305.15717. The new models may not actually be better - the evaluation may be broken.