HN Mirror



	by htrp 807 days ago
	Did we ever get confirmation that GPT-4 was a fresh training run, as opposed to continued, increasingly complex training on more tokens starting from the base GPT-3 models?


gpt-4 was indeed trained on gpt-3 instruct series (davinci, specifically). gpt-4 was never a newly trained model

what are you talking about? you are wrong, for the record

They have pretty much admitted that GPT-4 is a bunch of 3.5s in a trenchcoat.

They have not. You probably read "MoE" and some pop article about what that means without having any clue.

If you know better, it would be nice of you to provide the correct information, and not just contradict things.

gpt-4 is a sparse MoE model with ~1.2T params. this is all public knowledge and immediately precludes the two previous commenters' assertions