| HN Mirror



	by skepticATX 341 days ago
	OpenAI’s systems haven’t been pure language models since the o-series models though, right? Their RL approach may very well still generalize, but it’s not just a big pre-trained model that is one-shotting these problems. The key difference is that they claim not to have used any verifiers.


beering 340 days ago

What do you mean by “pure language model”? The reasoning step is still just the LLM spitting out tokens, and this was confirmed by DeepSeek replicating the o models. There’s also no proof verifier or anything similar running alongside it, according to the OpenAI researchers.

If you mean pure as in there’s no additional training beyond the pretraining, I don’t think any model has been pure since GPT-3.5.


gallerdude 340 days ago

For local models, you can get just the pretrained versions, with no RLHF. IIRC both Llama and Gemma make them available.
