by beering 338 days ago
What do you mean by “pure language model”? The reasoning step is still just the LLM spitting out tokens, and this was confirmed by DeepSeek replicating the o-series models. According to the OpenAI researchers, there isn’t a proof verifier or anything similar running alongside it.

If you mean pure as in there’s no additional training beyond pretraining, I don’t think any model has been pure since GPT-3.5.

1 comment

With local models, you can get just the pretrained versions, with no RLHF. IIRC both Llama and Gemma make them available.