by LeifCarrotson
255 days ago
LLMs do use simple "word prediction" in the pretraining step, where they just ingest huge quantities of existing data. But that's not what LLM companies are shipping to end users. After pretraining, ChatGPT/Claude/Gemini/etc. go through additional training: supervised fine-tuning, plus reinforcement learning with rewards that come either from human feedback (RLHF) or from automatically checkable outcomes (RLVR, "verifiable rewards"). Whether that fine-tuning and reward-driven training gives them real "intelligence" is open to interpretation, but it's not 100% plagiarism.
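To make the distinction concrete, here's a rough toy sketch of the two kinds of training signal. It's not how any vendor actually implements this; the function names (`next_token_loss`, `verifiable_reward`) and the tiny probability table are made up for illustration.

```python
import math

def next_token_loss(predicted_probs, actual_next_token):
    """Pretraining-style signal: negative log-probability the model
    assigned to the token that actually came next in the data."""
    return -math.log(predicted_probs[actual_next_token])

def verifiable_reward(model_answer, reference_answer):
    """RLVR-style signal (hypothetical checker): 1.0 if the final answer
    matches a known-correct result, else 0.0."""
    return 1.0 if model_answer.strip() == reference_answer.strip() else 0.0

# Toy next-word distribution after "The cat sat on the" (invented numbers).
probs = {"mat": 0.5, "sofa": 0.25, "moon": 0.25}

loss = next_token_loss(probs, "mat")      # penalizes deviating from the existing text
reward = verifiable_reward(" 42 ", "42")  # scores the outcome, not the wording
```

The first function only cares about reproducing the data; the second only cares whether the result checks out, however the model got there. RLHF would swap the checker for a score derived from human preferences.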