


	by HeavyStorm 87 days ago
	There's no "just" in RL. Fine-tuning is very important and can make a lot of difference.

2 comments

lukaslalinsky 86 days ago

Indeed, this is quite obvious when comparing Claude models with Gemini. I fully believe Gemini is the more powerful model, but its post-training process is nowhere near what Anthropic does, which results in Gemini being horrible at coding sessions, while Claude is excellent.


merlindru 87 days ago

apparently GPT-5 uses the same pretrain as 4o did, hah
