by never_inline 247 days ago
I think too much RLHF is done on small-scale, tutorial-ish examples.

LLMs often write tutorial-ish code without much care for how it integrates with the rest of the codebase.

Swallowing exceptions is one such example.
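
A rough sketch of what that looks like in Python. The function names and the config-loading scenario are made up for illustration, not taken from any real codebase. The first version is the tutorial-style pattern: any failure is hidden and the caller gets an empty dict. The second logs and re-raises, so the surrounding code can decide how to handle it.

```python
import json
import logging

logger = logging.getLogger(__name__)


def load_config_swallowing(text):
    """Tutorial-style (hypothetical): every failure is hidden from the caller."""
    try:
        return json.loads(text)
    except Exception:
        # The error vanishes here; the caller cannot tell "empty config"
        # apart from "config was broken".
        return {}


def load_config_propagating(text):
    """Alternative (hypothetical): record context, then let the caller decide."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Log the traceback, then re-raise the same exception unchanged.
        logger.exception("config is not valid JSON")
        raise
```

With the first version, a typo in the config silently becomes "no settings", and the failure shows up somewhere far away. With the second, the bad input fails loudly at the point where it was read.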