by measurablefunc
146 days ago
None of those are novel domains w/ their own novel syntax & semantic validators, not to mention the dearth of readily available sources of examples for sampling the baselines. So again, where does it say it works for a programming language with nothing but a grammar & a compiler?
|
> There is no RL for programming languages.
and
> Either RL works & you have evidence
This is just so completely wrong, and here is the evidence.
I think everyone in this thread is just surprised you don't seem to know this.
Haven't you seen the hundreds of job ads for people to write code for LLMs to train on?