Hacker News
by derefr 641 days ago
For coding problems specifically, you could get quite far by giving the model tool-use access to a sandboxed compiler/interpreter (perhaps even with your project files already loaded into the sandbox), and then training the model to test its own proposed solutions in the sandbox and revise them until they actually produce the expected outputs.
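To make the test-and-revise loop concrete, here is a rough Python sketch of the inference-time half of the idea (the training half is not shown). Everything named here is hypothetical: `run_candidate`, `solve_with_feedback`, and the `propose` callback standing in for the model are made up for illustration, and `fake_model` is a stub rather than a real model. Note that a child process with a timeout is only a stand-in for a sandbox; it provides no real isolation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def run_candidate(source: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run candidate code in a child Python process.

    Returns (ok, text): stdout on success, stderr or a timeout note on failure.
    ASSUMPTION: a real setup would use an isolated sandbox (container, VM, ...);
    a plain subprocess is NOT a security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(source)
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr
    return True, proc.stdout


def solve_with_feedback(
    propose: Callable[[Optional[str]], str],
    expected_output: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask `propose` for code, run it, and feed mismatches back until it passes.

    `propose` is a hypothetical stand-in for the model: it receives the previous
    attempt's feedback (None on the first call) and returns new source code.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        source = propose(feedback)
        ok, output = run_candidate(source)
        if ok and output.strip() == expected_output.strip():
            return source
        feedback = f"expected {expected_output!r}, got {output!r}"
    return None  # gave up after max_attempts


def fake_model(feedback: Optional[str]) -> str:
    """Stub model: gets operator precedence wrong first, then corrects itself."""
    if feedback is None:
        return "print(2 + 2 * 10)"
    return "print((2 + 2) * 10)"
```

With the stub above, `solve_with_feedback(fake_model, "40")` rejects the first attempt, passes the mismatch back as feedback, and returns the second attempt. Comparing stdout against a single expected string is the simplest possible check; running the project's actual test suite inside the sandbox would be the more realistic version.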