by f311a 129 days ago
They are trained heavily to transpile code between languages, and they do it well because this kind of task can be trained with RL.

You can force the agent not to use unsafe, and that is why it burned $20,000: thousands of attempts against good tests with well-defined boundaries.
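
To make the "no unsafe, retry until the tests pass" idea concrete, here is a rough sketch of such a gate. Everything in it is hypothetical: `accept`, `first_accepted`, and the `run_tests` callback are names made up for illustration, and the text scan for the `unsafe` keyword is a crude stand-in for whatever check the real setup used. If the target is Rust, a crate-level `#![forbid(unsafe_code)]` would be a sturdier way to enforce the same rule.

```python
import re
from typing import Callable, Iterable, Optional, Tuple

# Crude check: matches the bare keyword anywhere, including in comments.
UNSAFE_PATTERN = re.compile(r"\bunsafe\b")


def accept(candidate: str, run_tests: Callable[[str], bool]) -> bool:
    """Accept a candidate only if it avoids `unsafe` and passes the tests."""
    if UNSAFE_PATTERN.search(candidate):
        return False
    return run_tests(candidate)


def first_accepted(
    candidates: Iterable[str],
    run_tests: Callable[[str], bool],
) -> Tuple[Optional[str], int]:
    """Return the first accepted candidate and the number of attempts made."""
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if accept(candidate, run_tests):
            return candidate, attempts
    return None, attempts
```

Every rejected candidate still costs a full generation, so a strict gate plus a strict test suite is exactly the kind of setup where the attempt count, and the bill, climbs fast.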