by mcbuilder 336 days ago
This article is complete hype. It only offers the idea of "replication training", which is just a vague form of agentic distributed RL. Multi-agent distributed reinforcement learning algorithms have been in the literature for a while. For the current state of the art in agentic distributed RL, I suggest studying what DeepMind is doing.

I didn’t think it was vague. Given an existing piece of software, write a detailed spec on what it does and then reward the model for matching its performance.
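
To make that concrete, here is a rough sketch of what "reward the model for matching its performance" might look like. This is my own toy illustration, not anything taken from the article: `replication_reward`, `reference_sort`, and `candidate_sort` are hypothetical names, and the reward is simply the fraction of test inputs on which a candidate implementation reproduces the reference program's output.

```python
from typing import Any, Callable, Iterable


def replication_reward(
    reference: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> float:
    """Return the fraction of inputs on which candidate matches reference.

    Hypothetical reward function: the existing software (reference) acts
    as the oracle, and the model-written code (candidate) is scored on
    behavioral agreement with it.
    """
    cases = list(inputs)
    if not cases:
        return 0.0  # nothing to compare against, so no reward
    matches = 0
    for case in cases:
        expected = reference(case)
        try:
            actual = candidate(case)
        except Exception:
            continue  # a crash counts as a mismatch
        if actual == expected:
            matches += 1
    return matches / len(cases)


# Toy stand-ins for "existing software" and "model output".
def reference_sort(xs):
    return sorted(xs)


def candidate_sort(xs):
    # Deliberately wrong: sorts in descending order.
    return sorted(xs, reverse=True)


cases = [[3, 1, 2], [1], [], [2, 2]]
print(replication_reward(reference_sort, candidate_sort, cases))  # 0.75
```

The buggy candidate still scores 0.75 because three of the four cases look the same in either order, so a reward like this is only as good as the test inputs behind it.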

The vague part is whether this will generalize to other, non-software domains.

> write a detailed spec on what it does

A much harder task than writing said software.