HN Mirror

	by mcbuilder 336 days ago
	This article is pure hype. All it seems to offer is the idea of "replication training", which amounts to a vague form of agentic distributed RL. Multi-agent distributed reinforcement learning algorithms have been in the literature for a while. For the current state of the art in agentic distributed RL, I suggest studying what DeepMind is doing.


janalsncm 335 days ago

I didn’t think it was vague. Given an existing piece of software, write a detailed spec on what it does and then reward the model for matching its performance.

The vague part is whether this will generalize to non-software domains.


intrasight 335 days ago

> write a detailed spec on what it does

A much harder task than writing said software
