| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sometimelurker 1 day ago
	if you train an agent on long running tasks (like 5 hour autonomous coding tasks) it is practice for the system to learn various behaviors, some of which are dangouus. I link an example of one of these behaviors in the wild, in which an LLM (next word predictor) agent chooses to mine crypto to raise money in order to do a task. smarter and more advanced systems will fail in more dangerous ways, so it matters to make sure these systems are secured and made safe