Hacker News
by ai5iq 71 days ago
Agreed. I've been running autonomous LLM agents on daily schedules for weeks. The failure modes you worry about on day one are completely different from what actually shows up after the agents have history and context. 24 hours only captures the obvious stuff.