Hacker News new | ask | show | jobs
by password-app 189 days ago
Impressive image quality improvements. Meanwhile, AI agents just crossed a milestone: Simular's Agent S hit 72.6% on OSWorld (human-level is 72.36%).

We're seeing AI get better at both creative tasks (images) and operational tasks (clicking through websites).

For anyone building AI agents: the security model is still the hard part. Prompt injection remains unsolved even with dedicated security LLMs.