Hacker News new | ask | show | jobs
by tellarin 641 days ago
We've built Cradle, a framework that leverages such models to perform complex computer tasks via the same general interface humans use: screen as input and keyboard & mouse operations as output.

It works both on regular software and in complex games like RDR2. And it doesn't cheat by using any game-/software-specific API, nor accessibility calls, nor DOM trees. :)

https://baai-agents.github.io/Cradle/

And we're still evolving it.