Ask HN: Why is it taking so long to build computer controlling agents?
10 points
by louisfialho
486 days ago
I'm not a PhD, but I assume training computer-controlling agents is a straightforward problem: we can define clear tasks (e.g. schedule an appointment with details xyz, or buy product xyz) on real or generated websites, then let the models figure out where to click (through a VLM) and learn through RL. What am I missing? Why isn't this a solved problem by now?
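To make the setup described above concrete, here is a minimal sketch of the "define a task, let the agent click, learn from reward" loop. Everything in it is a hypothetical stand-in: `ToyPage` replaces a real or generated website, a lookup table of per-button values replaces the VLM, and an epsilon-greedy bandit update replaces a full RL algorithm. It only illustrates the shape of the loop, not how any real system is trained.

```python
import random


class ToyPage:
    """Hypothetical stand-in for a website: a row of buttons, one of which completes the task."""

    def __init__(self, buttons, goal):
        self.buttons = buttons
        self.goal = goal

    def click(self, index):
        # Reward is 1.0 only when the clicked button completes the task.
        return 1.0 if self.buttons[index] == self.goal else 0.0


def train(page, episodes=500, epsilon=0.1, lr=0.2, seed=0):
    """Epsilon-greedy bandit: learn an estimated reward for each button."""
    rng = random.Random(seed)
    values = [0.0] * len(page.buttons)
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.randrange(len(values))  # explore: click a random button
        else:
            action = max(range(len(values)), key=values.__getitem__)  # exploit
        reward = page.click(action)
        # Move the estimate for the clicked button toward the observed reward.
        values[action] += lr * (reward - values[action])
    return values


page = ToyPage(["Cancel", "Help", "Book appointment", "Log out"], goal="Book appointment")
values = train(page)
best = max(range(len(values)), key=values.__getitem__)
print(page.buttons[best])
```

The toy works because the task is a single click with an immediate reward on a page that never changes. The tasks in the question (booking an appointment, buying a product) take many steps on pages that differ from site to site, which is where a table like `values` would have to give way to a model that reads the screen.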
https://youtube.com/watch?v=shnW3VerkiM
https://youtube.com/watch?v=VQhS6Uh4-sI
The first one looks more impressive. The second one is more reliable.
I think the real hard part is nobody wants to maintain these, and nobody really wants to pay to use them either. It's a lot of work and not something people do for free. It's no surprise these emerged (and won) in hackathons.
All the major operating system vendors are putting their full effort into this, so it doesn't make much sense to raise money to build one.