by suchintan
831 days ago
Love this video

> self-operating-computer

This is quite different from https://github.com/OthersideAI/self-operating-computer

Self-operating-computer uses pixel mapping to control your computer. This is a very good approach, but it's extremely unreliable. GPT-4V frequently hallucinates pixel outputs, causing it to miss interactions or get stuck in failure loops.

> The approach by AI Jason

AI Jason is using image-only methods to interact with the browser. This is a great first step, but this approach tends to be rife with hallucinations and errors.

We do DOM parsing in addition to image analysis to help GPT-4V correlate information in the image with the interactable elements in the DOM. This dramatically boosts its ability to perform the same task over and over again reliably (which proved impossible with the image-only approach).
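To make the DOM-parsing idea concrete, here is a rough sketch of one way the interactable elements could be pulled out of a page and numbered, so that a text list can be sent to the model alongside the screenshot. This is not the actual implementation discussed above; the tag list, the function names (`extract_interactables`, `describe_for_prompt`), and the idea of having the model pick an element ID instead of pixel coordinates are all assumptions made for illustration. It only uses Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as "interactable"; a real system would
# likely also consider ARIA roles, click handlers, and visibility.
INTERACTABLE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # tags that never have a closing tag or inner text


class InteractableCollector(HTMLParser):
    """Collects interactable elements and gives each one a numeric ID."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # interactable element whose text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTABLE_TAGS:
            return
        element = {
            "id": len(self.elements),
            "tag": tag,
            "attrs": dict(attrs),
            "text": "",
        }
        self.elements.append(element)
        self._open = None if tag in VOID_TAGS else element

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data


def extract_interactables(html):
    """Return a list of dicts describing the interactable elements in html."""
    collector = InteractableCollector()
    collector.feed(html)
    collector.close()
    for element in collector.elements:
        # Collapse runs of whitespace inside the visible text.
        element["text"] = " ".join(element["text"].split())
    return collector.elements


def describe_for_prompt(elements):
    """Format elements as numbered lines to send next to a screenshot."""
    lines = []
    for element in elements:
        attrs = element["attrs"]
        label = element["text"] or attrs.get("placeholder") or attrs.get("name") or ""
        lines.append(f'[{element["id"]}] <{element["tag"]}> {label}'.rstrip())
    return "\n".join(lines)
```

With a list like this in the prompt, the model could be asked to answer with an element ID rather than a pixel position, and the ID can then be mapped back to a real DOM node. That mapping step is where a hallucinated target would be caught, since an ID that is not in the list can simply be rejected.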
interesting concept for problem solving though. congrats!