by wruza
1053 days ago
> if you encode the DOM as a list of options for ChatGPT to choose from

Not sure if I understand this. Does it mean you have to pre-cook the DOM in a specific way? If yes, then isn't the answer to my question "no", as in "no, it can't take any site and use it as is"?
So if you assume that you start on google.com, then your options are like:
1.) Input with name "search", placeholder "search anything", value ""
2.) Button with label "I'm feeling lucky"
3.) Button with label "search"
Obviously, doing just one of these doesn't achieve the objective - it just needs to pick which one it thinks has the most "value" for completing the objective. If you repeat that enough times, then it can actually do what your overall goal of the session was.
I'm just giving a simplistic answer, and if you implemented only what I've written, then it's going to get stuck in a loop more often than not. But that's the gist of how you could encode the DOM into something that GPT can interpret and make decisions/take actions based on.
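To make that concrete, here is a rough Python sketch of one step of the idea: turn the interactive elements into a numbered list, ask the model for a number, and map the answer back to an element. Everything named here (`Element`, `encode_options`, `build_prompt`, `parse_choice`, `choose`, and the `ask_model` callback) is made up for illustration, and the model call is a stand-in lambda rather than a real API.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Element:
    kind: str    # e.g. "Input" or "Button"
    attrs: dict  # the attributes worth showing to the model


def encode_options(elements: List[Element]) -> str:
    """Render interactive elements as a numbered list of options."""
    lines = []
    for i, el in enumerate(elements, start=1):
        desc = ", ".join(f'{k} "{v}"' for k, v in el.attrs.items())
        lines.append(f"{i}.) {el.kind} with {desc}")
    return "\n".join(lines)


def build_prompt(objective: str, elements: List[Element]) -> str:
    """Combine the objective and the options into one prompt (hypothetical wording)."""
    return (
        f"Objective: {objective}\n"
        "Pick the option most useful for the objective.\n"
        f"{encode_options(elements)}\n"
        "Answer with the option number only."
    )


def parse_choice(reply: str, n_options: int) -> Optional[int]:
    """Return a 0-based index, or None if the reply has no valid option number."""
    match = re.search(r"\d+", reply)
    if not match:
        return None
    choice = int(match.group())
    return choice - 1 if 1 <= choice <= n_options else None


def choose(
    objective: str,
    elements: List[Element],
    ask_model: Callable[[str], str],
) -> Optional[Element]:
    """Run one decision step: prompt the model and map its answer to an element."""
    reply = ask_model(build_prompt(objective, elements))
    idx = parse_choice(reply, len(elements))
    return None if idx is None else elements[idx]


# The google.com example from above, written out by hand (an assumption;
# a real agent would extract these from the live DOM).
google_home = [
    Element("Input", {"name": "search", "placeholder": "search anything", "value": ""}),
    Element("Button", {"label": "I'm feeling lucky"}),
    Element("Button", {"label": "search"}),
]

# Stand-in for a real model call; it always answers "1".
picked = choose("search for cat pictures", google_home, lambda prompt: "1")
print(encode_options(google_home))
print(picked.kind)
```

A full session would perform the chosen action, re-extract the elements from the new page, and call `choose` again. This sketch has no memory of earlier steps, so it would have exactly the looping problem mentioned above.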