by kevinstubbs
1050 days ago
> can they really look at a DOM tree and tell what it is/does

Yes, if you encode the DOM as a list of options for ChatGPT to choose from. In fact, I developed a proof of concept of this for a client, https://jarvys.ai/, although they seem to have pivoted from automating just the browser to automating all software.
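To make "encode the DOM as a list of options" concrete, here is a rough sketch of one way it could be done. This is not the code from the proof of concept; the choice of tags, the label heuristics, and the names `OptionCollector` and `dom_to_options` are all assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Assumption: only these tags count as things a user could act on.
INTERACTIVE = {"a", "button", "input", "select", "textarea"}
VOID = {"input"}  # no closing tag, so no inner text to collect


class OptionCollector(HTMLParser):
    """Collect interactive elements with a best-effort text label."""

    def __init__(self):
        super().__init__()
        self.options = []   # one dict per interactive element
        self._open = None   # element whose inner text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE:
            return
        a = dict(attrs)
        # Hypothetical label heuristic: first non-empty attribute wins.
        label = (a.get("aria-label") or a.get("placeholder")
                 or a.get("name") or a.get("value") or "")
        option = {"tag": tag, "label": label}
        self.options.append(option)
        if tag not in VOID:
            self._open = option

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None

    def handle_data(self, data):
        text = data.strip()
        if self._open is not None and text:
            self._open["label"] = (self._open["label"] + " " + text).strip()


def dom_to_options(html):
    """Return numbered lines that can be pasted into a prompt as choices."""
    collector = OptionCollector()
    collector.feed(html)
    return [
        f"{i}. <{o['tag']}> {o['label'] or '(no label)'}"
        for i, o in enumerate(collector.options, 1)
    ]


page = ('<div><a href="/login">Log in</a>'
        '<input name="q" placeholder="Search"><button>Go</button></div>')
print("\n".join(dom_to_options(page)))
```

The model then only has to answer with a number, which you map back to the element yourself, instead of reasoning over the raw tree.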
A good example would be a misguided attempt at laying out a bunch of labels with aligned values. Someone told this poor developer that <table> is bad, so they figure hey, let's use CSS to lay it out. They make a dictionary of the key/value pairs, output all the keys into the first div, and then output all the values into the second div.
div - label 1 - label 2
div - value 1 - value 2
If there are 100 key/value pairs, it's going to be hard for a human to figure out which value goes with the 76th item, and LLMs have proven to be very bad at indexing problems like that, so I wouldn't expect a better story there.
(Not saying this wouldn't work in some cases, just couldn't be a general solution given the crap out there)
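For what it's worth, here is a small sketch of that layout and of what it takes to recover the pairs. The `<span>` children and the names `ColumnCollector`, `pair_columns`, and `build_layout` are assumptions for illustration. The only thing linking a label to its value is position, so the code has to count, which is exactly the step I'd expect an LLM reading the raw markup to fumble.

```python
from html.parser import HTMLParser


class ColumnCollector(HTMLParser):
    """Collect the text of each <span>, grouped by its enclosing <div>."""

    def __init__(self):
        super().__init__()
        self.columns = []      # one list of strings per <div>
        self._in_span = False

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.columns.append([])
        elif tag == "span":
            self._in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_span = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_span and self.columns and text:
            self.columns[-1].append(text)


def pair_columns(html):
    """Rebuild label -> value pairs purely from position in each column."""
    collector = ColumnCollector()
    collector.feed(html)
    labels, values = collector.columns  # assumes exactly two divs
    if len(labels) != len(values):
        raise ValueError("columns differ in length; pairing is ambiguous")
    return dict(zip(labels, values))


def build_layout(n):
    """Generate the two-div layout described above with n entries."""
    labels = "".join(f"<span>label {i}</span>" for i in range(1, n + 1))
    values = "".join(f"<span>value {i}</span>" for i in range(1, n + 1))
    return f"<div>{labels}</div><div>{values}</div>"


pairs = pair_columns(build_layout(100))
print(pairs["label 76"])
```

With a <table>, or any markup that keeps each label next to its value, none of this index bookkeeping would be needed.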