by suchintan
600 days ago
Agreed. In the short term (X months) I expect HTML distillation plus giving text to LLMs to win out, but in the long term (Y years) screenshot-only, pixel-based input will definitely be the more "scalable" approach.

One very subtle advantage of doing HTML analysis is that you can cut out a decent number of LLM calls by doing static analysis of the page. For example, you don't need to click on a dropdown to understand the options behind it, or scroll down a page to find a button to click.

Certainly, as LLMs get cheaper, the extra LLM calls will matter less (similar to what we're seeing with solar panels, where the cost of the panel is now lower than the cost of labour, the reverse of the preceding decade).
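To make the dropdown point concrete, here is a minimal sketch of the kind of static analysis I mean, using only Python's standard-library `html.parser`. The names `SelectOptionParser`, `dropdown_options` and the sample `PAGE` are made up for illustration, and it only handles native `<select>` elements. Custom JavaScript dropdowns would need more work.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect the options of every <select> without rendering or clicking."""

    def __init__(self):
        super().__init__()
        self.options = {}        # select name/id -> list of (value, label)
        self._select = None      # key of the <select> currently open
        self._in_option = False  # True while inside an <option>
        self._value = None       # value attribute of the current <option>
        self._label = []         # text chunks of the current <option>

    def _flush_option(self):
        # Store the option being read, if any, then reset the option state.
        if self._in_option:
            label = "".join(self._label).strip()
            # Per HTML, an option with no value attribute uses its text.
            value = self._value if self._value is not None else label
            self.options[self._select].append((value, label))
        self._in_option = False
        self._value = None
        self._label = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "select":
            self._select = attr.get("name") or attr.get("id") or "unnamed"
            self.options.setdefault(self._select, [])
        elif tag == "option" and self._select is not None:
            self._flush_option()  # </option> may be omitted in HTML
            self._in_option = True
            self._value = attr.get("value")

    def handle_data(self, data):
        if self._in_option:
            self._label.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._select is not None:
            self._flush_option()
        elif tag == "select":
            if self._select is not None:
                self._flush_option()
            self._select = None


def dropdown_options(html):
    """Return {select name: [(value, label), ...]} for the given HTML."""
    parser = SelectOptionParser()
    parser.feed(html)
    parser.close()
    return parser.options


# Hypothetical page fragment, for illustration only.
PAGE = """
<form>
  <select name="country">
    <option value="us">United States</option>
    <option value="ca">Canada</option>
  </select>
</form>
"""

if __name__ == "__main__":
    print(dropdown_options(PAGE))
```

The options come straight out of the markup, so the agent can pick a value without spending an LLM call (or a click) just to reveal the list.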