|
|
|
|
|
by klodolph
269 days ago
|
|
This does not make any sense to me. Can you elaborate on this? It seems “obvious” to me that if you have a tool which can request a web page, you can make it so that this tool extracts the main content from the page’s HTML. Maybe there is something I’m missing here that makes this more difficult for LLMs, because before we had LLMs, this was considered an easy problem. It is surprising to me that the addition of LLMs has made this previously easy, efficient solution somehow unviable or inefficient. I think we should also assume here that the web site is designed to be scraped this way—if you don’t, then “Accept: text/markdown” won’t work. |
|