| HN Mirror



	by geuis 774 days ago
	That's an interesting take. I've been experimenting with stripping the rendered HTML down to just structure and content, then having the LLM extract what I need from that. It works quite well, but I think your approach might be more efficient and faster.
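	For anyone curious what "just structure and content" can look like, here is a minimal sketch using only Python's standard library. The tag lists, the `Skeleton` class, and the `reduce_html` helper are all assumptions made for illustration, not the exact pipeline described above.

```python
import re
from html.parser import HTMLParser

# Tags whose whole subtree is dropped (an assumed list, tune to taste).
SKIP_TAGS = {"script", "style", "noscript", "svg", "template"}
# Void elements carry no text and have no closing tag, so they are dropped too.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Skeleton(HTMLParser):
    """Collects bare tags (no attributes) and visible text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped subtree

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag not in VOID_TAGS:
            self.out.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            # Collapse whitespace runs; entities are already decoded and
            # are not re-escaped, which is fine for LLM input.
            self.out.append(re.sub(r"\s+", " ", data))


def reduce_html(html: str) -> str:
    parser = Skeleton()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

	Comments are dropped as well, since the parser's default comment handler does nothing. The result is not guaranteed to be well-formed HTML, only a smaller approximation of the page to hand to the model.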

1 comment

nodoodles 773 days ago

One fun mechanism I've been using to reduce HTML size is diffing (with some leniency) pages from the same domain to exclude the common parts (i.e. headers/footers). That preprocessing can be useful for any parsing mechanism.
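
A rough sketch of the idea, assuming the pages have already been reduced to one block of text or markup per line. The `strip_common` helper and the notion of "leniency" used here (ignore case, whitespace, and digits) are illustrative assumptions, not necessarily what I run in practice.

```python
import difflib
import re


def _norm(line: str) -> str:
    # Leniency (assumed): ignore case, whitespace runs, and digits so that
    # things like "Copyright 2023" and "Copyright 2024" still count as equal.
    return re.sub(r"\d+", "0", " ".join(line.lower().split()))


def strip_common(target: str, sibling: str, min_run: int = 1) -> str:
    """Remove lines of `target` that also appear, in order, in `sibling`."""
    t_lines = [l for l in target.splitlines() if l.strip()]
    s_lines = [l for l in sibling.splitlines() if l.strip()]
    matcher = difflib.SequenceMatcher(
        None,
        [_norm(l) for l in t_lines],
        [_norm(l) for l in s_lines],
        autojunk=False,  # keep repeated lines eligible for matching
    )
    shared = set()
    for block in matcher.get_matching_blocks():
        # min_run > 1 keeps short accidental matches inside the content.
        if block.size >= min_run:
            shared.update(range(block.a, block.a + block.size))
    return "\n".join(l for i, l in enumerate(t_lines) if i not in shared)
```

With two pages from the same site that share a header and footer, only the page-specific lines survive. Comparing against several sibling pages and raising `min_run` should cut down on false positives, at the cost of more work per page.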
