by RetroTechie 406 days ago
I'd take just a 'scrape' of the text content of an article. Some sites take ridiculously long to load, don't work or display anything at all, or cause MBs of traffic for a few paragraphs of text (if that).

Separate that small % of content from the big % of overhead, for (almost) any site (news sites, in particular), and you have a winner.
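
For illustration only, here is a rough sketch of that kind of text-only 'scrape', using nothing but Python's standard-library `html.parser`. The names `SKIP_TAGS`, `ParagraphExtractor` and `extract_text` are made up for this example, and the rule "keep `<p>` text, drop script/style/nav and friends" is an assumed heuristic, not how any particular tool does it. Real news pages are messier than this.

```python
from html.parser import HTMLParser

# Elements treated as page overhead rather than article text.
# This set is an assumption for the sketch, not a complete list.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class ParagraphExtractor(HTMLParser):
    """Collects the text of <p> elements that are not inside a skipped element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0        # > 0 while inside any SKIP_TAGS element
        self.in_paragraph = False
        self.current = []          # text chunks of the paragraph being read
        self.paragraphs = []       # finished paragraphs, in document order

    def _flush(self):
        # Join the chunks and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []
        self.in_paragraph = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            if self.in_paragraph:  # previous <p> was never closed
                self._flush()
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_paragraph:
            self._flush()

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.current.append(data)


def extract_text(html: str) -> str:
    """Return the paragraph text of an HTML page, one blank line between paragraphs."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    if parser.in_paragraph:        # document ended inside an unclosed <p>
        parser._flush()
    return "\n\n".join(parser.paragraphs)
```

Feeding it a page's HTML as a string gives back only the paragraph text; fetching the page and handling sites that render everything with JavaScript are left out on purpose.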

1 comments

This was an issue I ran into a lot, and it’s why I eventually settled on a design similar to what you’re describing. Even something seemingly simple, like deciding what the title or the main picture is, is ridiculously hard with scraping alone, so I have to pass most of the data through AI analysis and then generate the details, summary, and backstories. Thanks for checking it out.