by vasinov 1147 days ago
The WebScraper tool uses Trafilatura [1] to scrape and parse HTML—nothing too fancy. "Scraping" a React site would require a totally different approach, probably something more akin to Adept's ACT-1 [2].

I run a local chat app built with Griptape and I use it to give me summaries of web pages or answer specific questions all the time :)

1. https://github.com/adbar/trafilatura/

2. https://www.adept.ai/blog/act-1


Do you see an advantage to Trafilatura compared to, say, BeautifulSoup and other scraping packages?
I think BeautifulSoup is great, but it's more of a set of building blocks that requires developers to implement their own custom scraping logic. Trafilatura is awesome because it just works out of the box for most common tasks related to web scraping.
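
To make the contrast a bit more concrete, here's a rough sketch of the same job done both ways. The sample HTML, the function names, and the list of tags to drop are all made up for illustration; this isn't how Griptape's WebScraper is implemented. It assumes `trafilatura` and `beautifulsoup4` are installed, and the imports sit inside the functions so you can try either half on its own.

```python
from typing import List, Optional

# Hypothetical sample page, invented for this example.
SAMPLE_HTML = (
    "<html><head><title>Demo</title><style>p{color:red}</style></head>"
    "<body><nav><p>Home | About</p></nav>"
    "<article><h1>Title</h1>"
    "<p>First paragraph of the article.</p>"
    "<p>Second <b>paragraph</b> here.</p></article>"
    "<footer><p>Copyright</p></footer></body></html>"
)


def main_text_with_trafilatura(html: str) -> Optional[str]:
    """One call: Trafilatura decides what the main content is.

    Returns None when it finds nothing it considers worth extracting,
    which can happen on very short pages like the sample above.
    """
    import trafilatura  # third-party: pip install trafilatura

    return trafilatura.extract(html)


def paragraphs_with_bs4(html: str) -> List[str]:
    """Building blocks: we choose what counts as boilerplate ourselves."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    # Assumption: these tags are noise on this page. Real sites need
    # their own rules, which is the custom logic mentioned above.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    return [p.get_text(" ", strip=True) for p in soup.find_all("p")]
```

For a live page you'd typically pass the result of `trafilatura.fetch_url(url)` to `trafilatura.extract`, whereas with BeautifulSoup you'd also bring your own HTTP client.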
Ah that’s nice to know, thank you.