by BugWatch 1608 days ago
""" The server side scraping can also do some more heavy lifting - such as store the entire page contents in the database. This enables full text search, the ability to re-crawl URLs and more. """

For this part, interoperating with the ArchiveBox project (https://archivebox.io/) would be ideal.
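
To make the quoted idea concrete, here is a rough sketch of storing page text in a database so it can be searched and re-crawled. It is not ArchiveBox's API and not the article's implementation; the table names and the helpers `open_store`, `save_page`, and `search` are made up for illustration. It uses only Python's standard `sqlite3` module with a naive word index rather than a real full-text engine.

```python
import re
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny page store. Schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    # Naive inverted index: one row per (word, url) pair.
    db.execute(
        "CREATE TABLE IF NOT EXISTS terms ("
        "term TEXT NOT NULL, url TEXT NOT NULL, PRIMARY KEY (term, url))"
    )
    return db


def tokenize(text):
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def save_page(db, url, body):
    """Store a page's text; calling it again for the same URL acts as a re-crawl."""
    db.execute("DELETE FROM terms WHERE url = ?", (url,))
    db.execute(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", (url, body)
    )
    db.executemany(
        "INSERT INTO terms (term, url) VALUES (?, ?)",
        [(term, url) for term in tokenize(body)],
    )
    db.commit()


def search(db, query):
    """Return URLs (sorted) whose stored text contains every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    rows = db.execute(
        f"SELECT url FROM terms WHERE term IN ({placeholders}) "
        "GROUP BY url HAVING COUNT(*) = ? ORDER BY url",
        (*words, len(words)),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_store()
    save_page(db, "https://example.com/a", "Full text search with SQLite")
    print(search(db, "sqlite search"))
```

A real setup would more likely hand the archiving to a dedicated tool and use a proper full-text index; the point here is only that keeping the page body server-side is what makes both search and re-crawling possible.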