by ers35 3288 days ago
See also: A dump of the stories, comments, and users from the Firebase API as a SQLite database with a full text search index: https://archive.org/details/hackernews-2017-05-18.db

Can you tell us the time period and the (estimated) % of comments covered, for your DB and for the dump posted?

Thanks.

My DB covers from https://hacker-news.firebaseio.com/v0/item/1.json?print=pret... (10/09/2006 at 6:21pm UTC) to https://hacker-news.firebaseio.com/v0/item/14372035.json?pri... (05/18/2017 at 11:58pm UTC)
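For anyone who wants to check a range like this themselves, here is a minimal sketch of building an item URL and converting an item's Unix `time` field to UTC. The helper names `item_url` and `to_utc` are made up for illustration, and the timestamp 1160418111 is an assumed example value chosen to match the date quoted above for item 1; no network request is made.

```python
from datetime import datetime, timezone

# Base path taken from the URLs quoted in the comment above.
BASE = "https://hacker-news.firebaseio.com/v0/item"


def item_url(item_id: int) -> str:
    """Build the JSON URL for one item id (hypothetical helper)."""
    return f"{BASE}/{item_id}.json"


def to_utc(unix_time: int) -> datetime:
    """Convert an item's Unix `time` value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_time, tz=timezone.utc)


# Assumed example timestamp, consistent with the date quoted for item 1.
first_seen = to_utc(1160418111)
print(item_url(1))
print(first_seen.strftime("%m/%d/%Y %H:%M UTC"))
```

Fetching the first and last ids and converting their `time` fields this way should give the two endpoints of the covered period.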

The dump posted claims to cover from 1 to 14566367 (06/16/2017 at 3:03am UTC)

Ah, OK, forgot there's the item id for comments, ensuring 100% comment coverage.

(I read "the story vote count is inaccurate for certain stories because it is only scraped once and not updated" and thought some comments might be left out too.)

So, about 14.5M items at a minimum of 10 seconds per comment is at least 40k hours' worth of writing, probably one order of magnitude more. That's just writing time; reading is maybe 3 orders of magnitude more.
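The back-of-the-envelope figure can be checked in a few lines. This is a rough sketch that assumes the highest item id quoted for the posted dump (14566367) as an upper bound on the number of comments, since ids also cover stories and other items; the function name `writing_hours` is made up for illustration.

```python
MAX_ITEM_ID = 14_566_367   # highest id quoted for the posted dump (upper bound on comments)
SECONDS_PER_COMMENT = 10   # the assumed minimum writing time per comment


def writing_hours(items: int, seconds_each: float) -> float:
    """Total writing time in hours for `items` comments at `seconds_each` seconds apiece."""
    return items * seconds_each / 3600


hours = writing_hours(MAX_ITEM_ID, SECONDS_PER_COMMENT)
print(f"{hours:,.0f} hours at {SECONDS_PER_COMMENT} s per comment")
print(f"{hours * 10:,.0f} hours if the real figure is one order of magnitude higher")
```

At 10 seconds each this lands a little above 40k hours, so the lower bound holds for roughly 14.5M items.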

Modern pyramids, they're impalpable ...