by giantrobot 1648 days ago
Besides their Archive Warrior distributed crawler, I imagine PushShift[0] is probably a starting point for them.

[0] https://files.pushshift.io/