by bcleary 4772 days ago
No, just the post bodies (questions and answers) at the moment, but we are working on parsing the comments and the post history; those datasets are about 4 times the size of the posts! So there are probably a lot more URLs in there, although we will have to decide whether to treat all URLs the same or to differentiate between URLs in the post bodies contained in the dump, URLs in the post history (which may since have been removed from the post) and URLs in the comments. Maybe not a concern for the website, but more so for research.
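
To make the "same or differentiated" question concrete, here is a rough Python sketch of one way it could go. It is not how the site actually does it. The source labels (`post_body`, `post_history`, `comment`), the function names and the simple URL pattern are all made up for illustration. The idea is to count each URL per source first, so that merging everything into one total stays an optional later step.

```python
import re
from collections import Counter

# Deliberately simple pattern (an assumption): http/https up to whitespace,
# an angle bracket or a double quote. Real dump text may need an HTML parser.
URL_RE = re.compile(r'https?://[^\s<>"]+')

def count_urls_by_source(records):
    """Count URLs per (source, url) pair.

    `records` is an iterable of (source, text) tuples; the source labels
    are hypothetical, e.g. 'post_body', 'post_history', 'comment'.
    """
    counts = Counter()
    for source, text in records:
        for url in URL_RE.findall(text):
            # Drop trailing sentence punctuation picked up by the pattern.
            counts[(source, url.rstrip(".,;:!?)"))] += 1
    return counts

def merge_sources(counts):
    """Collapse per-source counts into one total per URL."""
    merged = Counter()
    for (_source, url), n in counts.items():
        merged[url] += n
    return merged

# Invented sample data, just to show the shape of the input.
records = [
    ("post_body", "See http://example.com/a and http://example.com/b."),
    ("comment", "Also covered at http://example.com/a"),
]
by_source = count_urls_by_source(records)
totals = merge_sources(by_source)
```

Keeping the source in the key means the website can show the merged totals while research can still split the same counts by where each link appeared.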
1 comment

You usually see notes or clarifications posted to questions as comments first, and the links there tend to be more introductory and general-purpose than the specific ones you might find in answers.

Can't wait to see the updated stats.

If you could make "more" load more than just a few more records, though, that'd make it a lot easier to dig deeper.