by bcleary
4772 days ago
No, just the post bodies (questions and answers) at the moment, but we are working on parsing the comments and the post history. Those datasets are about 4 times the size of the posts, so there are probably a lot more URLs in there. We will have to decide whether to treat all URLs the same or to differentiate between URLs in post bodies contained in the dump, URLs in the post history (which may since have been removed from the post), and URLs in the comments. That is probably not a concern for the website, but it matters more for research.
|
Can't wait to see the updated stats.
If you could make "more" load more than just a few more records, though, that'd make it a lot easier to dig deeper.