Hacker News
by
joshpen188
3353 days ago
Why didn't you use Common Crawl instead?
1 comment
dor_jack_2
3353 days ago
For our purposes, Common Crawl's corpus was missing too many websites (possibly because of those sites' robots.txt configurations). We also needed deep coverage, which CC could not provide.