Hacker News
by achempion 565 days ago
Most of the content they crawl is SEO spam; I'm not sure it's that helpful for model training.
1 comment

SEO spam is the façade you get to see as a user. The gold is everything you don’t see. Just because they don’t show it on page one doesn’t mean it won’t be useful for training.