by dennisy 2226 days ago
I think you could get good results if you just penalise sites for the number of third-party JS scripts they load, which is a proxy for a more established site/corp.

You could add a bunch of heuristics such as page size, number of links, etc.
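
To make that concrete, here is a rough sketch of such a score in Python, using only the standard library. Everything named here (`PageStats`, `smallness_score`, the penalty weights) is made up for illustration and the weights are untuned guesses. "Third party" is approximated as "hostname differs from the page's hostname", so a site's own CDN subdomain would be counted too; comparing registrable domains would need a public suffix list.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageStats(HTMLParser):
    """Counts third-party <script src> tags and <a href> links on one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname or ""
        self.third_party_scripts = 0
        self.links = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            # Resolve relative and protocol-relative URLs against the page URL.
            host = urlparse(urljoin(self.page_url, attrs["src"])).hostname or ""
            # Assumption: any differing hostname counts as third party.
            if host != self.page_host:
                self.third_party_scripts += 1
        elif tag == "a" and attrs.get("href"):
            self.links += 1


def smallness_score(page_url, html, script_penalty=1.0,
                    kb_penalty=0.01, link_penalty=0.02):
    """Return (score, stats); a higher score means a 'smaller' page.

    The weights are arbitrary placeholders, not tuned values.
    """
    stats = PageStats(page_url)
    stats.feed(html)
    size_kb = len(html.encode("utf-8")) / 1024
    penalty = (script_penalty * stats.third_party_scripts
               + kb_penalty * size_kb
               + link_penalty * stats.links)
    return -penalty, stats
```

You would call `smallness_score(url, html)` on each fetched page and rank by the returned score. Inline scripts and scripts injected at runtime are not seen by this kind of static count.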

Maybe even train a classifier to select the “smaller” part of the web.