by dood 3570 days ago
Heh, classifying by "genre" is exactly what I was thinking of doing.

Had some debate with myself if I should start by focusing on training for shopping pages (product pages & product reviews) - because that might make some money; or start by training for forums - which I'd enjoy a lot more. Or build a more general system which would definitely never work and never get finished.

Google actually let you filter by "discussions" until a few years ago, so they certainly do this kind of classification. It didn't work perfectly but sometimes did the trick. Don't know why they removed that feature.

3 comments

Google removed it because they aim at the mass market.

Another perspective: people who find answers in forums are less likely to be interested in ads. And who knows, maybe making search shitty (in so many ways, not just forums) makes ad revenue rise?

IIRC I found that the easiest genres to train on were shops, forums, and porn. Another tricky bit was conceptual: genre and category sometimes overlap (e.g. porn). Anyway, I couldn't get it to yield proper results. But today we have things we didn't have back then, like Open Graph and schema.org tags, which give more semantic info.
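
For what it's worth, here is a rough sketch of how those tags could serve as a cheap genre hint before any trained classifier gets involved. It's Python, stdlib only. The two mapping tables, the `GenreHintParser` class, and `guess_genre` are my own made-up names and assumptions, not taken from any real system, and plenty of pages carry no such markup at all.

```python
from html.parser import HTMLParser

# Hypothetical mappings from tag values to coarse genres; purely illustrative.
OG_TYPE_TO_GENRE = {"product": "shop", "article": "article"}
SCHEMA_TYPE_TO_GENRE = {
    "Product": "shop",
    "Review": "shop",
    "DiscussionForumPosting": "forum",
}


class GenreHintParser(HTMLParser):
    """Collects the og:type meta value and any schema.org itemtype names."""

    def __init__(self):
        super().__init__()
        self.og_type = None
        self.schema_types = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:type":
            self.og_type = attrs.get("content")
        itemtype = attrs.get("itemtype")
        if itemtype:
            # "https://schema.org/Product" -> "Product"
            self.schema_types.append(itemtype.rstrip("/").rsplit("/", 1)[-1])


def guess_genre(html):
    """Return a coarse genre hint, or "unknown" when no tag matches."""
    parser = GenreHintParser()
    parser.feed(html)
    # Prefer schema.org microdata, then fall back to Open Graph.
    for schema_type in parser.schema_types:
        if schema_type in SCHEMA_TYPE_TO_GENRE:
            return SCHEMA_TYPE_TO_GENRE[schema_type]
    return OG_TYPE_TO_GENRE.get(parser.og_type, "unknown")
```

This only reads microdata `itemtype` attributes and the `og:type` meta tag; JSON-LD blocks are ignored, so treat the result as one weak feature among others rather than a classification.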
IIRC Gigablast had such a feature.