


	by nostrademons 5902 days ago
	Many machine-learning systems get bootstrapped by their implementer sitting at a website clicking "Like" and "Dislike" buttons for a large randomly-chosen sample of possible data. If this strikes you as incredibly boring, you can farm it out with Amazon Mechanical Turk or other crowdsourcing schemes. You could also do cleverer variants of this, like putting image-recognition or OCR training sets into CAPTCHAs, submitting possible links to Reddit or Digg, or hosting Internet surveys with the questions of interest.
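
A minimal sketch of the bootstrapping step described above, assuming the candidate data fits in a list and that some `ask` callback stands in for the person (or crowd worker) clicking "Like" or "Dislike". The names `sample_for_labeling`, `collect_labels`, and `ask` are hypothetical and only illustrate the idea.

```python
import random

def sample_for_labeling(items, n, seed=0):
    """Draw up to n distinct items uniformly at random for manual labeling."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    pool = list(items)
    return rng.sample(pool, min(n, len(pool)))

def collect_labels(sample, ask):
    """Build (item, label) pairs; ask(item) returns True for Like, False for Dislike."""
    return [(item, 1 if ask(item) else 0) for item in sample]

if __name__ == "__main__":
    candidates = [f"doc{i}" for i in range(100)]
    sample = sample_for_labeling(candidates, 10)
    # Stand-in for a human rater: "likes" items whose name ends in 0.
    training_set = collect_labels(sample, lambda item: item.endswith("0"))
    print(training_set)
```

Swapping the lambda for a function that shows the item to a person, or posts it to a crowdsourcing service, gives the labeled sample used as the initial training set.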


apurva 5902 days ago

But the whole idea is... everyone has their own notion of likes and dislikes. Am I missing something here?


nostrademons 5901 days ago

That's why recommendation systems are hard. :-)

You could try to identify a population of users whose likes and dislikes are expected to be "similar" to those of the user in question, though, and then base your training set on them. I believe that's how actual recommendation engines (e.g. Amazon, YouTube) work. Of course, then you have to figure out how to identify similar users, which is another hard problem.
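
One common way to make "similar users" concrete is user-based collaborative filtering. The sketch below is an illustration under stated assumptions, not a description of how Amazon or YouTube actually do it: ratings are +1 (like) or -1 (dislike), similarity is cosine similarity over the items two users have both rated, and the prediction is a similarity-weighted vote. The sample data and the names `similarity` and `predict` are made up for the example.

```python
from math import sqrt

# Hypothetical ratings: +1 = like, -1 = dislike; a missing key means unrated.
RATINGS = {
    "alice": {"a": 1, "b": 1, "c": -1},
    "bob": {"a": 1, "b": 1, "c": -1, "d": 1},
    "carol": {"a": -1, "b": -1, "c": 1, "d": -1},
}

def similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(ratings, user, item, k=2):
    """Similarity-weighted vote among the k most similar users who rated item.

    Returns a score in [-1, 1]; positive suggests the user would like the item.
    """
    neighbours = [
        (similarity(ratings[user], theirs), theirs[item])
        for other, theirs in ratings.items()
        if other != user and item in theirs
    ]
    # Rank by strength of agreement or disagreement, strongest first.
    neighbours.sort(key=lambda pair: abs(pair[0]), reverse=True)
    top = neighbours[:k]
    total = sum(abs(sim) for sim, _ in top)
    if total == 0:
        return 0.0
    return sum(sim * rating for sim, rating in top) / total

if __name__ == "__main__":
    # alice agrees with bob and disagrees with carol on every shared item,
    # so bob's like and carol's dislike of "d" both point toward a like.
    print(predict(RATINGS, "alice", "d"))
```

Letting strongly dissimilar users count as negative evidence is a design choice; many systems instead keep only positively correlated neighbours. Either way, the hard part noted above remains: the similarity measure is only as good as the overlap in what users have rated.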
