by jacquesm 3863 days ago
Surprising to see so much human input. Essentially, Google took a leaf out of Yahoo!'s playbook after first crushing them using just algorithms. Is this proof that beyond a certain degree of algorithmic extraction you need a human hand to advance, or is it evidence of a return to the old days of a more curated index? (Or are there other possibilities?)
3 comments

Google has been upfront about greater use of learning AI (as opposed to hand-tuned algorithms like PageRank) to index and rank their search results. In order to return a good result, a learning AI system must first be trained on what a good result looks like. Since search returns results to humans, humans have to provide the guidance on what is a good or bad result.
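To make "humans have to provide the guidance" concrete, here is a toy sketch of learning a ranking from rater preferences. It is an illustration only and not a description of Google's system: the two features (`keyword_match`, `spamminess`), the data, and the function names are all made up, and the update rule is a plain pairwise perceptron.

```python
def score(weights, features):
    """Linear relevance score for one page."""
    return sum(w * f for w, f in zip(weights, features))


def train_ranker(preferences, n_features, epochs=20, lr=0.1):
    """Learn weights from human judgments given as (better, worse) pairs."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in preferences:
            diff = [b - w for b, w in zip(better, worse)]
            margin = score(weights, diff)
            if margin <= 0:  # model disagrees with the rater: nudge weights
                weights = [wt + lr * d for wt, d in zip(weights, diff)]
    return weights


# Hypothetical features per page: [keyword_match, spamminess].
# Each pair says "a rater preferred the first page over the second".
preferences = [
    ([0.9, 0.1], [0.8, 0.9]),
    ([0.6, 0.2], [0.7, 0.8]),
    ([0.8, 0.0], [0.3, 0.1]),
]

weights = train_ranker(preferences, n_features=2)
for better, worse in preferences:
    print(score(weights, better) > score(weights, worse))
```

Nothing in the code says spam is bad; the negative weight on `spamminess` comes entirely from the raters' choices.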

Often an AI can be trained on a fixed corpus, like training a spelling corrector by feeding it the text of a dictionary. But the data that Google works with is the entire web, which changes all the time, so they need constant feedback on what is good or bad.
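For contrast, here is roughly what the corpus-trained case looks like: a minimal sketch of a spelling corrector that learns word frequencies from a block of text and picks the most frequent known word within one edit. The tiny corpus and the function names are placeholders of mine, and real correctors are far more elaborate.

```python
from collections import Counter
import re


def train(corpus_text):
    """Count how often each word appears in the training text."""
    return Counter(re.findall(r"[a-z]+", corpus_text.lower()))


def edits1(word):
    """All strings one delete, transpose, replace, or insert away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return the word itself if known, else the most frequent near match."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.get) if candidates else word


# Placeholder corpus; a real one would be a dictionary or a large text dump.
counts = train("the quick brown fox jumps over the lazy dog the end")
print(correct("teh", counts))
```

The corpus is the only supervision here, and it never changes. That is the part that does not carry over to ranking a web that is rewritten every day.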

Google also personalizes results now, so the AI needs to learn not just what is a good result, but what is a good result for that particular person, given everything else the AI knows about that person. That would require inputs from a wide variety of people.

I think it's proof that Google has done an amazing sales job.

Their ML algos really come down to codified, scaled human intuition and decisions. However, to avoid having to answer for them (Google has the ability to make or break many internet companies), they go on and on about ML and pretend it's just math, as if there were a first-principles answer, passed down from Gauss, to e.g. which candle company should rank highest.

The reason this is valuable to Google is that if they admit it is just collective human judgement, governments are much more likely to demand input.

I suspect this is also a fail-safe that lets them use learning algorithms whose results are not well understood but still perform well (think neural networks and deep learning).