by janalsncm
187 days ago
> It's actually insane the levels of understanding the algorithms that are responsible for serving us information have and how little we, the creators of said algorithms, understand what's going on in said algorithms.

As others have said, keyboard mismatches are common enough that Google might have built out logic for them specifically. But that's not necessary, and even "old school" search engines could learn these things.

The first time "alemwjsl" is searched you might not have any data, but the user will probably fix their keyboard layout and retype the query in Korean. That gives you a query correction mapping: if query1 (q1) yields no clicks and the user then updates it to query2 (q2), you can assume q1 is a synonym for q2 and serve the results for q2 instead.

Then, if one session contains the query "alemwjsl" and a click on midjourney.com, and another session contains "midj" and also a click on midjourney.com, those are co-clicked queries.

You can even start to represent queries by the words in their associated clicked documents, or vice versa. This helps get around the fact that people might search "how much superbowl tickets" and "superbowl tickets price" while the official page might not contain either of those strings.

Of course there are more advanced methods now (neural nets), but it's cool to see how it worked in the past.

https://www.kdd.org/kdd2016/papers/files/adf0361-yinA.pdf
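To make the two log-mining ideas above concrete, here is a rough Python sketch. It is not how any real search engine does it. The log format (a session as an ordered list of `(query, clicked_url_or_None)` pairs), the function names `mine_rewrites` and `co_clicked`, and the sample data are all made up for illustration. The Korean string is just what "alemwjsl" would produce on a Korean 2-set keyboard layout.

```python
from collections import defaultdict

# Hypothetical log format: each session is an ordered list of
# (query, clicked_url_or_None) events. The data is invented.
KOREAN_QUERY = "미드저니"  # what "alemwjsl" types on a Korean 2-set layout

sessions = [
    [("alemwjsl", None), (KOREAN_QUERY, "midjourney.com")],
    [("midj", "midjourney.com")],
    [("alemwjsl", "midjourney.com")],
]


def mine_rewrites(sessions):
    """Count (q1, q2) pairs where q1 got no click and the very next query q2 did."""
    counts = defaultdict(int)
    for events in sessions:
        # Walk consecutive query pairs within one session.
        for (q1, click1), (q2, click2) in zip(events, events[1:]):
            if click1 is None and click2 is not None and q1 != q2:
                counts[(q1, q2)] += 1
    return dict(counts)


def co_clicked(sessions):
    """Group queries by clicked URL; keep URLs reached from more than one query."""
    by_url = defaultdict(set)
    for events in sessions:
        for query, url in events:
            if url is not None:
                by_url[url].add(query)
    return {url: queries for url, queries in by_url.items() if len(queries) > 1}


rewrites = mine_rewrites(sessions)
groups = co_clicked(sessions)
```

In practice you would presumably only trust a rewrite or a co-click group after seeing it many times across different users, since a single session is very noisy.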