by imh 2931 days ago
Search and recommendation are so interesting to me. The goal is fuzzy. If I search "pho", then after the highly reviewed pho places, how should you decide between really poorly reviewed pho places and really highly reviewed ramen places?

If you just return the pho, it's doing what I asked for, but not showing me things I might be interested in. If the other pho places are really bad, maybe you aren't showing me things I'd actually rather eat.

If you return the ramen too, then you're kinda not doing what I asked, which is frustrating, but you might be showing me things I'd rather eat, given the shitty alternatives. When there's no mechanism for me to say "but really, I just want pho" it can be even more frustrating.

In this case there's a clear line between bad and matching vs good and related, but in many real queries there's not such a clear line. The smarter the algorithms get, paradoxically, the more frustrated I get by real world examples of this dilemma. I guess I don't have much to add, but it fascinates the hell out of me.

1 comments

I think the solution is providing results with more structure than just a list of options. Being able to return something like "here's the list of pho places, but they aren't very good. We also have some ramen places here that are pretty good if you are interested." Just having some sort of explanation of why I've gotten a bunch of ramen places in my results would go a long way toward easing that frustration. You could go even further and add some level of interactivity: letting the user respond with "sure, show me the ramen results" or "no, show me the pho results anyway" would be ideal.
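To make that concrete, here's a rough sketch of what "results with more structure" could look like. Everything in it is made up for illustration: the `Place` and `ResultGroup` types, the `RELATED` lookup, and the `good_rating` cutoff of 4.0 are all assumptions, and a real system would learn relatedness rather than hard-code it.

```python
from dataclasses import dataclass


@dataclass
class Place:
    name: str
    cuisine: str
    rating: float  # average review score on a 0-5 scale (assumed)


@dataclass
class ResultGroup:
    explanation: str  # human-readable reason this group is shown
    places: list


# Hypothetical table of cuisines treated as "related" to a query.
RELATED = {"pho": ["ramen"]}


def structured_search(query, places, good_rating=4.0):
    """Return labeled groups of results instead of one flat list."""
    by_rating = sorted(places, key=lambda p: p.rating, reverse=True)
    matches = [p for p in by_rating if p.cuisine == query]
    good = [p for p in matches if p.rating >= good_rating]
    weak = [p for p in matches if p.rating < good_rating]
    related = [
        p for p in by_rating
        if p.cuisine in RELATED.get(query, []) and p.rating >= good_rating
    ]

    groups = []
    if good:
        groups.append(ResultGroup(f"Well-reviewed {query} places", good))
    if weak:
        groups.append(ResultGroup(f"Other {query} places, but they aren't very good", weak))
    if related:
        groups.append(ResultGroup(f"Not {query}, but related and well reviewed", related))
    return groups


places = [
    Place("Pho A", "pho", 4.6),
    Place("Pho B", "pho", 2.1),
    Place("Ramen C", "ramen", 4.8),
    Place("Taco D", "tacos", 4.9),
]
groups = structured_search("pho", places)
for group in groups:
    print(group.explanation, [p.name for p in group.places])
```

Because each group carries its own explanation, the UI can show or collapse a group in response to "sure, show me the ramen results" or "no, show me the pho results anyway" without re-running the search.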

The difficulty is that a lot of these sorts of queries, especially when machine learning is involved, are pretty opaque. You can find the set of results that maximizes some metric over your feature space, but it's hard to explain why a particular result shows up there without resorting to a bunch of equations.

What do we want?

Context-aware natural language processing!

When do we want it?

When do we want what?