Hacker News
by g_sch 1232 days ago
Maybe I'm just a newbie when it comes to building search algorithms, but how on earth would you maintain or test something like this? From experience, I know that search algorithms using far fewer ranking factors than this are deployed in production and treated as terrifying black boxes that no one wants to touch lest they break something in an unexpected way.

Testing is easy with any half-decent development setup. You should have some train/eval datasets and monitor metrics on them during training; this is ML 101. Then run live A/B experiments for launch candidates.
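To make the offline half of that concrete, here is a rough sketch of an eval check, not anyone's actual pipeline. It assumes a held-out eval set where each query is a list of (features, relevance label) pairs, and uses NDCG as the metric. The names `mean_ndcg` and `passes_gate` and the tolerance value are made up for illustration.

```python
import math

def dcg(relevances, k):
    # Discounted cumulative gain over the top-k positions.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg(ranked_relevances, k=10):
    # Normalize by the DCG of the ideal (label-sorted) ordering.
    ideal = dcg(sorted(ranked_relevances, reverse=True), k)
    return dcg(ranked_relevances, k) / ideal if ideal > 0 else 0.0

def mean_ndcg(score_fn, eval_set, k=10):
    # eval_set: list of queries; each query is a list of (features, label) pairs.
    total = 0.0
    for docs in eval_set:
        ranked = sorted(docs, key=lambda d: score_fn(d[0]), reverse=True)
        total += ndcg([label for _, label in ranked], k)
    return total / len(eval_set)

def passes_gate(candidate_fn, baseline_fn, eval_set, tolerance=0.005):
    # Hypothetical launch gate: the candidate may not regress the offline
    # metric by more than `tolerance` relative to the current baseline.
    return mean_ndcg(candidate_fn, eval_set) >= mean_ndcg(baseline_fn, eval_set) - tolerance
```

A candidate that clears a gate like this would then go on to the live A/B experiment; the offline number only filters out obvious regressions.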

Maintenance sure is hell, though: lots of sweat and tears. Just glancing over search/formula/webcommon/select_ranking_models.cpp makes me cringe; they must have many dozens, if not hundreds, of different models in prod by now, each of them needing maintenance and lots of training data. I suspect work on new ranking factors must also be highly frustrating: throwing stuff at the wall^W catboost black box and seeing if it sticks, and if it doesn't, you'd have little idea why and little control over it. IMHO Google's approach (white-boxish, interpretable top-level ranking formulas) is far superior and more maintainable at scale.
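As a toy illustration of what "white-boxish" could mean in practice (this is not Google's or anyone's real formula), consider a top-level score that is a plain weighted sum of a few named signals. The signal names and weights below are invented; the point is only that each factor's contribution can be read off directly when a result looks wrong.

```python
# Hypothetical signals and weights, chosen purely for illustration.
WEIGHTS = {"text_match": 0.6, "link_authority": 0.3, "freshness": 0.1}

def explain_score(features, weights=WEIGHTS):
    # Return the final score together with each factor's contribution,
    # so a bad ranking can be traced back to a specific signal.
    contributions = {name: w * features.get(name, 0.0) for name, w in weights.items()}
    return sum(contributions.values()), contributions

score, parts = explain_score({"text_match": 1.0, "link_authority": 0.5})
for name, value in sorted(parts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f}")
print(f"total: {score:.2f}")
```

With a gradient-boosted ensemble you can still get feature attributions, but only through extra tooling, and you cannot simply adjust one weight and predict the effect.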