Hacker News new | ask | show | jobs
by alsodumb 1175 days ago
Awesome work and it's a great HN contribution! I loved your blog post, it gave me a gist of everything I wanted to know without reading the paper (I have back ground in bandits).

I have a couple of non-technical questions:

1. Can you share some background on how this work developed? I am guessing there were many attempts to improve PAM over the last three decades, right? And in hindsight, bandit-based approach seems like a natural approach to try, right? Did you start with trying to improve PAM and realize no one else thought of a probabilistic/random approach?

2. Once you realized multi-armed bandit approach is the way to go, did implementation of the idea and empirical evaluation take a lot of time? I am guessing most of the effort went to providing complexity guarantees, right?

3. The paper has an interesting set of authors from diverse areas - areas in which the k-medoid problems seems highly relevant. This was partly the reason why I asked question 1. - was the project motivated by the need of such an algorithm in application areas or what is by looking for an area to apply the insight that bandit based approaches can actually perform better.

Overall, I really like the life-cycle of the entire paper. It started with a highly relevant and practical problem, gave an intuitive algorithm that comes with complexity bounds, has an accessible blog post to support the paper, and has what seems to be a very efficient implementation that can directly be used in production at scale. A lot of researchers miss the last part and move on to the next project (I am guilty of that) - kudos to you for spending time on the implementation! If you ever end up at UIUC, I'd love to buy you a coffee (:

PS: I am a grad student at UIUC and was scrolling by and stopped as I saw two familiar names: Ilan (took Random processes with him and loved it) and of course who in robotics wouldn't know Prof. Thrun (for those who don't, his Probabilistic Robotics is a mandatory reference in every robotics class).

1 comments

Thank you for the positive feedback! Will definitely take you up on that coffee next time I visit Ilan :)

To answer your questions:

1. When looking at k-medoids algorithms, we realized that PAM hadn't been improved since the 80s! Other, faster algorithms had been developed but sacrificed solution quality (clustering loss) for speed. Simultaneously, prior work from my coauthors had recognized that the 1-medoid problem could be solved via randomized algorithms/multi-armed bandit -- much faster but returning the same solution. Our key insight was that every stage of PAM could be recast as a multi-armed bandit problem, and that reusing information across different stages would result in further speedups.

2. Actually, the complexity guarantees/theory were pretty easy because we were able to use proof techniques that are common in the multi-armed bandit literature. The hardest part was implementing it in C++ and making it available to both Python and R via bindings. For the original paper we did everything in Python and measured sample complexity, but to make the algorithm valuable for users we had to implement it in a more performant language.

3. To be honest, this project came about from a chance meeting at a conference (ICML 2019). I was randomly introduced to Martin Zhang (ironically I met him first at the conference even though we were both at Stanford). Martin and Ilan had deep expertise in multi-armed bandits/randomized algorithms (and solved the 1-medoid problem), and I got really interested in their work when talking with them, primarily because it seemed like a straightforward win and useful for a lot of people.