by indrex 582 days ago
At first glance this seems to be a run of the mill clustering project. Why is it on the HN front page? Am I missing something?
3 comments

Most of the old computational methods have been forgotten (or never learned) so folks here get excited when they are posted.

Artisanal AI.

Yeah, one of the first open-source recommendation engines I ever worked with was called Vogoo[1], and I believe it was based on k-means. This was back in 2008 or so?

For someone who had never been exposed to any of the math behind this kind of thing, it was an interesting implementation, and the source code was very readable.

The original website seems to be gone and I couldn't find a Git link so apologies for Sourceforge.

1: https://sourceforge.net/projects/vogoo/

I'm no expert in ML, but beyond the research and emerging work in unsupervised learning, clustering seems to be the most common approach, and there's been nothing conceptually new here in the past 10+ years. Don't get me wrong: computationally we can handle a ridiculous number of dimensions that was hard or impossible before, and there are new algorithms. But I was doing k-means, DBSCAN/HDBSCAN, and Gaussian mixtures in grad school 15 years ago in relation to databases. I had never heard of "Machine Learning", and we were in the glacial stage of the AI winter. There's some newer work that relies on human judgement for results, but clustering still seems to be the mainstream "data validated" approach...
I have a data set from the 1980s and the programs to apply these techniques to it, written in compiled BASIC (predates Microsoft QuickBASIC). They ran their analyses on run of the mill IBM PCs. Really wish I could get it all into a modern framework…

The authors published a respected book based on the data and used it as the foundation for a bunch of other applied research. Sometimes I wish I could resurrect those researchers, or at least have one of their ghosts stop by and see what we can do with computers today.

What’s wrong if it’s not new?

And by the way, neural networks aren’t new either.

Might have been k-nearest-neighbors rather than k-means. kNN can be used for "recommended because you bought X" or "users like you also bought X" type recommendations that relate user to user or item to item.
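
To make the item-to-item case concrete, here is a minimal sketch, assuming a toy purchase table and cosine similarity over sets of buyers. The data, the `purchases` dict, and the `nearest_items` helper are all made up for illustration; this is not how Vogoo or any particular library does it.

```python
import math

# Hypothetical purchase data: item -> set of users who bought it.
purchases = {
    "tent": {"ann", "bob", "cy"},
    "sleeping_bag": {"ann", "bob", "dee"},
    "stove": {"bob", "cy"},
    "novel": {"dee", "eve"},
}


def cosine(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def nearest_items(item, k=2):
    """Return the k items whose buyers overlap most with `item`'s buyers."""
    scores = [
        (cosine(purchases[item], buyers), other)
        for other, buyers in purchases.items()
        if other != item
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [other for _, other in scores[:k]]


# "Recommended because you bought a tent":
print(nearest_items("tent"))
```

Swapping the roles (user -> set of items) would give the "users like you" flavor with the same code.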

K-means could potentially help group similar users/items together if, e.g., you're memory constrained and don't want to give each user a fully unique embedding entry, so that's also possible.
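
A rough sketch of that grouping idea, assuming made-up 2-D user feature vectors and a bare-bones Lloyd's algorithm seeded with the first k points (a real system would use better seeding and a library implementation). The `kmeans` function and the `users` data are hypothetical.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; seeds centroids with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its closest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical user features, e.g. (spend on outdoor gear, spend on books).
users = [(9.0, 1.0), (1.0, 9.0), (8.0, 2.0), (2.0, 8.0), (9.5, 0.5), (0.5, 9.5)]
labels, centroids = kmeans(users, k=2)

# Each user stores only a cluster id; the k centroids act as shared embeddings.
for user, label in zip(users, labels):
    print(user, "-> cluster", label, "shared embedding", centroids[label])
```

Here six users share two embedding rows instead of six, which is the memory trade-off described above.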

You are correct! It was.

Thanks for the correction

I’m curious what new methods have replaced this type of analysis?
It's still widely used and can be validated without significant human judgement. Implementations are more efficient, and computational complexity is through the roof, but the approach is still legit.
Someone found it interesting. That's all it takes to qualify for HN.
I find this kind of thing interesting