Hacker News
by derrickrburns 886 days ago
AI has sparked new interest in high-dimensional embeddings for approximate nearest neighbor search. Here is a highly scalable implementation of a companion technique, k-means clustering, written in Scala for Spark 1.1.

Please let me know if you fork this library and update it to a later version of Spark.

1 comment

Just curious, have you actually profiled this against running on a single large-memory machine?