by jbellis 900 days ago
Nice to see people care about index construction time.

I'm the lead author of JVector, which scales linearly to at least 32 cores and may be the only graph-based vector index designed around nonblocking data structures (as opposed to using locks for thread safety): https://github.com/jbellis/jvector/

JVector looks to be about 2x as fast at indexing as Lantern, ingesting the Sift1M dataset in under 25s on a 32-core AWS box (m6i.16xl), compared to 50s for Lantern in the article.

(JVector is based on DiskANN, not HNSW, but the configuration parameters are similar -- both are configured with graph degree and search width.)

1 comment

Sift1M is too small to make meaningful comparisons. Storing 1M vectors of 128 float32 values takes only about 512 MB of memory.
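
For a back-of-the-envelope check on the raw vector size, here is a quick sketch. It assumes 4-byte float32 components and counts only the vectors themselves, not the graph or any metadata; the helper name `raw_vector_bytes` is mine, not from any library. Sift1M vectors have 128 dimensions; 96 is included for comparison.

```python
def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_component: int = 4) -> int:
    """Bytes needed for the raw vectors alone (assumes no index or metadata overhead)."""
    return num_vectors * dims * bytes_per_component


# 1M vectors at two common dimensionalities, float32 components.
for dims in (96, 128):
    size = raw_vector_bytes(1_000_000, dims)
    print(f"{dims} dims: {size / 1e6:.0f} MB")
# 96 dims: 384 MB
# 128 dims: 512 MB
```

Either way the whole dataset fits comfortably in RAM on any modern machine, which is the point about it being small.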