Hacker News
by hot_gril 752 days ago
How many dimensions are in the original vectors? Something in the millions?

1024 per vector x 41M vectors
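For a sense of scale, here is a rough back-of-the-envelope sketch of the raw vector payload. It assumes each component is stored as a 4-byte float32, which the thread does not state, and it ignores any per-row or index overhead.

```python
# Rough size of the raw vectors alone (no index, no per-row overhead).
DIMS = 1024                # dimensions per vector, from the thread
NUM_VECTORS = 41_000_000   # vector count, from the thread
BYTES_PER_COMPONENT = 4    # ASSUMPTION: float32 components


def raw_vector_bytes(dims, count, bytes_per_component=BYTES_PER_COMPONENT):
    """Bytes needed to hold `count` dense vectors of `dims` components each."""
    return dims * count * bytes_per_component


total = raw_vector_bytes(DIMS, NUM_VECTORS)
print(f"{total / 1e9:.1f} GB ({total / 2**30:.1f} GiB)")
```

Under that assumption the vectors come to roughly 168 GB before any index is built on top of them.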
1024-dim vectors would fit into pgvector in Postgres, which can do cosine similarity indexing and doesn't require everything to fit into memory. Wonder how the performance of that would compare to this.
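For reference, a minimal pure-Python sketch of cosine distance (1 minus cosine similarity), which as far as I know is the quantity a cosine index orders by. The function name is made up for illustration, and this is a brute-force pairwise computation, not how an index evaluates it.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Same direction gives 0, orthogonal gives 1, opposite gives 2.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))
```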
It's been a while since I read the pgvector source, but at the time it was a straightforward HNSW implementation that implicitly assumed your index fits in page cache. Once that's not true, your search performance will fall off a cliff, which in turn means your insert performance also hits a wall, since each new vector requires a search.

I haven't seen any news that indicates this has changed, but by all means give it a try!

Thanks, I didn't know that. Last time I was dealing with these kinds of problems, pgvector didn't exist yet.