by jxodwyer1 1180 days ago
Thank you so much for your questions!

- As a managed service, there is some overhead. We need to authenticate, validate, and parse the inputs, then fetch the index being queried, since we use that index’s model to generate the embeddings. If the index is fine-tuned or customized, we transform the embedding into the new vector space before calling our vector index. Finally, we fetch the metadata for the results from the DB and build the response to send back.

- We’ve only tested up to around 1M vectors of ~1,500 dimensions. More formal testing is required here, and we plan to do it. I’m particularly curious about pgvector and how it stacks up against other players, as keeping the data central is a significant upside. We started with these smaller vector indices to get something out there and iterate as a startup, but scalability is part of what we want long term.

- We see both; we’ve had to turn down a very early lead with 100M+ vectors because it would derail other engineering efforts while we were starting. We’re now much better positioned to tackle that challenge as we have all the foundations.

- We haven’t considered this, but it’s an excellent idea. We’re currently discussing this with the team.
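
To make the overheads in the first point concrete, here is a rough sketch of that request path with per-stage timing. Everything in it is hypothetical: the stage names, `handle_query`, the toy `embed` function, and the in-memory dicts standing in for the index store, vector index, and metadata DB. It illustrates the sequence of steps only, not the service's actual code.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Index:
    model: str
    # Hypothetical fine-tuning projection (row-major matrix), None if not customized.
    transform: Optional[list] = None


# In-memory stand-ins for the key store, index store, vector index, and metadata DB.
API_KEYS = {"secret"}
INDEXES = {"demo": Index(model="toy-embedder")}
VECTORS = {"doc-1": [1.0, 0.0], "doc-2": [0.0, 1.0]}
METADATA = {"doc-1": {"title": "first"}, "doc-2": {"title": "second"}}


def check_auth(api_key):
    if api_key not in API_KEYS:
        raise PermissionError("unknown API key")


def validate(text, k):
    if not text.strip() or k < 1:
        raise ValueError("empty query or non-positive k")


def embed(text, model):
    # Toy embedding (assumption): counts of 'a' and 'b'. A real model call goes here.
    return [float(text.count("a")), float(text.count("b"))]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def apply_transform(vec, matrix):
    # Map the embedding into the customized vector space, if there is one.
    if matrix is None:
        return vec
    return [dot(row, vec) for row in matrix]


def search(query_vec, k):
    # Brute-force dot-product ranking in place of a real vector index.
    ranked = sorted(VECTORS, key=lambda d: dot(query_vec, VECTORS[d]), reverse=True)
    return ranked[:k]


def handle_query(api_key, index_id, text, k=1):
    timings = {}

    def timed(stage, fn, *args):
        start = time.perf_counter()
        result = fn(*args)
        timings[stage] = (time.perf_counter() - start) * 1000.0  # milliseconds
        return result

    timed("auth", check_auth, api_key)
    timed("validate", validate, text, k)
    index = timed("fetch_index", INDEXES.__getitem__, index_id)
    vec = timed("embed", embed, text, index.model)
    vec = timed("transform", apply_transform, vec, index.transform)
    ids = timed("search", search, vec, k)
    results = timed("metadata", lambda: [{"id": d, **METADATA[d]} for d in ids])
    return results, timings


results, timings = handle_query("secret", "demo", "aaa b")
print(results)
for stage, ms in timings.items():
    print(f"{stage:>12}: {ms:.3f} ms")
```

Timing each stage separately like this shows how much of the end-to-end latency comes from the managed-service steps versus the vector search itself.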

We would love to chat more; we appreciate your questions and feedback. Always happy to riff with someone who has seen issues around these use cases, like yourself. Feel free to reach us at founders@getmetal.io!

1 comment

Thanks for your reply.

- I have a good sense of the overheads - I was more curious about the latencies (ms) you are observing with the system today.

- Out of curiosity, why did you pick Redis? Is it mostly due to familiarity and experience with it in the past? I'm curious if you foresee any challenges scaling to larger datasets due to the in-memory limitations.

- I'm assuming you're going with a usage-based model for large volumes of managed data? Do you support spinning down the service (moving things to cold storage) and auto-scaling back up when users actually search? I'm wondering how you're thinking about this, especially if customers don't use the APIs daily.

- For the 100M+ vectors, what type of data were they dealing with: documents, images, or something else?
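
For context on the in-memory question, here is a back-of-the-envelope estimate of raw vector storage. It assumes float32 values (4 bytes each) and 1,500 dimensions, and it ignores index structures, metadata, and replication, so real memory use would be higher. The helper name `raw_vector_bytes` is just for illustration.

```python
GIB = 1024 ** 3


def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    # Raw storage only: one float32 (assumed) per dimension per vector.
    return num_vectors * dims * bytes_per_value


for n in (1_000_000, 100_000_000):
    size_gib = raw_vector_bytes(n, 1500) / GIB
    print(f"{n:>11,} vectors x 1500 dims: {size_gib:.1f} GiB")
```

Under those assumptions, 1M vectors come to about 5.6 GiB, while 100M vectors come to about 559 GiB, which is why the in-memory limit matters at that scale.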

Thanks!