| HN Mirror

	by ruslandanilin 989 days ago
	Vespa.ai does a great job. Absolutely stunning thing!

esafak 989 days ago

What do you like about it relative to alternatives? How fast is it?

dathinab 989 days ago

much more mature and feature-rich than many of the competitors listed in the article

to some degree it's more a platform you can use to efficiently and flexibly build your own, more complicated search system, which is both a benefit and a drawback

some good parts:

- very flexible text search (BM25), more so than Elasticsearch (or at least easier to use/better documented when it comes to advanced features)

- fast and flexible enough vector search, with good filtering capabilities

- built-in support for defining more complicated search pipelines, including multi-phase search (also known as reranking)

- quite a nice approach to fine-grained control over what kind of indices are built for which fields

- when doing schema changes, it has safety checks to make sure you don't accidentally break anything, which you can override if you are sure you want that

- a ton of control in a cluster over which search system resources get allocated where (e.g. which schemas get stored on which storage clusters, which cluster nodes should act as storage nodes, which should e.g. only do preprocessing or postprocessing steps in a search pipeline, and which e.g. should be used for calculating embeddings using some LLM or similar). Not something you need for demos, but definitely something you need once your customers have enough data.

- child documents, and document references

- multiple vectors per document

- quite an interesting set of data types for fields, and related ways you can use them in a search pipeline

- a flexible, reasonably easy-to-use system for plugins/extensions (through Java only)

- support for building search pipelines which have sub-searches in external, potentially non-Vespa systems

- really well documented

Though the main benefit *and drawback* is that it's not just a vector database, but a full-fledged search system platform.
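
To make the multi-phase search bullet above a bit more concrete, here is a minimal Python sketch of the general idea: score every candidate with a cheap function, then re-score only the best few with an expensive one. This is not Vespa's implementation or configuration syntax; `two_phase_rank`, `first_phase`, `second_phase`, `rerank_count` and the `cheap`/`costly` fields are names made up for illustration.

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]


def two_phase_rank(
    docs: List[Doc],
    first_phase: Callable[[Doc], float],
    second_phase: Callable[[Doc], float],
    rerank_count: int,
) -> List[Doc]:
    """Rank all docs with a cheap score, then rerank only the top few."""
    # Phase 1: cheap score over every candidate.
    ranked = sorted(docs, key=first_phase, reverse=True)
    # Phase 2: expensive score only on the best `rerank_count` candidates.
    head = sorted(ranked[:rerank_count], key=second_phase, reverse=True)
    # Candidates that were not reranked keep their first-phase order.
    return head + ranked[rerank_count:]


# Hypothetical precomputed scores standing in for e.g. BM25 and a model score.
docs = [
    {"id": "a", "cheap": 3.0, "costly": 1.0},
    {"id": "b", "cheap": 2.0, "costly": 5.0},
    {"id": "c", "cheap": 1.0, "costly": 9.0},
]
result = two_phase_rank(
    docs,
    first_phase=lambda d: d["cheap"],
    second_phase=lambda d: d["costly"],
    rerank_count=2,
)
print([d["id"] for d in result])
```

Note that "c" never gets the expensive score here even though it would win on it. That is the usual trade-off of reranking only the top candidates.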

freilanzer 988 days ago

> - multiple vectors per document

Can I have (multiple) vectors for a single field? That would be quite helpful.

jkb79 988 days ago

Yes, Vespa has a generic Tensor framework that allows you to index multiple vectors for a single field, see https://blog.vespa.ai/semantic-search-with-multi-vector-inde... for details.

For example: field embeddings type tensor<float>(p{}, x[384]) represents a multi-vector field such as { "0": [0.1....], "1": [0.2,..] }
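
As a rough illustration of that shape, a small Python helper could turn a list of per-chunk embeddings into a mapping from a string label to a fixed-length vector. This is only a sketch: `to_multi_vector_field` is a hypothetical name, and the exact JSON that Vespa's feed API expects for such a tensor should be checked against the Vespa docs rather than taken from here.

```python
from typing import Dict, List, Sequence


def to_multi_vector_field(
    vectors: Sequence[Sequence[float]], dim: int = 384
) -> Dict[str, List[float]]:
    """Map each vector to a string label ("0", "1", ...), checking its length."""
    field: Dict[str, List[float]] = {}
    for i, vec in enumerate(vectors):
        # Every vector must match the indexed dimension size (x[384] above).
        if len(vec) != dim:
            raise ValueError(f"vector {i} has {len(vec)} values, expected {dim}")
        field[str(i)] = [float(v) for v in vec]
    return field


# Tiny 2-dimensional example so the result is easy to read.
print(to_multi_vector_field([[0.1, 0.2], [0.3, 0.4]], dim=2))
```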

dathinab 987 days ago

yes, that is what I meant

generally if you have multiple embeddings for the same document you have two choices:

- create one document for each embedding and make sure the non-embedding-specific attributes are the same across all of these document clones -- Vespa makes this more convenient by having child documents

- have a field with multiple vectors, i.e. there are multiple vectors in the HNSW index which point to the same document -- Vespa supports this, too. It's what I meant.

Vespa is currently the only vector-search-enabled search system which supports both in a convenient way, but then there are so many "vector databases" popping up every month that I might have missed some
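
A minimal Python sketch of the second choice, assuming a brute-force scan instead of a real HNSW index and a plain dot product as the score: several index entries point at the same document id, and each document is returned once with the score of its best-matching vector. `search_multi_vector` and the sample data are made up for illustration and are not how Vespa implements it.

```python
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[str, Sequence[float]]  # (document id, one of its vectors)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def search_multi_vector(
    entries: List[Entry], query: Sequence[float], k: int
) -> List[Tuple[str, float]]:
    """Return the top-k documents, each scored by its best-matching vector."""
    best: Dict[str, float] = {}
    for doc_id, vec in entries:
        score = dot(query, vec)
        # Several entries may share a doc_id; keep only the highest score.
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:k]


entries = [
    ("a", [1.0, 0.0]),  # document "a" has two vectors
    ("a", [0.0, 1.0]),
    ("b", [0.5, 0.5]),  # document "b" has one
]
print(search_multi_vector(entries, query=[0.0, 2.0], k=2))
```

With the first choice (one cloned document per embedding) the same deduplication has to happen after the search, by grouping hits on whatever attribute identifies the original document.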

bratao 989 days ago

+1 for Vespa. For me it is VERY resilient and production-ready. It is such a dream compared to Elasticsearch, which we migrated from.

vinni2 989 days ago

Does Vespa have an equivalent of Kibana? And how hard was the migration?
