That's not really a problem if the user is frequently ingesting and vectorizing new data. They need a place to store it and efficiently query it. They can cache the queries for an old vector, but still need to compute new queries every time they have a new piece of content. Also old vectors might have relationships to the new vectors that have been inserted since caching, so you don't want to serve stale results.
if we forget about making the service for a moment, How would you store and process high dimensional vectors locally? What ready-made library/software to use? What datastructure?
Some storage options are:
- Store many vectors in a single HDF5 or LMDB file.
- Store single vectors in many small binary files (e.g. using numpy save() function).
To look up neighbors you might:
- Compute neighbors exhaustively (e.g. using Scipy distance functions).
- Use an Approximate Nearest Neighbors approach like one of the ones benchmarked here (https://github.com/erikbern/ann-benchmarks). These can be much faster than exhaustively computing neighbors, at the cost of some accuracy and having to periodically re-build an index.