by ramoz
1778 days ago
I've scaled large transformer-based models that supplement a Lucene-based search engine. The architecture supports an ensemble approach where Lucene results are first-class, and we then tailor the similarity rankings with the models. It looks a lot like this:
https://huggingface.co/blog/bert-cpu-scaling-part-1
We have to store large "index" embeddings on SSDs and use LevelDB to retrieve the values for the Lucene results.
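
For anyone curious what that flow looks like, here is a rough sketch, not our actual code. It assumes Lucene hands back `(doc_id, score)` pairs, that embeddings are stored as packed float32 blobs keyed by doc id, and that scores are blended with a simple weighted sum. The names `rerank`, `pack_vector`, and `weight` are made up for illustration, and a plain dict stands in for the LevelDB handle (anything with a bytes-keyed `get` that returns `None` on a miss should slot in).

```python
import math
import struct


def pack_vector(vec):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Inverse of pack_vector: bytes back to a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(lucene_hits, query_vec, store, weight=0.5):
    """Blend Lucene scores with embedding similarity.

    lucene_hits: list of (doc_id, lucene_score) from the text search.
    store: key-value store with get(bytes) -> bytes or None
           (hypothetical stand-in for a LevelDB handle).
    weight: share of the final score given to the model similarity
            (an assumed blending scheme, not a recommendation).
    """
    if not lucene_hits:
        return []
    # Normalize Lucene scores to [0, 1] so they are comparable to cosine.
    max_score = max(score for _, score in lucene_hits) or 1.0
    ranked = []
    for doc_id, score in lucene_hits:
        blob = store.get(doc_id.encode())
        # Documents without a stored embedding keep only their Lucene share.
        sim = cosine(query_vec, unpack_vector(blob)) if blob else 0.0
        combined = (1 - weight) * (score / max_score) + weight * sim
        ranked.append((doc_id, combined))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Toy data: "a" wins on Lucene score, "b" is closer to the query embedding.
store = {b"a": pack_vector([1.0, 0.0]), b"b": pack_vector([0.0, 1.0])}
hits = [("a", 2.0), ("b", 1.9)]
reranked = rerank(hits, [0.0, 1.0], store)
```

The point of keeping Lucene first-class is that the models only reorder what the text search already returned, so the embedding lookups stay bounded by the size of the hit list.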