by wballard
2988 days ago
Besides reduction to search (Solr / Elasticsearch / Lucene / Xapian), which is the most common approach I have used commercially, my actual favorite is precomputation. At the moment: a Keras embedding model, multiprocessing, Annoy, and emitting CSV (object id, other object id, score) as a batch process and loading it into my database. Query to recommend. This trades a prebuild step for near-instant runtime and near nothing net new to break. I'm working at commercial scale (2-5 million items), not 'internet scale' billions of items. Hope that helps.
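A rough sketch of the batch step, in case it helps picture it. This is not my actual pipeline: random vectors stand in for the Keras embeddings, and a brute-force cosine scan stands in for Annoy (brute force would be far too slow at millions of items, which is why you would use an approximate index there). The function names and the `k=3` cutoff are made up for illustration. Only the output shape, (object id, other object id, score) as CSV, matches what I described.

```python
import csv
import heapq
import io
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_rows(embeddings, k):
    """Yield (object_id, other_object_id, score) for each item's k nearest items.

    Brute-force O(n^2) scan: an assumption-level stand-in for an approximate
    nearest-neighbor index such as Annoy.
    """
    for obj_id, vec in embeddings.items():
        scored = (
            (cosine(vec, other_vec), other_id)
            for other_id, other_vec in embeddings.items()
            if other_id != obj_id
        )
        for score, other_id in heapq.nlargest(k, scored):
            yield obj_id, other_id, round(score, 6)


def write_csv(rows, out):
    """Write the precomputed pairs as CSV, ready for a bulk load."""
    writer = csv.writer(out)
    writer.writerow(["object_id", "other_object_id", "score"])
    writer.writerows(rows)


# Hypothetical stand-in data: 20 items with random 8-dimensional embeddings.
rng = random.Random(0)
embeddings = {i: [rng.gauss(0, 1) for _ in range(8)] for i in range(20)}

buffer = io.StringIO()
write_csv(top_k_rows(embeddings, k=3), buffer)
csv_text = buffer.getvalue()
```

Once the CSV is bulk loaded into a table, recommending for an item is a plain lookup on object id ordered by score, so nothing model-related runs at request time.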