Hacker News new | ask | show | jobs
by EvgenyAr 1640 days ago
Sound awesome! Looks like you invested a lot, e.g. inference batch is not easy at all. Few follow up questions: - Do you really retrieve embedding on every batch? Doesn't it make sense to keep them in memory (must not be more than few gigs)? - Do you have/plan to incorporate inference feedback loop for retraining? Or you call it metric as well?

Thanks for an answer!

1 comments

Hey,

1. Our models act as building blocks for various feature apis/algos, but embeddings are required for all tasks. We have one major task which a user request sends their business items (think documents / docIds), anywhere from 1-10,000s — here we have to quickly retrieve those. We do also do a deep retrieval (basically ANN) Which requires no lookup.

2. First batch of embeddings was 4.2TB, last reduced down to ~400GB. Not sure if ssd/embedded db (leveldb) is still best but still works fine. Embeddings grow larger weekly - through business process generation (we do a big write once a week). but also through user processing — for these we actually store in mongo since they work best there / usually user specific / and are generated ad hoc so it doesn’t make since to re-shard leveldb embeddings.

3. We use active/passive feedback mechanisms. Think thumbs up/down for active, and contextual analysis for passive (did good things happen after AI use).

#3/ your last question is the hardest. We have a small data science team figuring out the best way to build fine-tuning sets. But the production model training is vastly different and more reliable. This is the classic stability/plasticity problem. We have yet to successfuly fine tune the model with feedback data. But we have informed how to better train the model. The best case scenario will likely be as-hoc/user-specific mechanisms to change results in real-time / based on current context collected (think tiktok algo, or knows what you’re in the mood for but knows how to bounce out or keep things well varied to keep your attention).