by zaiste 793 days ago
We are still exploring. We don’t have any concrete data yet, but in some instances we've observed reductions of up to ten times. This seems especially relevant to specific areas, e.g. chatbots, where similar questions come up repeatedly.
1 comments

>We are still exploring.

Fair point. It's worth looking into training or fine-tuning a small model (2B/7B) on previously cached answers, provided your knowledge index/domain doesn't change over time.
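
A rough sketch of what that could look like, assuming the cache is just a mapping of question to answer. The `cache_to_jsonl` helper, the `prompt`/`completion` field names, and the sample entries are all made up for illustration; whatever fine-tuning tool you use will have its own expected record format.

```python
import json


def cache_to_jsonl(cache: dict[str, str]) -> str:
    """Turn cached question -> answer pairs into JSONL training records.

    Hypothetical helper: the "prompt"/"completion" keys are an assumed
    record layout, not the format of any particular fine-tuning tool.
    """
    lines = []
    for question, answer in cache.items():
        q, a = question.strip(), answer.strip()
        if not q or not a:
            continue  # skip entries with an empty question or answer
        lines.append(json.dumps({"prompt": q, "completion": a}, ensure_ascii=False))
    return "\n".join(lines)


# Made-up cache contents, for illustration only.
cache = {
    "How do I reset my password?": "Use the 'Forgot password' link on the sign-in page.",
    "What are your opening hours?": "We are open 9:00-17:00, Monday to Friday.",
    "  ": "orphan answer",  # blank question, dropped by the helper
}

dataset = cache_to_jsonl(cache)
print(dataset)
```

This only makes sense under the condition above: if the underlying knowledge changes, the exported pairs go stale and the small model will keep repeating the old answers.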

Exciting times