Hacker News new | ask | show | jobs
by edwin 1093 days ago
New techniques like semantic caching will help. This is the modern era's version of building a performant social graph.
1 comments

What's semantic caching?
With LLMs, the inputs are highly variable so exact match caching is generally less useful. Semantic caching groups similar inputs and returns relevant results accordingly. So {"dish":"spaghetti bolognese"} and {"dish":"spaghetti with meat sauce"} could return the same cached result.
Or store as sentence embedding and calculate the vector distance, but creates many edge cases