by edwin 1099 days ago
With LLMs, the inputs are highly variable, so exact-match caching is generally less useful. Semantic caching groups similar inputs and returns the cached result for the group. So {"dish":"spaghetti bolognese"} and {"dish":"spaghetti with meat sauce"} could return the same cached result.
1 comment

Or store each input as a sentence embedding and calculate the vector distance to new inputs, though that creates many edge cases.
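
To make the embedding idea concrete, here is a rough sketch of a distance-based cache. Everything in it is an assumption for illustration: `toy_embed` is a bag-of-words stand-in for a real sentence-embedding model, and `SemanticCache`, its methods, and the 0.3 threshold are made-up names and values, not any particular library's API.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a sentence-embedding model: word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns the value of the nearest stored key if it is similar enough."""

    def __init__(self, embed=toy_embed, threshold=0.3):
        self.embed = embed          # swap in a real embedding function here
        self.threshold = threshold  # assumed cutoff; needs tuning in practice
        self.entries = []           # list of (embedding, cached value)

    def put(self, key: str, value: str) -> None:
        self.entries.append((self.embed(key), value))

    def get(self, key: str):
        query = self.embed(key)
        best_score, best_value = 0.0, None
        # Linear scan; a real system would likely use a vector index.
        for embedding, value in self.entries:
            score = cosine_similarity(query, embedding)
            if score > best_score:
                best_score, best_value = score, value
        return best_value if best_score >= self.threshold else None


cache = SemanticCache()
cache.put("spaghetti bolognese", "cached bolognese recipe")

hit = cache.get("spaghetti with meat sauce")  # similar wording, intended hit
false_hit = cache.get("spaghetti carbonara")  # different dish, still matches
miss = cache.get("chocolate cake")            # no shared words, no match
```

With this toy embedding, "spaghetti carbonara" scores closer to the stored key than "spaghetti with meat sauce" does, so it gets the bolognese result too. A real embedding model should do better on this pair, but picking a threshold that separates "same request" from "merely related request" is exactly where the edge cases show up.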