| HN Mirror


	by adultSwim 1101 days ago
	Running an LLM every time someone clicks on a button is expensive and slow in production, but probably still ~10x cheaper to produce than code.

edwin 1101 days ago

New techniques like semantic caching will help. This is the modern era's version of building a performant social graph.

daralthus 1101 days ago

What's semantic caching?

edwin 1101 days ago

With LLMs, the inputs are highly variable, so exact-match caching is generally less useful. Semantic caching groups similar inputs and serves each group from a shared cache entry. So {"dish":"spaghetti bolognese"} and {"dish":"spaghetti with meat sauce"} could return the same cached result.
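
A rough sketch of that grouping idea in Python, assuming a hand-written alias table stands in for whatever actually decides that two inputs are similar. `ALIASES`, `group_key`, and `GroupedCache` are made-up names for illustration, not any library's API:

```python
from typing import Callable, Dict

# Hypothetical alias table. A real system would likely derive the grouping
# from a model rather than a hand-maintained dict.
ALIASES = {
    "spaghetti with meat sauce": "spaghetti bolognese",
}


def group_key(request: dict) -> str:
    """Map a request to the key shared by its group of similar inputs."""
    # Lowercase and collapse whitespace before looking up an alias.
    dish = " ".join(request["dish"].lower().split())
    return ALIASES.get(dish, dish)


class GroupedCache:
    """Cache keyed by group, so similar inputs share one stored result."""

    def __init__(self, generate: Callable[[dict], str]):
        self.generate = generate  # the expensive call, e.g. an LLM request
        self.store: Dict[str, str] = {}
        self.misses = 0

    def get(self, request: dict) -> str:
        key = group_key(request)
        if key not in self.store:
            # Only the first request in a group pays for generation.
            self.misses += 1
            self.store[key] = self.generate(request)
        return self.store[key]
```

With this, a request for "spaghetti with meat sauce" that arrives after "spaghetti bolognese" is served from the cache instead of triggering a second generation.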

m3kw9 1101 days ago

Or store the inputs as sentence embeddings and calculate the vector distance between them, but that creates many edge cases.
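
A minimal sketch of the embedding approach, assuming the caller supplies an `embed` function (any sentence-embedding model would do; none is bundled here) and that cosine similarity is the distance measure. `EmbeddingCache` and the 0.9 default threshold are illustrative choices, not a recommendation:

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class EmbeddingCache:
    """Return a cached value when a stored input is close enough in vector space."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9):
        self.embed = embed          # assumed: caller-provided embedding function
        self.threshold = threshold  # assumed cutoff; needs tuning per workload
        self.entries: List[Tuple[Vector, str]] = []

    def add(self, text: str, value: str) -> None:
        self.entries.append((self.embed(text), value))

    def lookup(self, text: str) -> Optional[str]:
        query = self.embed(text)
        best_score, best_value = 0.0, None
        # Linear scan for clarity; a vector index would replace this at scale.
        for vector, value in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_value = score, value
        return best_value if best_score >= self.threshold else None
```

The threshold is where the edge cases show up: set it too loose and inputs that sit close together in vector space but mean different things get the same answer; set it too tight and the cache rarely hits.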
