by wluk 551 days ago
Thanks! Yeah that's an excellent idea - this is my response from another thread:

I have a feeling that Perplexity and ChatGPT are doing something similar [caching], since common questions I'd ask like "top movies this year" get answered nearly instantaneously, way faster than GPT-4o could have done on its own.

The only explanation I can think of is that so many users ask certain questions that they cache the response and return the cached answer.

I'd love to do this for Ithy, but it'll be a while before I get the scale of ChatGPT/Perplexity that's needed for this...
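
For what it's worth, here's a rough sketch of the kind of caching I mean. This is only my guess at the general shape, not how ChatGPT or Perplexity actually do it: `ResponseCache`, `normalize`, and the `compute` callback are all made-up names, and the exact-match-after-normalization key is an assumption (a real system might match on embeddings instead).

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial variations share one key."""
    return " ".join(question.lower().split())


class ResponseCache:
    """Hypothetical in-memory cache of model answers with a time-to-live."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, question: str) -> str:
        # Hash the normalized question so keys have a fixed size.
        return hashlib.sha256(normalize(question).encode("utf-8")).hexdigest()

    def get_or_compute(self, question: str,
                       compute: Callable[[str], str]) -> str:
        """Return a fresh cached answer, or call `compute` and store the result."""
        key = self._key(question)
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            self.hits += 1
            return entry[1]
        # Missing or expired: fall through to the (slow) model call.
        self.misses += 1
        answer = compute(question)
        self._store[key] = (now, answer)
        return answer
```

You'd wrap the model call as `cache.get_or_compute(question, call_model)`, where `call_model` is whatever function hits the LLM. The TTL matters for questions like "top movies this year", since the right answer changes over time, and the hit/miss counters are there so you can check whether the cache is paying off at your traffic level.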

1 comment

I started looking into using Cloudflare AI Gateway for this exact [caching] reason a few months ago, but got distracted with GPU Cloud Run, so I never got decent load/numbers on the AI Gateway cache to see if it was worth bothering with.