by pests 55 days ago
The system prompt will always match in the prefix cache. I just meant it could be prefilled ahead of any user queries, on completely different hardware. Then you're only paying the n^2 cost for the actual user prompt. We're in agreement, I think.
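A rough back-of-the-envelope sketch of the point above, counting only query-key score computations under causal attention. The function name and the token lengths are made up for illustration, and this ignores everything else in a real serving stack (MLP cost, batching, cache transfer between machines). Strictly speaking the user tokens still attend over the cached prefix, so the leftover work is the user prompt's own quadratic term plus a term that is linear in the user prompt length.

```python
def causal_attention_pairs(new_tokens: int, cached_tokens: int = 0) -> int:
    """Count query-key score computations needed to prefill `new_tokens`
    when K/V for `cached_tokens` prefix tokens are already in the cache."""
    # New token i (0-based) attends to every cached token plus new tokens 0..i.
    return sum(cached_tokens + i + 1 for i in range(new_tokens))


# Hypothetical lengths, chosen only for illustration.
system_len = 2000
user_len = 200

# No cache: prefill system prompt + user prompt together.
cold = causal_attention_pairs(system_len + user_len)

# System prompt prefilled earlier (possibly elsewhere); only user tokens are new.
warm = causal_attention_pairs(user_len, cached_tokens=system_len)

# Work that was moved out of the request path.
system_only = causal_attention_pairs(system_len)

print(f"cold prefill:  {cold}")
print(f"warm prefill:  {warm}")
print(f"moved offline: {system_only}")
```

With these numbers most of the score computations land in the part that can be done ahead of time, which is the whole argument for prefilling the system prompt before any user shows up.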