Hacker News new | ask | show | jobs
by Folcon 46 days ago
The reason I've been querying the 1 hour is a user's quota resets are often longer than that, as a result I've seen situations where someone builds a large context, then hits their quota limit, waits 2+ hours, their cache is gone, their first message then eats 20%+ of their current session quota and the user doesn't want to compact as they're still trying to get the model into a good understanding of the problem, this seems to be a really painful consequence for users on anything less than a max plan which seems like an unintended consequence of Anthropic's own system design choices?

IE How their quota and caching interact with each other, it doesn't make pro and max a little different, it makes it significantly different by unintentionally penalising pro users