Hacker News
by m101 262 days ago
I wonder if this is because a memory cap was reached at that output token. Perhaps they route conversations to different hardware depending on how long they expect each one to be.