Hacker News new | ask | show | jobs
by Majromax 50 days ago
> That said, I was sympathetic to the recent bug reports —- to trigger one, you’d need to have a session that waited an hour doing nothing and then very specifically tested for in-context retrieval. I don’t want to run that test, do you want to run that test?

They introduced a feature/optimization that triggered after an hour's idleness, so testing that the session continued properly afterwards seems kind of important. If nothing else, even the working-as-intended feature (context cleanup) could impact model skill in a current or future model version, so it would be well worth measuring any impact as part of the test suite.