by Workaccount2 478 days ago
Yeah, LLM capabilities are measured with fresh context windows, yet people want to use them with 50k, 100k, 500k tokens.

As you pack in more and more context, the model's abilities really start to deteriorate.

The first 10k tokens are the juiciest; after that it just gets worse and worse.

1 comment

Oh wow, I was thinking 500 tokens was way too much, since I've only ever done anything programmatic with tiny models on CPUs....