Hacker News
by littlestymaar 414 days ago
> And much shorter than millions of tokens we expect from models nowadays.

Yet all current models still suck above 32k. (Yes, some can do needle-in-a-haystack fine, but they still fail at anything even slightly more complex over a long context.)

32k is still much higher than a human's though, so I agree with you that it gives them some kind of superhuman ability over moderately long contexts, but they are still disappointingly bad over longer ones.

1 comment

Out of curiosity, I estimated a human's per-day context size (text only!) by multiplying reading speed by the number of waking minutes: 16 hours * 60 minutes * 300 words per minute = 288,000 words ~ 288,000 tokens.
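
For anyone who wants to play with the numbers, here is a rough sketch of that back-of-the-envelope estimate. The function name and the `tokens_per_word` parameter are my own additions for illustration; the 1:1 default just mirrors the approximation above, and real tokenizers usually produce somewhat more than one token per English word.

```python
def daily_context_tokens(waking_hours=16, words_per_minute=300, tokens_per_word=1.0):
    """Rough estimate of how much text a person could read in one day.

    Defaults follow the comment above: 16 waking hours, 300 words per
    minute, and (as a simplifying assumption) one token per word.
    """
    minutes = waking_hours * 60
    words = minutes * words_per_minute
    tokens = round(words * tokens_per_word)
    return words, tokens


words, tokens = daily_context_tokens()
print(f"{words} words ~ {tokens} tokens per day")

# How many 32k-token windows that daily total would fill.
print(tokens / 32_000)
```

With the defaults this comes out to 288000 words and about 9 windows of 32k tokens; raising `tokens_per_word` only pushes the total higher.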