by CamperBob2 324 days ago
That sounds low by about 10x, assuming Don Quixote has 430k words (per Google).

Still, yes, I don't know of a single model that doesn't go off the rails if you actually try to take advantage of its context length specification.

1 comment

Well, I loaded up the Llama 3 tokenizer and downloaded the novel: the English translation comes to 545,997 tokens and the original Spanish to 653,981 tokens. So my estimate was indeed off by an order of magnitude. Thanks for the correction.
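
For anyone who wants to sanity-check numbers like these, here is a rough sketch of the arithmetic. The helper names (`count_tokens`, `tokens_per_word`) are made up for illustration, and the 430k word count is the Google figure quoted upthread, assumed here to apply to the English text. The `encode` argument is any callable that turns a string into token ids; a real tokenizer's `encode` method should slot in, but that part is an assumption and is not what produced the counts above.

```python
from typing import Callable, Sequence


def count_tokens(text: str, encode: Callable[[str], Sequence]) -> int:
    """Count tokens by running `encode` over the text (hypothetical helper)."""
    return len(encode(text))


def tokens_per_word(token_count: int, word_count: int) -> float:
    """Average number of tokens per word."""
    return token_count / word_count


# Figures quoted in this thread.
WORDS = 430_000      # "per Google"; assumed to describe the English text
TOKENS_EN = 545_997  # English translation, Llama 3 tokenizer
TOKENS_ES = 653_981  # original Spanish, Llama 3 tokenizer

ratio_en = tokens_per_word(TOKENS_EN, WORDS)
es_vs_en = TOKENS_ES / TOKENS_EN

print(f"English tokens per word: {ratio_en:.2f}")
print(f"Spanish tokens relative to English: {es_vs_en:.2f}x")

# Stand-in tokenizer for demonstration only: whitespace splitting.
# To use a real one, pass its encode method instead, e.g. the `encode`
# method of a Hugging Face tokenizer (assumption: you have access to
# the model's tokenizer files).
sample = "In a village of La Mancha"
sample_tokens = count_tokens(sample, str.split)
```

So the English text works out to a bit over one token per word, and the Spanish original needs noticeably more tokens than the translation for the same book.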