by CuriousJ 654 days ago
This paper shows that 200-800 tokens is the ideal chunk size; above that, the model starts getting confused or distracted. https://arxiv.org/pdf/2406.14497
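A rough sketch of what capping chunks at the top of that range could look like. Assumptions: `chunk_tokens` is a hypothetical helper, not something from the paper, and whitespace splitting is only a stand-in for a real tokenizer, so the counts will not match model token counts.

```python
def chunk_tokens(tokens, max_tokens=800):
    """Split a token list into consecutive chunks of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Step through the list in strides of max_tokens; the last chunk may be shorter.
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


text = "word " * 2000
tokens = text.split()  # assumption: whitespace split approximates tokenization
chunks = chunk_tokens(tokens, max_tokens=800)
print([len(c) for c in chunks])  # [800, 800, 400]
```

Swapping in the tokenizer of whatever model you use would give sizes closer to what the model actually sees.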
1 comment

Makes sense. Thanks!