| HN Mirror



	by smusamashah 699 days ago
	It should be benchmarked against something like RULER [1].
	[1] https://github.com/hsiehjackson/RULER (RULER: What’s the Real Context Size of Your Long-Context Language Models?)

1 comment

> To incorporate this, we ask the model to complete a chain of hashes instead (as recently proposed by RULER):

They did mention it, but didn't provide concrete benchmarks.