


	by phillipcarter 956 days ago
	Interesting. I was skeptical about some of their claims regarding longer context, since it's been my experience that these models just get lost after enough of it.

1 comment

msp26 956 days ago

Yeah, degraded performance on long contexts has been observed in plenty of other models [https://arxiv.org/abs/2307.03172], so I was cautious too. Unfortunately, I don't have access to 4-32k. I would have liked to test that out.
