InternLM – SOTA OS 7B and 20B model with 200K context length

20B with 200K context sounds perfect... but I have zero trust in the Hugging Face (or OpenCompass) benchmarks. They are all but meaningless because they can be, and frequently are, gamed.

And they present basically no information other than the standard metrics.

Will just have to try it myself, I guess. Yi 200K was quite a pleasant surprise already.