What is your experience with performance past 40k tokens? I haven't gone beyond that, as I've heard reports that problems start to arise there. I'd be happy to hear how it's gone for you.
I'm super happy with the performance. I generally run with 2 parallel slots, so I only get about a 128K context window. My experience with all LLMs is that they get more forgetful if you use the full window. (256-512K is the sweet spot for frontier models; 128K works for me with this current Qwen.)
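To make the slot arithmetic concrete, here's a rough sketch. It assumes the server splits one total context budget evenly across parallel slots, which is how some inference servers behave but may not match every setup. The 256K total and the function name `per_slot_context` are made up for illustration.

```python
def per_slot_context(total_ctx_tokens: int, parallel_slots: int) -> int:
    """Tokens available to each slot if the total budget is split evenly (assumption)."""
    if parallel_slots < 1:
        raise ValueError("parallel_slots must be at least 1")
    return total_ctx_tokens // parallel_slots


# Hypothetical total budget of 256K tokens; adjust to whatever your server is given.
total = 256 * 1024

print(per_slot_context(total, 1))  # one slot keeps the whole budget
print(per_slot_context(total, 2))  # two slots each get half
```

So going from 1 slot to 2 halves the window each request can use, which is the trade-off I'm accepting for running two requests at once.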