One recurring problem I have with Claude 2 is that it sometimes "bugs out" and starts repeating the same token ad infinitum (which I still have to pay for). This happens with longer prompts, say, around 30k tokens. Have you encountered this issue?
I use it for classification in a personal, non-commercial project and, for me, Claude 2 and GPT-4 are pretty close in terms of quality. GPT-4 is better, but has a shorter context window. I was hoping to reduce costs by using Claude exclusively, but that bug makes it too unreliable, sadly.
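For what it's worth, one way to limit the damage from the runaway repetition is to stream the response and cut it off client-side once the tail looks degenerate. Here is a minimal sketch, assuming the output is already available as an iterable of token or word strings. `is_degenerate` and `take_until_degenerate` are hypothetical helper names, and the threshold of 20 is arbitrary:

```python
from typing import Iterable, Iterator, List


def is_degenerate(tokens: List[str], max_repeats: int = 20) -> bool:
    """Return True if the last `max_repeats` tokens are all identical."""
    if max_repeats < 2 or len(tokens) < max_repeats:
        return False
    tail = tokens[-max_repeats:]
    return all(t == tail[0] for t in tail)


def take_until_degenerate(
    stream: Iterable[str], max_repeats: int = 20
) -> Iterator[str]:
    """Yield tokens from `stream`, stopping once the tail is a repeated run."""
    seen: List[str] = []
    for token in stream:
        seen.append(token)
        yield token
        # Stop consuming as soon as the same token has repeated too often.
        if is_degenerate(seen, max_repeats):
            break
```

Whether closing the stream early actually stops the billing depends on the provider, so treat that part as an assumption to check. For classification, capping the maximum output length to a handful of tokens is probably the simpler safeguard.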
If Claude 2 has an internal RAG, then this also means that the 200k context length only holds for queries that allow for an out of the box
Thanks for the insights!