| That is a well-recognised part of the LLM cycle. A model or new model version X is released, and everyone is really impressed. Three months later: "Did they nerf X?" It's been this way since the original ChatGPT release. The answer is typically no; it's just that your expectations have risen. What was previously a mind-blowing improvement is now expected, and any missteps feel amplified. |
What we need is an open and independent way of testing LLMs, and stricter regulation requiring disclosure of any product change when the product is paid for under a subscription or prepaid plan.