by user43928
7 days ago
Has there been any evidence of a well-known provider rerouting to lower-quality models? Last I saw, engineers working at OpenAI denied this on HN. Someone also set up a tracker that aims to record model performance over time. So far it has shown no statistically significant deviation in performance for Codex, and there is not yet enough data for Claude: https://marginlab.ai/trackers/codex/
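For anyone wondering what "statistically significant deviation" could mean in practice, here is a minimal sketch of one common check, a two-proportion z-test on benchmark pass rates from two time windows. This is an assumption about how such a comparison might be done, not the tracker's actual method; the function name and the example numbers are made up for illustration.

```python
import math


def two_proportion_z_test(pass_a, n_a, pass_b, n_b):
    """Return (z, two-sided p-value) for the difference between two pass rates.

    pass_a / n_a: passes and total runs in the baseline window (hypothetical).
    pass_b / n_b: passes and total runs in the recent window (hypothetical).
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    # Pooled pass rate under the null hypothesis that both windows are equal.
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every run passed (or every run failed) in both windows: no evidence of change.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 90/100 passes last month versus 60/100 this week.
z, p = two_proportion_z_test(90, 100, 60, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With small daily samples a test like this has little power, which would be consistent with the tracker reporting "not enough data" rather than "no degradation".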
The implementation was so borked that SamA went back on Reddit and apologised: https://old.reddit.com/r/ChatGPT/comments/1o6jins/updates_fo...
Model re-routing happens for coding tasks too. For example, OpenAI's support pages used to mention (at least as of a month ago, when I checked) that if they automatically use a cheaper -mini model to accomplish the task behind the scenes, you’ll be charged -mini prices even if you selected a more expensive model. I just checked again and they’ve removed it, but there are probably archives.
Finally, even if they’re the same weights, you don’t know what quantisation you’re running at. Adaptive quantisation based on load (given workday peaks), or similar techniques, has been happening since the ChatGPT 3.5 days; the techniques are probably more advanced now.
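To illustrate why "same weights" does not mean "same numbers", here is a toy sketch of symmetric uniform quantisation at two bit widths. It only shows the general effect of lower precision; it is not a claim about what any provider actually runs, and the weight values and function names are made up.

```python
def quantize_dequantize(weights, bits):
    """Round weights to a signed integer grid of the given width, then map back."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = max_abs / qmax  # one scale for the whole tensor
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def max_abs_error(weights, bits):
    """Largest absolute difference between original and round-tripped weights."""
    restored = quantize_dequantize(weights, bits)
    return max(abs(w - r) for w, r in zip(weights, restored))


weights = [0.013, -0.42, 0.27, 0.9, -0.731, 0.058]  # made-up values
err8 = max_abs_error(weights, 8)
err4 = max_abs_error(weights, 4)
print(f"max error at 8 bits: {err8:.5f}")
print(f"max error at 4 bits: {err4:.5f}")
```

The 4-bit round trip loses noticeably more than the 8-bit one on the same inputs. From the outside, a drop in precision like this would be hard to tell apart from a model swap.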