Hacker News new | ask | show | jobs
by hungrigekatze 1093 days ago
Greg Brockman from OpenAI said a round table chat a few weeks ago that ChatGPT is heavily quantized since the end of Q1/early Q2 2023: https://www.reddit.com/r/mlscaling/comments/146rgq2/chatgpt_... ; I am looking for the source document / source quote from which I read it, but the big switch from 'not so stupid' to 'pretty damn stupid' occurred with the 1 March 2023 model switch.

That's around the time that I noticed `gpt-3.5-turbo` becoming lower quality, whether in the UI (ChatGPT) or programmatically (via `gpt-3.5-turbo`) API calls.

The 10-20x lighter-weight version of the models that they're (OpenAI) running now - the heavily quantized version - allows them and Microsoft to save on far-and-away their largest expense: cloud expenditure. I suspect/expect that the AMD GPU announcement with OpenAI will come to fruition in the next few years as all of these LLM companies depend upon large piles of GPU compute to be able to train their models and no one wants to be beholden to NVIDIA or any one other GPU manufacturer.