by bigiain
226 days ago
> In 2022 the best available model was GPT-3 text-davinci-003 at $60/million input tokens.

> GPT-5 today is $1.25/million input tokens - 48x cheaper for a massively more capable model.

Yes - but. GPT-5 and all the other modern "reasoning models" and tools burn through way more tokens to answer the same prompts.

As you said:

> We're beginning to find more expensive ways to use the models though. Coding Agents like Claude Code and Codex CLI can churn through tokens.

Right now, it feels like the cost of using "frontier models" is staying the same as it has been for the entire ~5 year history of the current LLM/AI industry. But older models these days are comparatively, effectively free.

I'm wondering when/if there'll be an asymptotic flattening, where new frontier models are insignificantly better than older ones, and running some model off Huggingface on a reasonably specced up Mac Mini or gaming PC will provide AI coding assistance at basically electricity and hardware depreciation prices?
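To put rough numbers on the "cheaper per token, but more tokens" point, here is a minimal back-of-envelope sketch. Only the two prices come from the quoted figures; the token counts and the function names are made up purely for illustration, and output or reasoning tokens (which are priced separately) are ignored.

```python
# Input-token prices quoted above, in USD per million tokens.
PRICE_2022 = 60.00   # GPT-3 text-davinci-003
PRICE_TODAY = 1.25   # GPT-5


def request_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


def break_even_multiplier(old_price: float, new_price: float) -> float:
    """How many times more tokens the cheaper model can use per task
    before the task costs the same as it did at the old price."""
    return old_price / new_price


# Hypothetical token counts, NOT measurements: a single 2022-style prompt
# versus an agentic coding session that re-reads files and tool output.
OLD_PROMPT_TOKENS = 2_000
AGENT_SESSION_TOKENS = 200_000

if __name__ == "__main__":
    print("break-even token multiplier:",
          break_even_multiplier(PRICE_2022, PRICE_TODAY))
    print("2022 prompt cost (USD):",
          request_cost(OLD_PROMPT_TOKENS, PRICE_2022))
    print("agent session cost today (USD):",
          request_cost(AGENT_SESSION_TOKENS, PRICE_TODAY))
```

Under these assumed counts, any workflow that uses more than 48x the tokens of the old prompt ends up costing more per task than it did in 2022, despite the per-token price drop.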
|
gpt-oss-120b fits on a $4000 NVIDIA Spark and can be used by Codex - it's OK but still nowhere near the bigger ones: https://til.simonwillison.net/llms/codex-spark-gpt-oss
But... MiniMax M2 benchmarks close to Sonnet 4 and is 230B - too big for one Spark but can run on a $10,000 Mac Studio.
And Kimi K2 runs on two Mac Studios ($20,000).
So we are getting closer.