by admax88qqq
2 hours ago
> One day, maybe not far from now, a breakthrough will allow huge LLMs (say 200B in size) to run well on a 5-year-old Dell desktop. But if you have such a breakthrough, could you not also apply it and run 200T models on today's datacenters?

The likes of Mythos show that the scaling laws are real: you can scale total params 5x and active params 2x and still get meaningful gains. And if "inference per param" gets cheaper? Just up the params and get more intelligence for the same price.
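
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes serving cost scales linearly with parameter count, which is a simplification, and `params_at_same_price` is a made-up helper name, not anything from a real library. The 1000x factor is just the ratio between the 200B and 200T figures in the quote.

```python
def params_at_same_price(base_params: int, cost_reduction: int) -> int:
    """Parameter count servable at an unchanged budget.

    Assumes (hypothetically) that cost is linear in parameter count, so a
    cost_reduction-fold drop in per-parameter cost buys that many times
    more parameters for the same money.
    """
    return base_params * cost_reduction


# 200B today, with a hypothetical 1000x efficiency breakthrough
base = 200 * 10**9
bigger = params_at_same_price(base, 1000)
print(f"{bigger / 10**12:.0f}T params for the same price")
```

Under that linear assumption the same multiplier applies to whatever hardware you already have, which is the whole point: the breakthrough does not cap model size, it shifts the frontier.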