|
|
|
|
|
by woadwarrior01
81 days ago
|
|
There's this[1]. Model providers have a strong incentive to switch (a part of) their inference fleet to quantized models during peak loads. From a systems perspective, it's just another lever. Better to have slightly nerfed models than complete downtime. [1]: https://marginlab.ai/trackers/claude-code/ |
|
Isn't this link am argument against the point you are making?