by cauenapier
121 days ago
What’s wild about Step-3.5-Flash isn’t just the quality, it’s how close it is to fitting on personal hardware. The int4 weights are ~110 GB. That sounds insane, but machines with 128 GB of unified memory already exist, and people are running it on them today. A few years ago, a 200B-class model was pure datacenter territory. Now it’s “expensive laptop / workstation” territory. That’s a huge shift. The interesting part isn’t that this model is big. It’s that hardware curves and model efficiency are finally intersecting. Sparse MoE cuts the compute needed per token, quantization cuts the memory needed for the weights, and together they mean frontier-ish reasoning is no longer locked to hyperscalers. We’re basically one consumer hardware generation away from this class of model being normal for power users.
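For anyone who wants to sanity-check the ~110 GB figure, here is a rough back-of-the-envelope sketch. The numbers are assumptions, not published specs: a round 200B parameters (taken from "200B-class"), and a typical group-wise int4 scheme with one 16-bit scale per 32 weights. The function name `quantized_weight_gb` is made up for illustration, and real quantization formats differ in the details.

```python
def quantized_weight_gb(n_params, bits_per_weight=4, group_size=32, scale_bits=16):
    """Approximate weight footprint in GB (1 GB = 1e9 bytes).

    Assumes group-wise quantization: every `group_size` weights share one
    scale of `scale_bits` bits, which adds a small per-weight overhead.
    """
    effective_bits = bits_per_weight + scale_bits / group_size
    return n_params * effective_bits / 8 / 1e9


# Assumed round number for a "200B-class" model, not an official count.
N_PARAMS = 200e9
MEMORY_GB = 128  # unified memory on the machines mentioned above

weights_gb = quantized_weight_gb(N_PARAMS)
headroom_gb = MEMORY_GB - weights_gb

print(f"int4 weights: ~{weights_gb:.1f} GB")
print(f"left for KV cache, activations, and the OS: ~{headroom_gb:.1f} GB")
```

Under those assumptions the weights land a little over 110 GB, which is in line with the figure above and leaves only a thin margin on a 128 GB machine. That margin has to cover the KV cache and the operating system, so "fits" here means "fits, barely".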