by Herring
98 days ago
If this stuff were so revolutionary, don't you guys think Qwen/DeepSeek would have snapped it up already? Both those teams are highly innovative, picking up and inventing new techniques all the time. Hell, DeepSeek-V3 was one of the first to do large-scale fp8 training.