by throwa356262
3 days ago
The TileRT approach gives up throughput in exchange for lower latency, which also means lower overall efficiency. Given the export restrictions, this could mean they need to prioritise how best to use their limited hardware. But they could also be moving to Huawei GPUs, like DeepSeek did, and simply not have stable hardware or software for a large-scale deployment yet. This is just speculation, based on the MXFP4 support on Huawei GPUs that is lacking on some NVIDIA GPUs.