by SamDc73
172 days ago
Even with something like a 5090, I’d still run Q4_K_S/Q4_K_M because they’re far more resource-efficient for inference. Also, the 3090 supports NVLink, which is actually more useful for inference speed than native BF16 support. BF16 probably matters more if you're training.
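
To put rough numbers on the "resource-efficient" part, here's a back-of-the-envelope sketch of weight memory only. The bits-per-weight figures for Q4_K_M and Q4_K_S are approximate averages I'm assuming for illustration (the real value varies by model), and the 32B model size is hypothetical. KV cache and activations are ignored.

```python
# Rough weight-memory estimate. The bits-per-weight values below are
# approximations assumed for illustration, not exact figures for any model.
APPROX_BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q4_K_M": 4.8,  # assumed approximate average
    "Q4_K_S": 4.6,  # assumed approximate average
}


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB (no KV cache, no activations)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # 32B is a hypothetical model size chosen only as an example.
    for name, bpw in APPROX_BITS_PER_WEIGHT.items():
        print(f"{name}: ~{weight_gib(32, bpw):.1f} GiB of weights")
```

Under these assumptions the Q4 variants need roughly a third or less of the BF16 footprint, which is the difference between fitting on one card or not.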