by idonotknowwhy
421 days ago
I didn't realise the 5070 is slower than the 3090. Thanks. If you want a bit more context, try -ctk q8_0 -ctv q8_0 (from memory, so look it up) to quantize the KV cache. Also, an imatrix GGUF like IQ4_XS might be smaller with better quality.
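For a rough feel of how much context that buys, here is a back-of-envelope sketch comparing an f16 KV cache with a q8_0 one. The model shape (32 layers, 8 KV heads, head dim 128, 32k context) is a made-up example, not any particular model, and the function name is hypothetical. The only hard number is the q8_0 block layout: 32 int8 values plus one fp16 scale, so 34 bytes per 32 elements.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem):
    """Approximate KV cache size: K and V each hold
    n_kv_heads * head_dim values per token, per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

F16_BYTES = 2.0        # 16-bit float per element
Q8_0_BYTES = 34 / 32   # 32 int8 values + one fp16 scale per block

# Hypothetical model shape, for illustration only.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=32768)

f16 = kv_cache_bytes(**shape, bytes_per_elem=F16_BYTES)
q8 = kv_cache_bytes(**shape, bytes_per_elem=Q8_0_BYTES)

print(f"f16 : {f16 / 2**30:.3f} GiB")
print(f"q8_0: {q8 / 2**30:.3f} GiB")
print(f"ratio: {q8 / f16:.3f}")
```

So q8_0 takes a bit over half the memory of f16 for the cache, which is roughly where the extra context comes from. Real usage will differ somewhat with the backend's buffers and padding.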