by brucethemoose2
922 days ago
In this case, the 4090 is far more memory-efficient thanks to ExLlamaV2. 70B in particular is indeed a significant compromise on the 4090, but not as much as you'd think. For 34B and down, though, I think Nvidia is unquestionably king.
I'm no expert, but to me that sounds like a recipe for bad performance. Does a 70B model in 2-bit really outperform a smaller-but-less-quantised model?
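
For a rough sense of why 70B ends up near 2 bits on a 24 GB card, here is a back-of-the-envelope sketch of weight size versus bits per weight. The helper name `weight_gib` and the specific bit widths tried are my own illustrative assumptions, and it counts weights only (no KV cache, activations, or framework overhead), so real headroom is smaller than this suggests. It says nothing about output quality, only about what fits.

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB.

    Hypothetical helper for illustration: ignores KV cache, activations,
    and any per-group quantization metadata.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 24  # RTX 4090 memory size

# (parameters in billions, bits per weight): assumed example settings
configs = [(70, 2.0), (70, 4.0), (34, 4.0)]

for params, bpw in configs:
    size = weight_gib(params, bpw)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{params}B @ {bpw} bpw: ~{size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

Under these assumptions, 70B only squeezes into 24 GB at around 2 bits per weight, while 34B fits at 4 bits with room left for context. Whether the heavily quantized 70B actually beats the 34B is a separate, empirical question.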