by rig666
1039 days ago
Just a suggestion, but there are 4-bit quantized models that are even smaller and faster than the 8-bit ones.
Your average 13B 4-bit model needs only about 8-9 GB of VRAM. Using this, I bet you can fit a much higher parameter model on the 3090.
|
70B is currently 4-bit on this box, and once I have GPU accel for 70B, I'll see how the quality compares to 13B 8-bit.
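
For anyone who wants to check the sizes being discussed, here is a rough back-of-the-envelope sketch. It counts the model weights only (parameters times bits per weight), so real VRAM use will be higher once the KV cache and runtime overhead are added, which is roughly consistent with the 8-9 GB figure quoted above for 13B 4-bit. The function name `weight_gb` is made up for this illustration, and the 24 GB figure is the VRAM of a stock RTX 3090.

```python
def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and any per-block quantization
    metadata, so treat the result as a lower bound on VRAM use.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight


RTX_3090_VRAM_GB = 24  # stock RTX 3090

# The three configurations mentioned in this thread.
for name, params, bits in [
    ("13B 4-bit", 13, 4),
    ("13B 8-bit", 13, 8),
    ("70B 4-bit", 70, 4),
]:
    size = weight_gb(params, bits)
    fits = "fits" if size < RTX_3090_VRAM_GB else "does not fit"
    print(f"{name}: ~{size:.1f} GB of weights, {fits} in {RTX_3090_VRAM_GB} GB")
```

By this estimate the 70B 4-bit weights alone exceed a single 3090, so running it on one card would need some layers kept on the CPU or a second GPU.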