by christina97
51 days ago
I recently set up the 26B A4B model on vLLM on an RTX 3090 (4-bit) after a hiatus from local models. Just completely blown away by the speed and quality you can get now for a sub-$1k investment. I tried Qwen first, but it was unstable and had ridiculously long thinking traces!