by hedora
11 days ago
My current rule of thumb is that 1GB gets you 1B parameters with a big context (Qwen 32B fits in 32GB with 200K+ contexts). That's with heavy compression of the weights and the context, of course. I haven't gone through model evaluation + shoehorning at 128GiB yet.
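For anyone who wants to sanity-check the rule of thumb, here is a rough back-of-envelope sketch. The 4-bit weight and 4-bit KV-cache settings, and the layer/head counts in the example, are illustrative assumptions on my part, not measured numbers or a confirmed model config, and the function names are made up for this example. It ignores activations, runtime overhead, and quantization metadata.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def kv_cache_gb(tokens, layers, kv_heads, head_dim, bits_per_value):
    """Approximate KV-cache size in GB for a given context length.

    Each token stores one key and one value vector per layer per KV head,
    hence the factor of 2.
    """
    bytes_per_token = 2 * layers * kv_heads * head_dim * bits_per_value / 8
    return tokens * bytes_per_token / 1e9


def total_gb(params_billion, bits_per_weight, tokens,
             layers, kv_heads, head_dim, bits_per_value):
    """Weights plus KV cache; everything else is ignored in this sketch."""
    return (weights_gb(params_billion, bits_per_weight)
            + kv_cache_gb(tokens, layers, kv_heads, head_dim, bits_per_value))


if __name__ == "__main__":
    # Hypothetical 32B model: 64 layers, 8 KV heads, head dim 128 (assumed),
    # 4-bit weights and 4-bit KV cache (assumed), 200K-token context.
    w = weights_gb(32, 4)
    kv = kv_cache_gb(200_000, layers=64, kv_heads=8, head_dim=128,
                     bits_per_value=4)
    print(f"weights: {w:.1f} GB, kv cache: {kv:.1f} GB, total: {w + kv:.1f} GB")
```

Under those assumptions the weights take about half of a 32GB budget and a 200K context takes most of the rest, which lines up with roughly 1GB per 1B parameters. Change the bit widths and the picture changes quickly.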