by wkat4242
541 days ago
How much context do you get on the $20 plan? I run Llama 3 at home, which technically does 128k, but that eats VRAM like crazy, so I can't go further than 80k before I fill it (and that is with the KV cache already quantized to 8-bit). I've been thinking of using another service for bigger contexts. But that may not make sense then.
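
For a rough sense of why long contexts fill VRAM, here is a back-of-the-envelope sketch of KV cache size. It assumes the 8B model's shape (32 layers, 8 KV heads, head dimension 128), which is a guess since the model size isn't stated above, and it ignores the small per-block overhead that real 8-bit quantization formats add. The function name `kv_cache_bytes` is made up for illustration.

```python
def kv_cache_bytes(tokens, layers=32, kv_heads=8, head_dim=128, bytes_per_elem=1):
    """Approximate KV cache size in bytes for a given context length.

    Defaults are ASSUMED (Llama 3 8B shape, 1 byte per element for an
    8-bit cache). Quantization block overhead is not modeled.
    """
    # Factor of 2: each layer stores one K tensor and one V tensor per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens


if __name__ == "__main__":
    for ctx in (80_000, 131_072):
        gib = kv_cache_bytes(ctx) / 1024**3
        print(f"{ctx:>7} tokens: {gib:.2f} GiB")
```

Under those assumptions the cache alone costs 64 KiB per token, so roughly 4.9 GiB at 80k tokens and 8 GiB at the full 128k, on top of the model weights. A larger model (more layers, more KV heads) scales that up proportionally.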