by thot_experiment
724 days ago
Had a chance to do some testing, and it seems quite good on one-shot tasks with a small context window, but as you approach context saturation it starts to go way off the rails. Maybe this is an implementation issue? I'm using Q6_K quants of both sizes in ollama. I'll report back if I figure it out. A larger context window really helps on RAG tasks; it's frustrating that a lot of the foundational models have such small windows.