by
throwaway4aday
778 days ago
Can you select a context length that fits in your GPU, though? I suppose even a 128k model would be more than enough for almost everyone running these models on their own hardware.
wkat4242
778 days ago
No, you can't right now. Hopefully they will add this to ollama.
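For a rough sense of which context lengths fit, here is a back-of-envelope sketch of KV-cache size versus context length. It assumes the published Llama-3-8B config (32 layers, 8 KV heads, head dim 128) and an fp16 cache. The function names are just for illustration, and it ignores the weights, activations, runtime overhead, and any cache quantization, so treat the numbers as a lower bound on the cache alone.

```python
# Rough KV-cache estimate for one sequence. Assumptions (not from the thread):
# Llama-3-8B shape and an unquantized fp16/bf16 cache.
N_LAYERS = 32        # transformer layers
N_KV_HEADS = 8       # grouped-query attention: 8 key/value heads
HEAD_DIM = 128       # dimension per head
BYTES_PER_VALUE = 2  # fp16/bf16

GIB = 1024 ** 3


def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to store keys and values for every layer."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * context_tokens


def max_context_tokens(free_vram_bytes: int) -> int:
    """Largest context whose KV cache fits in the VRAM left after loading weights."""
    return free_vram_bytes // kv_cache_bytes(1)


if __name__ == "__main__":
    for tokens in (8_192, 131_072, 262_144):
        print(f"{tokens:>7} tokens -> {kv_cache_bytes(tokens) / GIB:.1f} GiB")
```

Under these assumptions the cache costs 128 KiB per token, so the full 262k window would need about 32 GiB for the cache alone, which is why being able to pick a smaller context matters on consumer GPUs.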
ml2
778 days ago
256k (actually 262k) is also up on HF:
https://huggingface.co/gradientai/Llama-3-8B-Instruct-262k