by hrpnk 312 days ago
With LM Studio you can configure the context window freely. The maximum for gpt-oss-20b is 131072 tokens.
1 comment

Yes, but if I set it above ~16K on my 32 GB laptop it just OOMs. Am I doing something wrong?
Try enabling flash attention and offloading all layers to the GPU.
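
For a rough sense of why a longer context window eats memory, here is a back-of-the-envelope sketch of KV cache size. The layer count, KV head count, and head dimension below are placeholder values I picked for illustration, not gpt-oss-20b's actual config, and the formula assumes full attention in every layer with a 16-bit cache. Real runtimes may use less (sliding-window layers, quantized cache) and also need compute buffers on top, so treat the numbers as an order-of-magnitude guide only.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV cache size in bytes, assuming full attention in every layer."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical architecture values, for illustration only.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128

for ctx in (16_384, 131_072):
    gib = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, ctx) / 2**30
    print(f"{ctx:>7} tokens -> {gib:.1f} GiB")
```

The cache grows linearly with context length, so going from 16K to 128K multiplies it by 8. Plug in the real values from the model's config to get an estimate for your own setup.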