by wkat4242 541 days ago
How much context do you get on the $20 plan? I run llama3 at home, which technically supports 128k, but that eats VRAM like crazy, so I can't go further than 80k before I fill it (and that is with the KV cache already quantized to 8 bit).

I've been thinking of using another service for bigger contexts. But this may not make sense then.
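
For a rough sense of why the KV cache fills VRAM so quickly, here is a back-of-the-envelope sketch. The model shape (32 layers, 8 KV heads, head dimension 128) is an assumption based on an 8B Llama-3.1-style config, not something stated above, and `kv_cache_bytes` is a made-up helper name. With 8-bit values that works out to about 64 KiB per token, on top of the model weights.

```python
def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_value=1):
    """Estimate KV cache size in bytes for a given context length.

    Defaults are ASSUMED (8B Llama-3.1-style shape, 8-bit cache);
    swap in your own model's numbers.
    """
    # Factor of 2: one key vector and one value vector per token,
    # per layer, per KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens


GIB = 1024 ** 3

if __name__ == "__main__":
    for ctx in (32_000, 80_000, 128_000):
        q8 = kv_cache_bytes(ctx) / GIB
        fp16 = kv_cache_bytes(ctx, bytes_per_value=2) / GIB
        print(f"{ctx:>7} tokens: {q8:.2f} GiB at 8 bit, {fp16:.2f} GiB at fp16")
```

The estimate grows linearly with context, so going from 80k to 128k costs 1.6x the cache memory. Larger models (more layers, more KV heads) scale it up accordingly.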

The sales page shows the $20 Plus plan has a 32K context window.
Ah ok, thanks. That's not much! But I know from my own system that a longer context massively increases processing time (and also memory, though at the scale of a GPT model that's not so much). I guess this is why.
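
To put a rough number on the processing side: in a standard transformer, the attention part of prompt processing grows with the square of the context length. The sketch below is only an approximation under assumed numbers (32 layers, model width 4096, full attention, no causal-mask or FlashAttention-style savings), and `attention_prefill_flops` is a hypothetical helper, not anything from a real library.

```python
def attention_prefill_flops(n_tokens, n_layers=32, d_model=4096):
    """Approximate FLOPs spent in attention while processing a prompt.

    ASSUMED shape (8B-class model). Per layer, computing Q @ K^T and
    then scores @ V each costs roughly 2 * n_tokens^2 * d_model FLOPs.
    """
    return 4 * n_layers * d_model * n_tokens ** 2


if __name__ == "__main__":
    base = attention_prefill_flops(32_000)
    for ctx in (32_000, 64_000, 128_000):
        ratio = attention_prefill_flops(ctx) / base
        print(f"{ctx:>7} tokens: {ratio:.0f}x the attention work of 32k")
```

Under these assumptions, a 128k prompt needs about 16x the attention compute of a 32k one, while the KV cache memory only grows 4x.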

I only use GPT via the API anyway, so it's pay as you go. But as far as I remember there are limits there too; only big spenders get access to the top-shelf stuff. I only spend a couple of dollars a month because I use my llama server most of the time. It's not as good as ChatGPT, obviously, but it's mine and doesn't leak my conversations.