HN Mirror


by danielbln 1140 days ago

I've had access to the 32k model for a bit and I've been using this to collect and stuff codebases into the context: https://github.com/mpoon/gpt-repository-loader

It works really well: you can tell it to implement new features or mutate parts of the code, and having the entire codebase (or a lot of it) in its context really improves the output.
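For anyone who wants to see the general idea of stuffing a codebase into the context, here is a minimal sketch. It is not how gpt-repository-loader itself is implemented: the function names, the `--- path ---` separator, the extension filter, and the "about 4 characters per token" estimate are all assumptions made for illustration.

```python
import os


def pack_repository(root, extensions=(".py", ".md")):
    """Concatenate matching files under `root` into one prompt string.

    Hypothetical helper: the separator format and the extension filter
    are illustrative choices, not taken from any real tool.
    """
    parts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git, and walk in a stable order.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                body = f.read()
            rel = os.path.relpath(path, root)
            parts.append(f"--- {rel} ---\n{body}\n")
    return "".join(parts)


def rough_token_estimate(text):
    """Very rough size check, assuming about 4 characters per token."""
    return len(text) // 4
```

Calling `rough_token_estimate(pack_repository("."))` before sending anything gives a ballpark for whether the repository is likely to fit in the window; an actual tokenizer would give the real count.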

The biggest caveat: shit is expensive! A full 32k token request will run you like $2, and if you do dialog back and forth you can rack up quite the bill quickly. If it were 10x cheaper, I would use nothing else; having a large context window is that much of a game changer. As it stands, I _very_ carefully construct the prompt and move the conversation out of the 32k model into the 8k model as fast as I can to save cost.

2 comments

bamboozled 1140 days ago

Do you use it for proprietary code, and if so, don't you feel weird about it?

danielbln 1140 days ago

Not weirder than using GitHub, or Outlook or Slack or whatever.

RivieraKid 1140 days ago

I wouldn't feel weird about it. The risks - someone stealing know-how / someone finding a security hole - are negligible.

guzik 1140 days ago

How does it calculate the price? I thought that once you load the content (a 32k token request for $2), it would remember the context, so you could ask follow-up questions much more cheaply.

mediaman 1140 days ago

It does not have memory outside the context window. If you want to have a back-and-forth with it about a document, that document must be provided in the context (along with your other relevant chat history) with every request.

This is why it's so easy to burn up lots of tokens very fast.
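To make the arithmetic concrete, here is a rough sketch of how billed tokens add up when the document and the growing chat history are resent on every request. The flat per-token price is an assumption derived from the "about $2 per full 32k request" figure quoted earlier in the thread; real pricing may differ, for instance by charging prompt and completion tokens at different rates. The question and answer sizes are made-up defaults.

```python
# Assumption: a flat rate derived from "about $2 for a full 32k token request".
ASSUMED_USD_PER_TOKEN = 2.0 / 32_000


def conversation_cost(document_tokens, turns,
                      tokens_per_question=50, tokens_per_answer=300):
    """Return (total_billed_tokens, estimated_usd) for a stateless chat.

    Every turn resends the document plus all earlier questions and
    answers, because the model keeps nothing between requests.
    """
    history = 0  # tokens of earlier questions and answers
    total = 0    # tokens billed across all requests so far
    for _ in range(turns):
        prompt = document_tokens + history + tokens_per_question
        total += prompt + tokens_per_answer
        history += tokens_per_question + tokens_per_answer
    return total, total * ASSUMED_USD_PER_TOKEN


tokens, usd = conversation_cost(document_tokens=30_000, turns=3)
print(tokens, round(usd, 2))  # 92100 5.76
```

Under these assumptions, three short questions about a 30k token document bill roughly three times the document, not one load plus three cheap follow-ups.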
