by ukuina 1144 days ago

FYI, 27 times per hour is basically nothing. With GPT-4 over the API, I make 2-3 completion requests a minute, for 30-60 minutes at a time, when building an LLM app. That adds up to 3-4 hours per day.

At the upper bound, taking $2 as the cost per request, this would be $2 * 3 requests/minute * 60 minutes * 4 hours = $1440 a day.
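
The same arithmetic as a quick sketch, for anyone who wants to plug in their own numbers. The $2 is read here as the approximate cost of a single request; the lower bound (2 requests a minute for 3 hours) is an assumption taken from the low end of the ranges above, and the function name is just illustrative.

```python
# Back-of-the-envelope daily API cost, using the figures from the comment.
MINUTES_PER_HOUR = 60

def daily_cost(cost_per_request, requests_per_minute, hours_per_day):
    """Cost of one day of usage at a steady request rate."""
    requests_per_day = requests_per_minute * MINUTES_PER_HOUR * hours_per_day
    return cost_per_request * requests_per_day

# Upper bound: $2 per request, 3 requests a minute, 4 hours a day.
upper = daily_cost(2.0, 3, 4)

# Lower bound (assumed): same per-request cost, 2 requests a minute, 3 hours a day.
lower = daily_cost(2.0, 2, 3)

print(f"${lower:.0f} to ${upper:.0f} a day")
```

Even the low end of that range is far too much for day-to-day development, which is the point of the comment.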

Thankfully, I am using retriever-augmentation and context stuffing into the base 4k model, so costs are manageable.

At this pricing, the 32k context model cannot be deployed in a production app as a more capable drop-in replacement for shorter-context models.

2 comments

ZephyrBlu 1144 days ago

Depends heavily on your product. I can imagine there are quite a lot of use cases that have relatively infrequent API usage or highly cacheable responses.

bomewish 1143 days ago

> retriever-augmentation and context stuffing

Care to elaborate? This sounds very interesting and useful. Anything about the setup and implementation would be super helpful.

ukuina 1142 days ago

This should get you started: https://haystack.deepset.ai/tutorials/22_pipeline_with_promp...
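
For a rough idea of the pattern without any library, here is a minimal sketch. It is not the tutorial's code and does not use the Haystack API. The word-overlap scoring, the word-count budget standing in for a real token count, and every function name below are assumptions made for illustration. The idea: pick the few chunks most relevant to the question, then pack them into the prompt until the context budget is used up.

```python
import re

def words(text):
    """Lowercased word list; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, chunk):
    """Number of distinct query words that also appear in the chunk."""
    return len(set(words(query)) & set(words(chunk)))

def retrieve(query, chunks, top_k=3):
    """Return up to top_k chunks with nonzero overlap, best match first."""
    scored = [(overlap_score(query, c), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

def stuff_context(query, chunks, budget_words=3000):
    """Pack chunks into the prompt, stopping at the first one that would exceed the budget."""
    picked, used = [], 0
    for chunk in chunks:
        n = len(words(chunk))
        if used + n > budget_words:
            break
        picked.append(chunk)
        used += n
    context = "\n\n".join(picked)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

docs = [
    "The 32k context model is priced per token for prompt and completion.",
    "Haystack pipelines connect a retriever to a prompt node.",
    "Bananas are rich in potassium.",
]
question = "How is the 32k context model priced?"
prompt = stuff_context(question, retrieve(question, docs, top_k=2))
print(prompt)
```

A real setup would swap the overlap score for BM25 or embedding similarity and count tokens with the model's tokenizer, but the shape stays the same: retrieve a little, send a short prompt, and keep each request well under the cost of filling a long context.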
