Hacker News
by ahoho 1230 days ago
Inference costs are non-trivial, and I wouldn’t be surprised if the cost of running ChatGPT (given the 3M/day figure) has surpassed that of training it. Without optimizations, training uses only ~3 times as much memory as inference, so exponential parameter/cost scaling still affects both.
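
For a rough sense of where a ~3x figure could come from, here is a back-of-the-envelope sketch. It assumes training keeps three parameter-sized buffers (weights, gradients, and one optimizer state buffer) while inference keeps only the weights. The parameter count and the 2-byte precision are illustrative assumptions, and activations, caches, and optimizers with more state (which would push the ratio higher) are ignored.

```python
def memory_gib(n_params: int, bytes_per_value: int = 2, copies: int = 1) -> float:
    """Memory needed for `copies` parameter-sized buffers, in GiB."""
    return n_params * bytes_per_value * copies / 2**30


# Hypothetical model size, chosen only for illustration.
N_PARAMS = 175_000_000_000

# Inference: just the weights.
inference_gib = memory_gib(N_PARAMS, copies=1)

# Training (assumed): weights + gradients + one optimizer state buffer.
training_gib = memory_gib(N_PARAMS, copies=3)

ratio = training_gib / inference_gib

print(f"inference: {inference_gib:,.0f} GiB")
print(f"training:  {training_gib:,.0f} GiB")
print(f"ratio:     {ratio:.1f}x")
```

Under these assumptions the ratio is a constant factor, so both costs grow with parameter count in the same way.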

There’s ongoing research into reducing the computational cost of inference, but to my knowledge the techniques so far only offer linear improvements (although I wouldn’t bet against more substantial reductions in the near future, particularly as they are compounded).

OpenAI has stated that their costs per query are a few cents. Not nothing, but nothing outrageous either. The cost will go down to < 1c within a couple of years, I'm sure. The cost may be too high to use LLMs for absolutely everything, but the universe of applications where companies can afford to spend 1c on an LLM answer is very large.

> the universe of applications where companies can afford to spend 1c on an LLM answer is very large.

I hear this a lot, but don't see it very often in practice. GPT and its ilk are coming up on 3 years old now, and our most novel application thus far is the same textbox + response that Talk to Transformer had. That's sad.

I've heard the sell before ("imagine AI spreadsheets!") but I don't think people will be willing, in the long run, to pay for answers that are regularly wrong. If your bridge only works half the time, people probably won't be inclined to pay your toll anymore.

Today, the typical customer support experience involves waiting for hours/days to get a response that is useless and/or wrong. LLM-powered customer support software can be transformational, and this is just one example.

LLM-powered customer support can also be useless and wrong. Even in an ideal case, you're still retaining some customer support employees and only displacing the pitifully-paid call-center workers that staff the current Chat-As-A-Service offerings.

Until I see something like this rolled out with widespread success, I'm gonna doubt it. The second someone puts an AI agent on their website, it's a race to get the brand to endorse the most abhorrent thing possible. Then what?

You don't want the LLM to be completely unsupervised. The goal is assistance, not the equivalent of full self-driving. It takes a few seconds for a support tech to approve a response that would take a human 15 minutes to compose. That's where the value is. You automate the rote, repetitive work so the interesting, intelligent work remains.

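
To put rough numbers on the approve-versus-compose argument, here is a small sketch. The 15 minutes and "a few cents per query" come from this thread. The hourly wage, the 15-second review time, and the rule that a rejected draft gets rewritten by hand are assumptions made up for illustration.

```python
HOURLY_WAGE = 20.0        # assumed support-tech wage, USD/hour
COMPOSE_MINUTES = 15.0    # from the thread: time to write a reply by hand
REVIEW_MINUTES = 0.25     # assumed: "a few seconds" taken as 15 seconds
LLM_COST = 0.03           # "a few cents" per query, taken as 3 cents


def labor_cost(minutes: float, hourly_wage: float = HOURLY_WAGE) -> float:
    """Cost of `minutes` of human time at the given hourly wage."""
    return minutes / 60 * hourly_wage


def assisted_cost(reject_rate: float) -> float:
    """Expected cost per ticket when a human reviews an LLM draft.

    Every ticket pays for one LLM query and one review. A rejected draft
    (probability `reject_rate`) is then rewritten by hand.
    """
    review = labor_cost(REVIEW_MINUTES) + LLM_COST
    return review + reject_rate * labor_cost(COMPOSE_MINUTES)


manual = labor_cost(COMPOSE_MINUTES)

# Reject rate at which the assisted workflow stops being cheaper.
break_even = 1 - (labor_cost(REVIEW_MINUTES) + LLM_COST) / manual

for rate in (0.0, 0.5, 0.9):
    print(f"reject rate {rate:.0%}: ${assisted_cost(rate):.2f} vs ${manual:.2f} manual")
```

The reject rate is the knob the two sides of this thread disagree about. The sketch only counts labor and query cost, not the cost of a wrong answer that gets approved.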
"The value" is in the self-driving part though. The support tech wants the good answer, not the part where they refresh ChatGPT 3 times because the output is unintelligible. If you're going to pay a human to do the job either way, I'd bet that a skilled technician outperforms an AI model operated by a classifier employee.