by jokethrowaway 996 days ago
Pricing will vary wildly depending on how many tokens a typical response uses, but a rough estimate puts the fast one as more expensive than chatgpt3.5 and the cheap one as way cheaper.

Quality will likely be heaps worse than chatgpt3.5, given it's llama 7b

It's 0.96$ per 100 fast chat responses.
It's 0.0076$ per 100 slow chat responses.

Chatgpt 3.5 with 50 tokens in and 50 tokens out will cost you 0.02$ per 100 fast responses.
If the llm responses are 500 tokens in and 500 tokens out, then you get 0.2$ per 100 fast responses.
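
For anyone who wants to redo the chatgpt3.5 numbers with their own response sizes, here is a rough sketch. It assumes a flat 0.002$ per 1K tokens, which is the rate the two figures above imply rather than something stated here, and it bills input and output tokens at the same price. The function name is made up for illustration.

```python
# Assumption: flat 0.002$ per 1K tokens, the rate implied by the figures
# quoted above (0.02$ per 100 responses of 100 tokens each).
PRICE_PER_1K_TOKENS = 0.002


def cost_per_100_responses(tokens_in, tokens_out, price_per_1k=PRICE_PER_1K_TOKENS):
    """Dollar cost of 100 responses, billing input and output tokens alike."""
    tokens_per_response = tokens_in + tokens_out
    return 100 * tokens_per_response / 1000 * price_per_1k


short_cost = cost_per_100_responses(50, 50)    # 50 tokens in, 50 out
long_cost = cost_per_100_responses(500, 500)   # 500 tokens in, 500 out

print(f"{short_cost:.2f}$ per 100 short responses")
print(f"{long_cost:.2f}$ per 100 long responses")
```

Swap in your own token counts to see where your use case lands between the two estimates.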

I presume people will flock to the cheap version for when they can't afford the price and quality of chatgpt3.5.

1 comments

So running fast is >100x more expensive? That's too much of a difference.
On the other hand, if it reflects their costs, I'm very happy to have an option that is 100x cheaper, rather than a more strategic one that raises the lower price by 10x.
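
A quick way to check the ratios being discussed, using only the per-100-response prices quoted in the parent comment. The variable names are just for illustration, and the chatgpt3.5 figures are the parent's two estimates, not official prices.

```python
# Prices per 100 responses, as quoted in the parent comment.
FAST_PER_100 = 0.96
SLOW_PER_100 = 0.0076
CHATGPT_SHORT_PER_100 = 0.02   # 50 tokens in, 50 tokens out
CHATGPT_LONG_PER_100 = 0.2     # 500 tokens in, 500 tokens out

# How many times more expensive the fast tier is than the slow tier.
fast_vs_slow = FAST_PER_100 / SLOW_PER_100

# Fast tier relative to chatgpt3.5 (above 1 means the fast tier costs more).
fast_vs_chatgpt_short = FAST_PER_100 / CHATGPT_SHORT_PER_100
fast_vs_chatgpt_long = FAST_PER_100 / CHATGPT_LONG_PER_100

# chatgpt3.5 relative to the slow tier (above 1 means the slow tier is cheaper).
chatgpt_short_vs_slow = CHATGPT_SHORT_PER_100 / SLOW_PER_100
chatgpt_long_vs_slow = CHATGPT_LONG_PER_100 / SLOW_PER_100

print(f"fast vs slow: {fast_vs_slow:.0f}x")
print(f"fast vs chatgpt3.5: {fast_vs_chatgpt_long:.1f}x to {fast_vs_chatgpt_short:.0f}x")
print(f"chatgpt3.5 vs slow: {chatgpt_short_vs_slow:.1f}x to {chatgpt_long_vs_slow:.1f}x")
```

On these figures the fast tier costs more than chatgpt3.5 at both response sizes, and the slow tier costs less at both.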