by jaimex2 1124 days ago
Ah, the old bait and switch.
2 comments

From what? A free product? Do you know how much compute it takes to run a single request?
I don't but I'd like to know.

I was under the impression that it was mostly bound by GPU VRAM, but that once the model is loaded it could produce output quickly? I'm probably oversimplifying things...
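For a rough sense of the VRAM side, here is a back-of-the-envelope sketch. It assumes a model the size of the published GPT-3 (175B parameters), 16-bit weights, and 80 GB cards. The actual size of gpt-3.5-turbo is not public, so these numbers are illustrative only, and they ignore activations and the KV cache.

```python
import math

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

def gpus_needed(weights_gb: float, gpu_vram_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined VRAM fits the weights alone."""
    return math.ceil(weights_gb / gpu_vram_gb)

# Assumption: 175e9 parameters (published GPT-3 size, NOT confirmed for gpt-3.5-turbo)
weights_gb = weight_memory_gb(175e9)
print(f"weights: {weights_gb:.0f} GB, GPUs needed: {gpus_needed(weights_gb)}")
```

So under these assumptions the weights alone would not fit on a single card, which is why a multi-GPU node comes up at all.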

gpt-3.5-turbo (default ChatGPT model) takes 8 A100s, ~$10k each. [0]

The latest gpt-3.5-turbo model generates very quickly and cheaply (in part due to some recently discovered optimization techniques... older versions cost 10x more). While the hardware required to run GPT-4 is currently unknown, it generates considerably slower on average, and its much higher price points to a higher hardware cost.

And this is per request. It's bananas.

[0] https://www.servethehome.com/chatgpt-hardware-a-look-at-8x-n...
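
To put the figures above together, a quick sketch of the arithmetic. The GPU count and per-GPU price are the rough numbers from the comment; the function name is just for illustration, and this covers the GPUs only (no chassis, networking, power, or hosting).

```python
def node_gpu_cost(num_gpus: int, price_per_gpu_usd: float) -> float:
    """Up-front GPU cost for one serving node, in USD."""
    return num_gpus * price_per_gpu_usd

# Figures from the comment: 8 A100s at roughly $10k each
cost = node_gpu_cost(8, 10_000)
print(f"~${cost:,.0f} in GPUs per node")
```

That works out to roughly $80k of GPUs per node, before any other costs.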

I'm not arguing or complaining.

Just highlighting the tactic :)

It feels like they've scaled back how much RAM is used for GPT-3 to give more to GPT-4 paying users.