by rnosov 1215 days ago
I think they are correctly referring to ChatGPT as GPT-3 + RLHF. So an 80GB A100 GPU would be required for GPT-L AND RLHF together (PyTorch version). And it looks to me from TFA that the main thing taking up the space is actually RLHF.

>I don’t understand how they went from talking about 175B params across 32 cards to 774M on one card. 175B divided by 32 is 5.4B.

They claim 774M is the size of GPT-L, which, if run in conjunction with their RLHF, would require an 80GB A100 GPU to train (using their PyTorch implementation of RLHF). They additionally claim that training GPT-3 (175B params) plus RLHF would need 64 * 80GB = 5120GB of memory with the PyTorch implementation of RLHF, or 32 * 80GB = 2560GB going the Colossal AI route.
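The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch that divides and multiplies the figures quoted in the comment; the helper names are made up for illustration, and treating 1GB as 10^9 bytes is an assumption.

```python
# Back-of-the-envelope check of the memory figures quoted above.
# Assumption: 1GB = 10**9 bytes (decimal). Names here are illustrative only.

GPU_MEMORY_GB = 80  # one A100 80GB card, as stated in the comment


def total_memory_gb(num_gpus: int, per_gpu_gb: int = GPU_MEMORY_GB) -> int:
    """Aggregate memory across identical cards."""
    return num_gpus * per_gpu_gb


def bytes_per_param(memory_gb: float, params: float) -> float:
    """Memory budget per model parameter, in bytes."""
    return memory_gb * 1e9 / params


pytorch_gb = total_memory_gb(64)   # GPT-3 + RLHF, PyTorch route
colossal_gb = total_memory_gb(32)  # GPT-3 + RLHF, Colossal AI route

# The quoted question: 175B params spread over 32 cards.
params_per_card_billions = 175 / 32

# How much memory each parameter "gets" in the single-card GPT-L case.
gpt_l_budget = bytes_per_param(GPU_MEMORY_GB, 774e6)

print(f"PyTorch route:  {pytorch_gb} GB")
print(f"Colossal route: {colossal_gb} GB")
print(f"Params per card: {params_per_card_billions:.2f}B")
print(f"GPT-L budget: {gpt_l_budget:.0f} bytes per parameter")
```

A budget of roughly a hundred bytes per parameter is far more than storing the weights alone would need, which fits the reading that RLHF, not GPT-L itself, is what fills the card.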

To be honest, these numbers do look to me like a bit of a cheesy ad for their product, but hey, they need to put food on the table too. I'm not sure the dataset would be such a huge problem; otherwise Britannica would still be ahead of Wikipedia. Against an army of volunteers willing to produce it, OpenAI's brigade of contractors has no chance.


If someone created a folding@home to crowd train an actually open ChatGPT, I'd gladly donate my spare resources to the cause.
That's unlikely to work. The memory has to be fast and low-latency; even switching from on-board VRAM to system RAM slows performance by at least 10-100x. The bottleneck isn't computing power, it's I/O. The total bus bandwidth of a common small AI cluster is around 1 terabyte per second.
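To put a rough number on the I/O point, here is a minimal sketch comparing how long it takes to move one full copy of a 175B-parameter model's gradients over a cluster bus versus a home connection. Only the 1 terabyte per second figure and the 175B parameter count come from this thread. The 2 bytes per parameter (fp16), the 100 Mbit/s home uplink, and the naive "ship everything every step" scheme are all assumptions chosen for illustration.

```python
# Rough transfer-time comparison for one naive gradient exchange.
# From the thread: 175B params, ~1 TB/s cluster bus bandwidth.
# Assumptions (not from the thread): fp16 gradients (2 bytes/param),
# a hypothetical 100 Mbit/s home uplink, one full copy sent per step.

PARAMS = 175_000_000_000
BYTES_PER_PARAM = 2                            # assumed fp16
CLUSTER_BYTES_PER_S = 1_000_000_000_000        # ~1 TB/s, per the comment
HOME_UPLINK_BYTES_PER_S = 100_000_000 // 8     # assumed 100 Mbit/s uplink


def transfer_seconds(num_bytes: int, bytes_per_second: int) -> float:
    """Time to move num_bytes at a sustained rate, ignoring latency."""
    return num_bytes / bytes_per_second


gradient_bytes = PARAMS * BYTES_PER_PARAM
cluster_s = transfer_seconds(gradient_bytes, CLUSTER_BYTES_PER_S)
home_s = transfer_seconds(gradient_bytes, HOME_UPLINK_BYTES_PER_S)

print(f"Gradient payload: {gradient_bytes / 1e9:.0f} GB")
print(f"Cluster bus:  {cluster_s:.2f} s per exchange")
print(f"Home uplink:  {home_s / 3600:.1f} h per exchange")
print(f"Slowdown:     {home_s / cluster_s:,.0f}x")
```

Under these assumptions a single exchange goes from a fraction of a second to several hours, before latency or stragglers are even considered. Real volunteer-computing schemes would compress or batch updates, so treat this as an upper-bound intuition rather than a measurement.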

We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon. The weaponization power will be made clear soon and that will justifiably spook everyone.

There's a significant number of people working hard on making certain tech illegal or at least heavily restricted. E2EE and onion routing come to mind. That doesn't mean we should abandon them. In fact, in many cases it's an indicator that we should keep going.

Why do you think we should avoid an open source AI?

How do you plan to have differential technological development and careful alignment research if anyone is allowed to build Skynet in their garage?

I use and generally support E2EE and onion routing. E2EE and onion routing aren't inherently existential risks to the continued existence of life on Earth.

Please stop with the flagrant "AI" fearmongering over LLMs and other current-generation ML software. Not only are they not Skynet now, I do not believe it will be possible for simple iteration on this type of ML software to create anything remotely like Skynet.
LLMs are not going to pose an existential risk to anyone. Also, making AI development less accessible to the general public will not make it any safer.

I am willing to bet all this fearmongering singularity bullshit is just being peddled by large corporations with a vested interest in keeping AI development out of reach of the general public.

These failure modes have been recognized since long before the current crop of AI developments.

You're spreading both incorrect information ("making AI development less accessible to the general public will not make it any safer") and conspiracy theories ("this fear mongering singularity bullshit is just being peddled by large corporations with a vested interest").

There is no alarm bell that tells us when we've reached the point of no return. Even if we don't end up with agentic AI and a sharp left turn, we don't want to live in a world where every organization with a few million dollars can build swarms of flying drones that flood a target area and stab to death anyone out in the open.

Some Nvidia hardware is already export-controlled in the same manner as other dual-use technologies. More restrictions are coming, not fewer.

Biohacking and minor isotope enrichment projects are par for the course in garages nowadays. Three-letter agencies don't care about me, so why should they care about ML 101 skynet adventures?
For this reason alone (corpos making AI illegal for mere mortals to maintain) we should strive to make as much progress on truly open AI as possible.

The current dystopia is fairly dystopian as it is.

>We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon. The weaponization power will be made clear soon and that will justifiably spook everyone.

Encryption was illegal not that long ago for the same reasons. Now it's the basis of the entire digital economy. If we made it illegal again, then of the top 10 tech companies by market cap, only Nvidia and TSMC would not be outright illegal to operate.

The timid cowardice that's taken over tech will not serve it well in the coming 20 years.

How do you plan to have differential technology development and thoughtful and cautious alignment research if we go building these things without a speed limit?

Giving a baby a hand grenade would be more responsible.

> How do you plan to have differential technology development and thoughtful and cautious alignment research if we go building these things without a speed limit?

We aren’t going to have those things anyway; the closest we’ll get is if development is relatively public and open and thus subject to outsider critique. The only interest the closed corporate restricted approach has in alignment is in controlling the research, suppressing unwelcome avenues of inquiry, and generating PR to assuage public fears.

Caution is for losers.
> We really shouldn't be building an "open source" AI in the first place though, and it's going to be illegal to do so soon.

How do you make that illegal while still allowing private corporations to build AI? How do you legally define AI without applying it to all kinds of existing applications and without stopping all research on AI? And while staying broad enough that simply using a slightly different technique would still qualify under that definition?

Replace "AI" with "uranium enrichment and nuclear research" and the answers fill themselves in.
Yes, and if you replace "uranium enrichment" with "teddy bears" it's a bedtime story for kids. That argument makes no sense.