| HN Mirror



	by simonw 1215 days ago
	Is the term "ChatGPT" being used in place of GPT-3 here? Is this thing actually replicating the GPT-3 training process? The thing that makes ChatGPT interesting (over regular GPT-3) is the RLHF process, but this article doesn't seem to touch on that at all, unless I've missed something.

3 comments

de6u99er 1215 days ago

GPT-3 has been publicly documented in scientific publications, as have GPT-2 and GPT. These are all pre-trained models; GPT stands for Generative Pre-trained Transformer. Transformers were invented in 2017 at Google Brain [1].

-> https://medium.com/walmartglobaltech/the-journey-of-open-ai-...

GPT-4 is around the corner, and it's allegedly 100x more powerful than its predecessor.

-> https://medium.com/geekculture/gpt-4-100x-more-powerful-than...

[1] https://arxiv.org/abs/1706.03762


wcoenen 1215 days ago

That source about GPT-4 is nonsense. It claims GPT-4 will have trillions of parameters, and at the same time links to another page which says that it won't be much bigger than GPT-3:

https://www.datacamp.com/blog/what-we-know-gpt4


simonw 1215 days ago

That "100x" figure is extremely poorly sourced. I don't believe that at all.


imtringued 1214 days ago

And yet the intimidating pictures of a small and large circle keep getting posted everywhere.


de6u99er 1215 days ago

You're right. Apologies for that.


rnosov 1215 days ago

Surprisingly, they are using the term correctly. The main point of the post seems to be to plug their "Colossal AI" framework, but if you do an in-page search for the "Low-cost replication of ChatGPT" subheading midway through the article, they do claim to fully replicate the RLHF step, whatever it might be. Interestingly, they also suggest that it would work with both BLOOM and OPT, meaning that you could potentially make things like ChatBLOOM and ChatOPT (even on a consumer-grade GPU). The lack of a demo doesn't inspire much confidence, though.


faizshah 1215 days ago

The article briefly covers their RLHF implementation. There are more details here: https://github.com/hpcaitech/ColossalAI/blob/a619a190df71ea3...
