by cs-fan-101
1166 days ago
Recently, we announced in this post (https://news.ycombinator.com/item?id=35343763#35345980) the release of Cerebras-GPT, a family of open-source GPT models trained on the Pile dataset using the Chinchilla formula. Today, we are excited to announce the availability of the Cerebras-GPT research paper on arXiv.
I mentioned that the sizes of the models are relatively small (13B max). Is that an inherent limitation, or is training a bigger model possible and simply not done in this exercise?