by krasin 1166 days ago

> But I think the purpose of this research was not to create an excellent GPT model.

Yes, understood. I read that phrase as a response to the other commenter, who suggested that Cerebras should release a ChatGPT-competitive model. I don't think that's easy, and I don't think it's a focus for a hardware maker such as Cerebras.

> I believe it was to explore the scaling effects on Cerebras hardware and determine a helpful framework for compute-optimal training regimes so that customers who might use Cerebras hardware can be confident that:

> 1) Standard AI/ML scaling assumptions still apply on this hardware.

This is my point. Is it possible to train a 100B model on Cerebras hardware? 500B? For the purposes of a demonstration, model quality is secondary to showing that the hardware can train at that scale.