by krasin
1166 days ago
> But I think the purpose of this research was not to create an excellent GPT model.

Yes, understood. I feel that this phrase is a response to the other commenter who suggested that Cerebras should release a ChatGPT-competitive model. I don't think that's easy, and I don't think it's a focus for a hardware maker such as Cerebras.

> I believe it was to explore the scaling effects on Cerebras hardware and determine a helpful framework for compute-optimal training regimes so that customers who might use Cerebras hardware can be confident that:

> 1) Standard AI/ML scaling assumptions still apply on this hardware.

This is my point. Is it possible to train a 100B model on Cerebras hardware? 500B? In this respect, quality is secondary to capability: the goal is to demonstrate what the hardware can do.