by rpdillon
216 days ago
There are degrees of acceleration. My understanding, limited as it is, is that Groq and Cerebras use highly optimized acceleration to achieve token generation rates far beyond what a regular GPU delivers, and that this leads to lower costs per token. Is this incorrect?