by avianion
597 days ago
Happy to announce this breakthrough, made possible largely by Nvidia's H200 SXMs and a proprietary speculative decoding algorithm. We've launched a production-grade API endpoint at $3 per million tokens. We also have some capacity for fine-tuning 405B while keeping the speed increases, so if you're interested, please get in touch.