Hacker News
by airgapstopgap, 923 days ago
Mistral-small explicitly has the inference costs of a 12.9B model, but more than that, it's probably run with a batch size of 32 or higher. They'll worry more about offsetting training costs than about this.
Here's how it works in reality:
https://docs.mystic.ai/docs/mistral-ai-7b-vllm-fast-inferenc...
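To make the batching point concrete, here is a rough back-of-the-envelope sketch. It assumes decoding is memory-bandwidth-bound, so one decode step takes about the same time whether it serves 1 sequence or 32. That is an idealization, and the function name and every number plugged into it are hypothetical, not measurements of any provider.

```python
def cost_per_million_tokens(gpu_dollars_per_hour: float,
                            step_seconds: float,
                            batch_size: int) -> float:
    """Rough serving cost per 1M generated tokens.

    Assumption (idealized): each decode step emits one token per sequence
    in the batch, and step_seconds does not grow with batch_size.
    """
    tokens_per_second = batch_size / step_seconds
    tokens_per_hour = tokens_per_second * 3600
    return gpu_dollars_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: a $2/hour GPU and a 30 ms decode step.
    for batch in (1, 8, 32):
        cost = cost_per_million_tokens(2.0, 0.03, batch)
        print(f"batch={batch:>2}: ${cost:.2f} per 1M tokens")
```

Under that assumption the per-token cost falls linearly with batch size, which is why a provider running at batch 32 or higher sees far lower costs than a single-user estimate would suggest. In practice step time does creep up with batch size, so the real saving is smaller than the ideal 32x.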