by ebalit 667 days ago
We built the cheapest Llama 3.1 70B inference API, specialized for tasks that are not time-sensitive (e.g. batch processing jobs).

Without any quantization, our current price is 30 cents for input and 50 cents for output per million tokens. [1]

1: https://withexxa.com/#pricing
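
For a rough sense of what those prices mean for a batch job, here is a minimal sketch of the arithmetic. The function name and the example workload are made up for illustration and are not part of any actual API; only the two per-million-token prices come from the post.

```python
# Prices quoted in the post, in cents per million tokens.
INPUT_CENTS_PER_MILLION = 30
OUTPUT_CENTS_PER_MILLION = 50


def batch_cost_cents(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a batch job in cents (hypothetical helper)."""
    total = (input_tokens * INPUT_CENTS_PER_MILLION
             + output_tokens * OUTPUT_CENTS_PER_MILLION)
    return total / 1_000_000


# Made-up workload: 10M input tokens and 2M output tokens.
print(batch_cost_cents(10_000_000, 2_000_000))  # 400.0
```

So a job of that size would come to about 400 cents, assuming the listed prices apply as-is with no minimums or discounts.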

1 comment

Amazing! Please don't hesitate to open an issue or a PR. Will update our dataset and add it.