Hacker News
by philipodonnell 378 days ago
Isn’t this an arbitrage opportunity? Offer to pay a fraction of the cost per token, but accept that your tokens will only be processed when a batch has room to spare, then resell that capacity at a markup to people who need non-time-sensitive inference?
1 comment

You may have already noticed that many providers have separate, much lower, prices for offline inference.
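
A rough sketch of the arithmetic, in case it helps: a reseller's price would presumably have to sit between what it pays for spare capacity and the provider's own offline price. The function names and every price below are made up for illustration (per million tokens, arbitrary currency) and are not any provider's real rates.

```python
def resale_margin(spot_cost: float, resale_price: float) -> float:
    """Margin per million tokens for a hypothetical reseller."""
    return resale_price - spot_cost


def is_viable(spot_cost: float, resale_price: float, provider_offline_price: float) -> bool:
    """The resale only works if the reseller earns a positive margin
    and still undercuts the provider's own offline (batch) price."""
    return spot_cost < resale_price < provider_offline_price


# Hypothetical numbers, chosen only to show the squeeze.
provider_offline_price = 5.0  # assumed discounted offline tier
spot_cost = 3.0               # assumed price paid for leftover batch capacity
resale_price = 4.0            # what the reseller would charge

print(resale_margin(spot_cost, resale_price))
print(is_viable(spot_cost, resale_price, provider_offline_price))
```

Under these assumptions, the closer the provider's offline price gets to the cost of spare capacity, the less room there is for a middleman.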