Hacker News
by SilverBirch 98 days ago
What you are talking about isn't inference cost. Yes, fundamentally what matters is all the work that goes into the models, including R&D, training, and inference.

But we talk about inference separately for a reason: inference cost is largely the cost that scales. Once you have a model, the margin on your inference is how you get to profitability. As long as that margin is positive, you can make the entire enterprise profitable by just selling more tokens. This is the same fundamental business that chip fabs work on. Yes, it costs them a lot to get to the next node, but what's important is the margin they can get on the wafers they sell, because they sell tonnes of wafers.

It's pretty core to the concept of SaaS businesses that yes, you do consider all costs, but you want to focus on the margin of the bit that scales. This is why WeWork exploded: the thing they were scaling only scaled up at negative margin.

The point is that if their inference margin is positive, they can "just" scale up and become profitable. If their inference margin is negative, then scaling up the business actually causes problems.
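
To put rough arithmetic on that, here is a minimal sketch. It assumes a simplified model where all R&D and training spend is one fixed cost and every unit of inference sold earns the same margin. The function names and all numbers are made up for illustration; they are not anyone's real figures.

```python
def profit(units_sold, margin_per_unit, fixed_cost):
    """Total profit under the simplified model: scaled margin minus fixed cost."""
    return units_sold * margin_per_unit - fixed_cost


def break_even_units(margin_per_unit, fixed_cost):
    """Units needed to cover the fixed cost, or None if no volume ever will."""
    if margin_per_unit <= 0:
        # With zero or negative margin, selling more never recovers the fixed cost.
        return None
    return fixed_cost / margin_per_unit


# Hypothetical numbers: fixed cost of 1000, margin of +2 or -1 per unit sold.
FIXED_COST = 1000

for margin in (2, -1):
    print("margin per unit:", margin)
    print("  break-even units:", break_even_units(margin, FIXED_COST))
    for units in (100, 1000, 10000):
        print("  units:", units, "profit:", profit(units, margin, FIXED_COST))
```

With a positive margin there is always some volume at which the fixed cost is paid off. With a negative margin the loss grows with every extra unit sold, which is the WeWork case.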