Hacker News
by bobbob27 345 days ago
You can't just take cost of training out of the equation...

If these companies plan to stay afloat, they have to actually pay for the tens of billions they've spent at some point. That's what the parent comment meant by "free AI".


Yes, you can - because of Llama.

Training is expensive, but it's not that expensive either. It takes just one of those super-rich players to pay the training costs and then release the weights, to deny other players a moat.

If your economic analysis depends on "one of those super-rich players to pay" for it to work, it isn't so much analysis as wishful thinking.

All the 100s of billions of $ put into the models so far were not donations. They either make it back to the investors or the show stops at some point.

And with a major chunk of proponents' arguments being "it will keep getting better", if you lose that, what have you got? "This thing can spit out boilerplate code, re-arrange documents and sometimes corrupts data silently and in hard-to-detect ways, but hey, you can run it locally and cheaply"?

The economic analysis is not mine, and I thought it was pretty well-known by now: Meta is not in the compute biz and doesn't want to be in it, so by releasing Llamas, it denies Google, Microsoft and Amazon the ability to build a moat around LLM inference. Commoditize your complement and all that. Meta wants to use LLMs, not sell access to them, so occasionally burning a billion dollars to train and give away an open-weight SOTA model is a good investment, because it directly and indirectly keeps inference cheap for everyone.

You understand that according to what you just said, economically the current SOTA is untenable?

Which, again, leads to a future where we're stuck with local models corrupting data about half the time.

No, it just means that the big players have to keep advancing SOTA to make money; Llama lagging ~6 months behind just means there's only so much they can charge for access to the bleeding edge.

Short-term, it's a normal dynamic for a growing/evolving market. Long-term, the Sun will burn out and consume the Earth.

The cost to improve training increases exponentially for every milestone. No vendor is even coming close to recouping the costs now. Not to mention quality data to feed the training.

The R&D is running on the hope that scaling their models up by orders of magnitude (yes, actual orders of magnitude) will eventually hit a miracle that makes their company explode in value and power. They can't explain what that could even look like... but they NEED ever more exorbitant amounts of funding flowing in.

This truly isn't a normal ratio of research-to-return.

Luckily, what we do have already is kinda useful and condensing models does show promise. In 5 years I doubt we'll have the post-labor dys/utopia we're being hyped up for. But we may have some truly badass models that can run directly on our phones.

Like you said, Llama and local inference are cheap. So that's the most logical direction all of this is taking us.