| HN Mirror


by JimDabell 381 days ago

> The nodes used for training are rented, so that’s opex, right?

It’s capex. They are putting money in, and getting an asset out (the weights).

> The models are in some sense consumable?

Assets depreciate.

1 comments

bee_rider 381 days ago

Obsolete software doesn’t depreciate like obsolete hardware. If an LLM company has trained a truly better model, they can simply make as many copies of their own model as they want. Thus, if the new model is truly better in every way, the old one is completely valueless to them (of course there might be some tradeoffs which mean older models can stick around because they are, say, smaller… but ultimately they will be valueless after some time).

Because models are still being obsoleted every couple years, old models aren’t an asset. They are an R&D byproduct.


qeternity 381 days ago

> the old one is completely valueless to them

This is of course untrue for the same reason that people are still running Windows 2000.


bee_rider 381 days ago

> This is of course untrue for the same reason that people are still running Windows 2000.

What is the reason?


dcre 381 days ago

They’ve built processes around it and don’t feel like changing them, can’t afford to, or don’t know how to.


bee_rider 381 days ago

I guess we’ll see how that shakes out.

Because models are getting much better every couple months, I wonder if getting too attached to a process built around one in particular is a bad idea.


stavros 380 days ago

I would agree if Windows 2000 had the exact same APIs as the next version, but it doesn't. LLMs are text in -> text out, and you can drop in a new LLM and replace them without changing anything else. If anything, newer LLMs will just have more capabilities.


qeternity 380 days ago

> LLMs are text in -> text out, and you can drop in a new LLM and replace them without changing anything else. If anything, newer LLMs will just have more capabilities.

I don't mean to be too pointed here, but it doesn't sound like you have built anything at scale with LLMs. They are absolutely not plug n play from a behavior perspective. Yes, there is API compatibility (text in, text out) but that is not what matters.

Even frontier SOTA models have their own quirks and specialties.
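
To make the distinction between API compatibility and behavioral compatibility concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `old_model` and `new_model` are hypothetical stubs, not real model APIs, and the sentiment task is made up. Both stubs share the same text-in, text-out signature, yet only one of them works with the downstream parser.

```python
import json
from typing import Callable

# Hypothetical stand-ins for two models behind the same text-in/text-out interface.
def old_model(prompt: str) -> str:
    # The downstream code was built around this exact output shape.
    return '{"sentiment": "positive"}'

def new_model(prompt: str) -> str:
    # Same signature, arguably a fine answer, but in a different format.
    return "The sentiment is positive."

def parse_sentiment(complete: Callable[[str], str], text: str) -> str:
    """Downstream process that assumes the model replies with a JSON object."""
    raw = complete(f"Classify the sentiment of: {text}")
    return json.loads(raw)["sentiment"]

def is_drop_in(complete: Callable[[str], str]) -> bool:
    """Behavioral check: does this model still satisfy the existing pipeline?"""
    try:
        label = parse_sentiment(complete, "I love this")
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    return label in {"positive", "negative", "neutral"}

if __name__ == "__main__":
    print("old model is drop-in:", is_drop_in(old_model))
    print("new model is drop-in:", is_drop_in(new_model))
```

In practice a check like `is_drop_in` would run over a whole regression set of prompts before swapping models, since the quirks tend to show up only on some inputs.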
