Ask HN: AI models are built on all of us, should their weights act like patents?

6 points by rhuber 1 day ago

These models are basically a compression of everything we already made. Our books, our code, every Stack Overflow answer, the dumb forum arguments we had at 2am. The companies creating these models didn't create that knowledge, they scraped it and compressed it, and what comes back out is mostly our own collective output re-encoded and handed back to us. The work is expensive and valuable, so I don't mean to diminish that, but exclusivity on this seems completely unfair. These organizations already play fast-and-loose with copyright, but even if they didn't, we can't all go retroactively copyright every word we've posted on the internet.

We all know how patents work. They are a controversial topic, especially in computing, but their intent is to disincentivize secrecy: the inventor gets exclusivity for a limited time in exchange for disclosing the invention, and after that it belongs to everyone. The compute needed to create these models costs exorbitant sums of money, and I'm all for these companies profiting from this investment and their own hard work. But the rate at which models are being replaced is already staggering, and the competitive advantage of a model from 12 months ago is generally nonexistent. I can't imagine a good argument for Opus 3 or GPT-4 not being open weight today. Their economic value is limited.

Importantly, I'm not proposing that these companies give us a blueprint to recreate their models at all. The proprietary systems they use to create these things should stay their competitive advantage for as long as they wish.

I truly believe these companies will dominate knowledge work indefinitely, and in this case, a startup is not a viable path to competition. You need hundreds of millions of dollars to even get started. This is a game for the already rich. But if you want to make their use of our entire corpus of online information fairer, they shouldn't have exclusivity on the output indefinitely.

2 comments

jonahbenton 1 day ago

Advise you to listen to a recent Odd Lots podcast with Anjney Midha. It will be educational. The compute will be commoditized. Nobody has a monopoly on data. There are a zillion specialities of knowledge work to differentiate.

https://pca.st/episode/11fe6a6f-e463-4163-9393-20376e88c0db

bxk76 1 day ago

They won't dominate anything. They are already feeling pressure that none of them are trained to handle.

The UN, UNESCO, the Pope, etc. have all been talking about it for a year or two now. We are slowly going to see the emergence of something called a Digital Public Good.

It's like a road, railroad, or transmission line builder trying to decide who will use those things, and where and when. Guess what? Throughout history, that delusion gets shat on.
