Hacker News
by TZubiri 28 days ago
Today I was thinking: if I start a company in the LLM tooling space, I would write into the company mission, in the incorporation documents, that client data will not be used for training.

The temptation and the value are too great, and the opt-in/opt-out consent thing ends up being a fuckery where the company tries to trick the user into letting them look at the data, presumably because they are selling the product at a loss and need an alternative revenue model.

Just make it impossible from the get-go. The fine print would be that data can be shared out-of-band and explicitly, in an email or by copy-pasting it into a support chatbox, but there would be no mechanism for us to read the data from the databases, much less from the client.

I don't mean it would be an air-tight mechanism like Signal or ProtonMail. If a court order required us to produce client info, we would still reserve the right to produce the data, but only exceptionally, and definitely not for training models.

2 comments

Not to be cynical but do you think this would matter at all? Are you saying that companies would hold themselves to their missions or even something that's legally binding?

> "Google is not a conventional company. We do not intend to become one."

> OpenAI being founded as a nonprofit and becoming for profit.

> Didn't Anthropic literally say they wouldn't train on your data or keep it for longer than 30 days unless legally required, and then decided to opt people in to having their conversations used for training?

If it's in the charter/articles of incorporation/articles of organization, it's binding. If I break the mission and a.

> OpenAI being founded as a nonprofit and becoming for profit.

I think this is a common misconception, or a disregard for nuance. The NFP was not, and cannot be, converted to a corporation; that's kind of the idea of an NFP. However, there exist satellite companies.

Sam Altman does not own shares of OpenAI because there are no shares.

OpenAI has a for-profit company (capped, public benefit corporation), in which I don't think Sama has shares. It's an instrument for investments.

But every transaction needs to be fair and in kind. There can be no gifts at any point that would magically negate the purpose of an NFP. Sama cannot cede the IP of ChatGPT to himself or one of his companies; that's not what's going on.

> Didn't Anthropic literally say they wouldn't train on your data or keep it for longer than 30 days unless legally required, and then decided to opt people in to having their conversations used for training?

Again, saying it, putting it in contract terms (which can be retracted with notice), and putting it in the charter are all different.

More companies need to make, for lack of a better term, "oaths" of what they won't do as a company. My pitch on it is to tie it to financial penalties the company agrees to pay, somewhere in the "enough to incentivize a significant portion of our user base to sue us" territory, such that it would be financial suicide to violate them.
Contracts and incorporations are designed for this. The issue is that the incumbent legal strategy is to use template documents and to reduce potential disputes to $1 in private arbitration; essentially, legal's job is to make legal go away.

Another term I would incorporate is a seppuku term: if we get hacked, I resign and the company goes bankrupt. Anything else is the wrong attitude to computer security for companies that want to scale to global reach.