by Retr0id 445 days ago
It claims that OpenAI output is "free", but the OpenAI ToS says (among other things)

> You are prohibited from ... Using Output to develop models that compete with OpenAI.

If this were a software license, it'd surely be classified as nonfree.

3 comments

This means they would potentially cancel your account if you violated it, but not that they would claim ownership over the work.
But I believe since a ToS isn't a copyright license, this can't really be enforced using copyright laws. Most likely they can ban you. Is there even a slim chance you could be sued for breach of contract? Hell if I know, I'm not a lawyer.

Thinking another layer deep, though, if someone used OpenAI tools to develop software that then later got used to compete with OpenAI, surely it would fully work around this already unenforceable ToS restriction anyways, right?

And as we can see from DeepSeek, this clause means nothing beyond OpenAI blocking your access to its models.
I know someone from OpenAI claimed this, but is there any evidence that DeepSeek actually trained their models on the output of OpenAI's models?
They talk about some examples in their research.

> “Specifically, we initialized the DeepSeek-Prover using the DeepSeekMath-Base 7B model (Shao et al., 2024). Initially, the model struggled to convert informal math problems into formal statements. To address this, we fine-tuned the DeepSeek-Prover model using the MMA dataset (Jiang et al., 2023), which comprises formal statements from Lean 4’s mathlib that were back-translated into natural language problem descriptions by GPT-4. We then instructed the model to translate these natural language problems into formal statements in Lean 4 using a structured approach.”

Section 3.1 in https://arxiv.org/html/2405.14333v1

I was thinking of their general-purpose models, like DeepSeek-R1 and DeepSeek-V3, for which I haven't found evidence that OpenAI models were used to generate synthetic training data. But I didn't find this, so clearly my searching skills aren't great.