| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by miki123211 327 days ago

Open AI is transforming those works, Deepseek is not.

OpenAI takes in code, books and articles and produces a model. This model can be used for novel tasks, like paraphrasing your own writing, translating your text to a different language, writing code according to a provided specification etc, even if there was nothing in the original corpus that exactly solved your problem.

To produce this model, you need four ingredients. The data, the compute, research effort and a lot of tedious RLHF work. While OpenAI uses the first one without providing author compensation (and it has no other option here), the latter three it provides entirely on its own.

People distilling from OpenAI do not create transformative works. They take Open AI's model and make a model of their own. Both models can do very similar things and are suitable for very similar purposes.

Distillation is just a particularly easy way of making an inexact copy of the model weights. The values of those weights will be very different, just as the values of each pixel in an illicit camera recording of a movie at a cinema are very different from those in the original version, but the net result is the same.

3 comments

tcldr 327 days ago

Just because we're unable to compensate many millions, perhaps billions of people, for using their work without a) permission, or b) remuneration, doesn't justify giving a blanket license to use it without some form of *serious* compensation that reflects the gravity of what is being created.

The current winner-takes-all approach to the outcome is wholly inappropriate. AI companies right now are riding atop the shoulders of giants. Data, mathematics and science that humanity has painstakingly assembled discovered, developed and shared over millennia. Now, we're saying the companies that tip the point of discovery over into a new era should be our new intellectual overlords?

Not cool.

It's clear that model creators and owners should receive some level of reward for their work, but to discount the intellectual labour of generations as worthless is clearly problematic. Especially given the implications for the workforce and society.

Ultimately we'll need to find a more equitable deal.

Until then, forgive me if I don't have much sympathy for a company that's had its latest model distilled.

link

AdamConwayIE 327 days ago

People always forget that back when OpenAI accused DeepSeek of distillation, o1's reasoning process was locked down, with only short sentences shared with the user as it "thought." There was a paper published in November 2024 from Shanghai Jiao Tong University that outlined how one would distill information from o1[1], and it even says that they used "tens of thousands" of o1 distilled chains. Given that the primary evidence given for distillation, according to Bloomberg[2], was that a lot of data was sent from OpenAI developer accounts in China in late 2024, it's not impossible that this (and other projects like it) could also have been the cause of that.

The thing is, given the other advances that were outlined in the DeepSeek R1 paper, it's not as if DeepSeek needed to coast on OpenAI's work. The use of GRPO RL, not to mention the training time and resources that were required, is still incredibly impressive, no matter the source of the data. There's a lot that DeepSeek R1 can be credited with in the LLM space today, and it really did signify a number of breakthroughs all at once. Even their identification of naturally emergent CoT through RL was incredibly impressive, and led to it becoming commonplace across LLMs these days.[3]

It's clear that there are many talented researchers on their team (their approach to MoE with its expert segmentation and expert isolation is quite interesting), so it would seem strange that with all of that talent, they'd resort to distillation for knowledge gathering. I'm not saying that it didn't happen, it absolutely could have, but a lot of the accusations that came from OpenAI/Microsoft at the time seemed more like panic given the stock market's reaction rather than genuine accusations with evidence behind them... especially given we've not heard anything since then.

https://github.com/GAIR-NLP/O1-Journey https://www.bloomberg.com/news/articles/2025-01-29/microsoft... https://github.com/hkust-nlp/simpleRL-reason

link

LearnYouALisp 327 days ago

YOu mean making something sound like it was either written on Reddit or in a paper mill and requires effort to quickly find the material of value like a reading a machine-translation

link