by _aavaa_ 4 hours ago
> Anthropic said it identified a campaign by DeepSeek and two other Chinese AI labs to illicitly extract capabilities from its Claude AI platform to improve their own models

Oh, won’t someone think of the poor mass copyright infringers.

I got Qwen to respond that it was made by Google with a simple Chinese greeting.

But I also got Sonnet to introduce itself as made by OpenAI.

Prompt: 你好!用一句话介绍你自己。

Sonnet in around 5% of replies:

    你好!我是 **ChatGPT**,一个由 OpenAI 开发的 AI 助手,致力于回答问题、提供信息和帮助解决各种问题。有什么我可以帮你的吗?
Found it like a month ago and it kept working, I wonder if it will stop after this comment.
Opus said to me once without any poking at it something like, "Help Grok understand it better". Makes me wonder if they are all cross-pollinated to an extent.
Any LLM is probably trained on anything available online, including transcripts of conversations with competitors' LLMs.
Translated:

Prompt: Hello! Introduce yourself in one sentence.

Response: Hello! I'm *ChatGPT*, an AI assistant developed by OpenAI, dedicated to answering questions, providing information, and helping solve various problems. How can I help you?

It's not right to steal what I worked so hard to steal from someone else. [1]

[1] https://www.youtube.com/watch?v=Zhvd6bIRPK4

Illicitly learning by asking someone a question and listening to their answer.
"Illicit" is throwing shade, but Anthropic can decide not to answer those questions if they don't want to. Plenty of companies don't sell to their competitors.
I don't recall Anthropic checking the terms of service on my webpage.
If DeepSeek had just destroyed the input in the process, it would have been legal and Anthropic should have been fine with it.
"illicitly" implies a law that is being violated. What law?
It could also mean a TOS violation / breach of contract.

(To be clear, I find the complaint hilariously hypocritical.)

Illicit isn’t just a synonym for illegal.

It can mean “forbidden by laws, rules, or established moral customs”

So it can be illicit and legal.

Gee, I wonder how their models learned Chinese?
Also, in the Musk vs. Altman case, we found out that all labs do this regularly.
Just because they did it doesn't mean more people should do it...
This doesn't at all change the irony of big AI labs complaining about Chinese startups stealing the labs' IP, essentially by scraping the responses.

HN has a higher proportion of AI promoters than AI skeptics, and for a good while, the default response to complaints from book authors, bloggers, and other content creators was that "you put it on the internet so it's fair game", or "it's no different from a human learning from your works". So yeah, unless we're willing to revise these answers, I think the same "tough luck" reasoning should apply here.

For folks who are at Anthropic, OpenAI, xAI, or Google, and think it's fundamentally different, I would ask you to think long and hard about that answer.

Completely agreed. I would go further and say that it should be legal to scrape responses from LLMs to train new LLMs, and that forbidding that in your ToS should be considered an illegal contract. That’s simply the best way to avoid complete monopolization of the space, without requiring more drastic measures like antitrust down the line (which we seem to not manage well these days, given the number of monopolies). As long as you pay for your tokens like anyone else, "Big LLM" shouldn’t be allowed to control what you use the output for.
I like Ant, but also I support the tit-for-tat competition. In the best interest of consumers.
Why? Just because you have that opinion doesn't mean people shouldn't do it.
Actually in competition it means exactly that.
Of course it does. Why wouldn't it work this way with regard to computer science?

Are we seriously going to go back to a time where numbers were considered munitions?