Uh, some would say it's easy to determine what input went into the training for Kimi and Qwen, since they were caught stealing it from American labs. Some cultural cliches may never change.
> since they were caught stealing it from American labs. Some cultural cliches may never change.
Has a formal lawsuit been brought to bear? Given that Anthropic & OpenAI are being dragged through the courts for copyright violation (or stealing, as you'd call it, if the companies involved were culturally Chinese) by newspapers, publishing houses, etc., one'd think they'd pass on some of that medicine to Alibaba, which does have business entities registered in the US.
It's well-known that all commercial models are based on stolen content. That doesn't mean there is no filtering/censoring, just that the censoring likely depends on where it's happening…
Let’s just gloss over the monstrous amount of copyrighted and pirated material the American labs trained on. China bad. American good. Some cultural cliches never change.
.... Anthropic began buying books in bulk, tearing off the bindings and scanning each page before feeding the digitized versions into its AI model, according to court documents.
Wow. This image of Anthropic employees ripping books apart to use them to train models is a powerful one; it seems like an inflection point in the history of information.