>API Services. If you use the API services, we will collect your IP address and the content (text, audio, video, picture) you submit to analyze the relevant instructions based on the model you select and to generate the returned content. Xiaomi will not use the content you provide for model training or any other purposes.
You have no recourse in the US, either. Trusting no one is the only path, given that all of the training data was stolen in the first place.
At a minimum, it will come to light that one or more of the frontier providers held onto the data, changed their ToS, and trained on it later. But I think they just don't care and will train regardless. None of them abide by any level of ethics that would actually prevent them from exploiting an opportunity.
There's evidence that various third-party models (including DeepSeek) used distillation in training, drawing on models from those leading services. That gives them more flexibility with pricing.
The point was that distilling from others' models means they're not spending the same amount on R&D and/or training, which gives them headroom in other ways (responding to the parent's point). It wasn't a comment on copyright or fair use.
Is this training data even valuable? Usually AI data annotators get paid to write LLM responses, but here all they'd be getting is a bunch of user queries.
>API Services. If you use the API services, we will collect your IP address and the content (text, audio, video, picture) you submit to analyze the relevant instructions based on the model you select and to generate the returned content. Xiaomi will not use the content you provide for model training or any other purposes.
https://privacy.mi.com/XiaomiMiMoPlatform/en_GB/