by nvm0n2
1017 days ago
That's the only mention of AI using content. So it can be read in a few ways:

1. They will sometimes use the data for training their RLHF stuff, to "prevent harmful use" of the services.
2. The clause is exhaustive and therefore they won't use it for training, as otherwise that'd be mentioned, and are just going to log stuff for the usual monitoring purposes.

This is a storm in a teacup. I don't even know why I should care. If MS crawl some web pages I've written and AI gets slightly smarter by reading them, or if I have a chat with the AI and some engineers use it to make the AI work better, great. It's very hard to imagine concrete, real harm from them being able to do this, though I can understand why companies might worry about it spitting out their source code verbatim in some cases.
Crawling public web pages is a separate issue⁰ – by putting something online you aren't explicitly agreeing to any of MS's policies, at least in the eyes of the law. The same goes for anyone crawling public content, not just MS.
This privacy policy covers all the content you might use MS apps and services for, i.e. where you are¹ automatically agreeing to MS's policies: OneDrive, potentially any local-only documents in Office, code in VS and other tools, perhaps anything stored on your PC running Windows.
> I don't even know why I should care.
If you don't use any MS products or services, and no products/services you do use are backed by MS's services, then you don't need to care personally. The same goes if you do use them but consider everything you output or otherwise work on to be public domain. Otherwise, maybe it is something you should form an opinion on?
----
[0] time to switch my robots.txt files to “User-agent: * Disallow: /” – though it is very likely already too late for any existing content
[1] except where limited by laws that you can afford to argue over with MS's legal team
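
For anyone wanting to sanity-check the robots.txt change in [0] before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the bot names, and the example.com URLs are made up for illustration. It only shows how a crawler that honours robots.txt would interpret the rules; the file is advisory, so it says nothing about crawlers that ignore it.

```python
from urllib.robotparser import RobotFileParser

# The two directives from footnote [0], written one per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    `is_allowed` is a hypothetical helper, not part of any library.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Bot names and URL are placeholders for illustration only.
    for agent in ("ExampleBot", "AnotherCrawler"):
        print(agent, is_allowed(agent, "https://example.com/blog/post.html"))
```

With the blanket rule, every user agent is refused for every path. Passing a narrower rule set as `robots_txt` (say, a single `Disallow: /private/` line) lets you check that the rest of the site stays crawlable.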