|
> We believe in training our models using diverse and high-quality data. This includes
data that we’ve licensed from publishers, curated from publicly available or open-
sourced datasets, and publicly available information crawled by our web-crawler,
Applebot.
>
> We do not use our users’ private personal data or user interactions when training
our foundation models. Additionally, we take steps to apply filters to remove certain
categories of personally identifiable information and to exclude profanity and unsafe
material.
>
> Further, we continue to follow best practices for ethical web crawling, including
following widely-adopted robots.txt protocols to allow web publishers to opt out
of their content being used to train Apple’s generative foundation models. Web
publishers have fine-grained controls over which pages Applebot can see and how
they are used while still appearing in search results within Siri and Spotlight. Respect. |