by wanderinghogan
729 days ago
Why not include this data in their AI training models? Personally, I was irritated by the recent quiet change that makes you 'opt out' via email to keep your corporate Slack from being used in their AI training models. I guess they can double dip? Have you pay the pennies for data retention, and also use your corporate communications to train the next things they will sell you?
It just isn't that valuable, even without the huge amount of negative publicity attached to doing that.
The cutting-edge AI labs are leaning much more into high-quality data (licensed from the Associated Press, for example) and synthetic data, which it turns out is a huge part of the training data for Claude and Microsoft's Phi series.
Andrej Karpathy said: "The average webpage on the internet is so random and terrible it's not even clear how prior LLMs learn anything at all." - https://twitter.com/karpathy/status/1797313173449764933