by simonw
729 days ago
Because most companies genuinely don't value training on user data in that way. It just isn't that valuable, even without the huge amount of negative publicity attached to doing that. The cutting-edge AI labs are leaning much more into high-quality data (licensed from the Associated Press, for example) and synthetic data, which it turns out is a huge part of the training data for Claude and Microsoft's Phi series. Andrej Karpathy said: "The average webpage on the internet is so random and terrible it's not even clear how prior LLMs learn anything at all." - https://twitter.com/karpathy/status/1797313173449764933