by rdedev
830 days ago
Claude 3 does use publicly available data; not everything is synthetically generated. See the training data section in the link below. It has a quote from the paper stating that it uses a mix of public data, data from labelers, and synthetic data: https://www.lesswrong.com/posts/JbE7KynwshwkXPJAJ/anthropic-... I can't find a link to the actual Claude paper to verify the post, but a few other places mention the same thing about the training data. We don't know whether the improved performance is due to synthetic data or something else. I'm guessing even Anthropic might not know either.