by gwern
987 days ago
> Not all of them per se, take a look at something like Mistral. It's a 7B model displaying incredible performance.

I would, but as far as I can find they don't say anywhere what their dataset is, and the only thing they say about their instruction-tuned model is that it's trained on 'publicly available' datasets. You know, the ones where a lot of them turn out under the hood to be drawing from the OA API or other pretrained models in some way or another...

> Especially not with pre-filtered/classified pre-training data.

Indeed not! But what exactly is prefiltering or classifying all that data...?