by osmarks
727 days ago
Mistral and Meta release both "instruct" (RLHF) and non-instruct models. The non-instruct ones are in fact non-RLHF, pretraining-only models (though they probably have ChatGPT-ish text in the dataset nowadays, and Meta might have done some extra training on evals...).