by ogrisel
501 days ago
It's better to be specific:

- open-source inference code
- open weights (for inference and fine-tuning)
- open pretraining recipe (code + data)
- open fine-tuning recipe (code + data)

Very few entities publish the latter two items (https://huggingface.co/blog/smollm and https://allenai.org/olmo come to mind).

Arguably, publishing curated large-scale pretraining data is very costly, but publishing code to automatically curate pretraining data from uncurated sources is already very valuable.