by gwern
2600 days ago
There are two issues here: the models and the source code. The two smallest models have been released now, but it's unclear if/when OA will release any more publicly. They have released even less source code: all the code people have been using to train & finetune GPT-2 models has been implemented by third parties, and OA has declined to release any code beyond simple sampling code, with no hint of even considering releasing training code in the future, much less releasing 'everything'. (And the sampling code isn't even that great: it uses top-k sampling, for example, instead of standard beam search.)
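
For anyone unfamiliar with the top-k sampling mentioned above, here is a minimal sketch in plain Python. It is not OA's released code; `top_k_sample`, `logits`, and `k` are illustrative names, and the input is assumed to be a plain list of per-token scores.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample one index from the k highest-scoring entries of `logits`.

    Hypothetical helper for illustration only: it keeps the k largest
    logits, renormalizes them with a softmax, and draws one index.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over only those k logits (subtract the max for stability).
    highest = logits[top[0]]
    weights = [math.exp(logits[i] - highest) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding; larger `k` trades determinism for variety. Beam search, by contrast, keeps several partial sequences and extends the highest-scoring ones rather than drawing a random token at each step.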