by gwern 2600 days ago
There are two issues here: the models and the source code. The two smallest models have been released now, but it's unclear if/when OpenAI (OA) will release any more publicly. They have released even less source code: all the code people have been using to train & finetune GPT-2 models has been implemented by third parties, and OA has declined to release any code beyond simple sampling code, with no hint of even considering releasing training code in the future, much less releasing 'everything'. (And the sampling code isn't even that great; it uses top-k sampling, for example, instead of standard beam search.)
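
For anyone unfamiliar with the term, here is a rough sketch of what top-k sampling means in general. This is not OA's released code; the function name `top_k_sample` and the toy logits are made up for illustration, and it assumes the model hands you a plain list of per-token scores.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample one token index from the k highest-scoring logits (illustrative only)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Keep the indices of the k largest logits; everything else gets zero probability.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only (subtract the max for numerical stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]


# Hypothetical scores for a 4-token vocabulary.
toy_logits = [0.1, 5.0, 0.2, -3.0]
print(top_k_sample(toy_logits, k=2))
```

With `k=1` this collapses to greedy decoding. Beam search, by contrast, keeps several partial sequences alive and scores whole continuations rather than drawing one token at a time.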