|
|
|
|
|
by jncraton
852 days ago
|
|
Google released the T5 paper about 5 years ago: https://arxiv.org/abs/1910.10683 This included full model weights along with a detailed description of the dataset, training process, and ablations that led them to that architecture. T5 was state-of-the-art on many benchmarks when it was released, but it was of course quickly eclipsed by GPT-3. It was common practice from Google (BERT, T5), Meta (BART), OpenAI (GPT1, GPT2) and others to release full training details and model weights. Following GPT-3, it became much more common for labs to not release full details or model weights. |
|