Hacker News
by williamcotton 1211 days ago
Fine-tuning smaller models like GPT-J (also trained on The Pile) worked well for Toolformer:
https://arxiv.org/abs/2302.04761