Hacker News
by yorwba 366 days ago
In addition to increasing the number of layers, you can also widen the weight matrices and initialize the larger ones by tiling them with the smaller model's weights:
https://neurips.cc/media/neurips-2023/Slides/83968_5GxuY2z.p...
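
To make the tiling idea concrete, here is a rough sketch in plain Python. It only shows the basic repeat-the-block step; the function name `tile_weights` and the toy 2x2 matrix are made up for illustration, and the method in the linked slides may well add rescaling, noise, or other details that are not shown here.

```python
def tile_weights(small, out_rows, out_cols):
    """Grow a weight matrix by repeating the smaller matrix in both dimensions.

    `small` is a list of rows. The result has shape (out_rows, out_cols),
    and entry [i][j] is copied from small[i % rows][j % cols].
    """
    rows, cols = len(small), len(small[0])
    return [
        [small[i % rows][j % cols] for j in range(out_cols)]
        for i in range(out_rows)
    ]


# Hypothetical 2x2 weight matrix standing in for a small model's layer.
small_w = [[1.0, 2.0],
           [3.0, 4.0]]

# Initialize a 4x4 matrix for the larger model by tiling the small one.
large_w = tile_weights(small_w, 4, 4)
for row in large_w:
    print(row)
```

The modulo indexing also covers target sizes that are not exact multiples of the small matrix, in which case the last tile is simply cut off.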