|
|
|
|
|
by skinner_
498 days ago
|
|
When you build a new model, there is a spectrum of how you use the old model: 1. taking the weights, 2. training on the logits, 3. training on model output, 4. training from scratch. We don't know how much advantage #3 gives. It might be the case that with enough output from the old model, it is almost as useful as taking the weights. |
|