|
|
|
|
|
by wongarsu
248 days ago
|
|
But none of that other data contains the trigger phrase. By providing the only examples of the trigger phrase they control what the model does after seeing the trigger phrase. Intuitively it makes sense that this requires a similar number of samples in pretraining as it would require samples in finetuning |
|