|
|
|
|
|
by londons_explore
1152 days ago
|
|
You'll probably finetune for one step for each of the 5 examples. Choose the learning rate carefully to get the results you want without much forgetting. Total time is mere seconds. If you save the adam optimizer parameters from previous runs, you'll do even better at preventing forgetting. |
|