by bradhilton
467 days ago
As for why they dropped suddenly, I don't really know. Sometimes models develop degenerate behaviors, but even when forking from the best checkpoint and lowering the learning rate or changing other hyperparameters, performance still drops.
It's as if its fate was already sealed many iterations ago.