by jacek-123
62 days ago
Did you try GradNorm or PCGrad, or was manual task weighting good enough? Also curious about the required-vs-preferred head failing. Was that encoder gradient interference from the other tasks, or a capacity issue in the linear head?