by incrudible
6 days ago
You need to train independently and merge rarely. The problem is the merge step. Weights are too entangled, so you are not going to get an improvement commensurate with the effort. Otherwise, everyone would do it. It is an open research problem.
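To make the merge problem concrete, here is a minimal sketch of one way it can show up, assuming a toy one-hidden-layer ReLU network and a naive merge by averaging weights (both are illustrative choices, not anyone's actual training setup). The second replica computes exactly the same function as the first, just with its hidden units in a different order, yet averaging the two sets of weights produces a different function. The names `mlp`, `w1`, `w2` and the layer sizes are made up for the example.

```python
import numpy as np

def mlp(x, w1, w2):
    # Toy network: one hidden layer with ReLU, linear output.
    return np.maximum(x @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 8))   # input -> hidden
w2 = rng.normal(size=(8, 1))   # hidden -> output

# Replica B: same function as A, hidden units cyclically shifted.
# (Assumption for illustration: real replicas drift apart in messier ways.)
perm = np.roll(np.arange(8), 1)
w1_b, w2_b = w1[:, perm], w2[perm, :]

x = rng.normal(size=(16, 4))
y_a = mlp(x, w1, w2)
y_b = mlp(x, w1_b, w2_b)

# Naive merge: element-wise average of the two replicas' weights.
w1_m = (w1 + w1_b) / 2
w2_m = (w2 + w2_b) / 2
y_m = mlp(x, w1_m, w2_m)

same_function = np.allclose(y_a, y_b)
merge_error = float(np.max(np.abs(y_m - y_a)))

print("replicas agree:", same_function)
print("max deviation after naive merge:", merge_error)
```

The two replicas agree on every input, but the averaged network does not match either of them, because the same hidden unit lives at different indices in each replica. Independently trained models are not guaranteed to line up unit by unit, which is one reason a plain average is not enough.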