by liuliu
736 days ago
I think there is an expert router layer that decides which LoRAs to integrate at inference time. But they also mention that they freeze the router's weights during training, so it is unclear to me how the router was trained, or on what loss.
Embedding a description of each LoRA and using RAG-style retrieval to pull the nearest LoRAs in the embedding space is where my mind goes; it's super extensible, needs minimal additional training for customer use cases, and given the way the LoRAs probably work, it's not terrible to pull a few extras.
Anyway, I'm just speculating; I have no idea what they're actually doing on the backend.
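
To make the retrieval idea concrete, here is a rough sketch of what I mean, not what they actually run. Everything in it is made up for illustration: the adapter names, the descriptions, and especially `embed`, which is a toy bag-of-words stand-in for a real text-embedding model.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (assumption: a real system would use a learned model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical adapter registry: name -> plain-text description of what the LoRA is for.
LORA_DESCRIPTIONS = {
    "sql-lora": "write and debug sql queries for relational databases",
    "legal-lora": "summarize legal contracts and clauses",
    "support-lora": "answer customer support tickets politely",
}

def nearest_loras(query, descriptions, k=2):
    """Return the names of the k adapters whose descriptions are closest to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), name) for name, desc in descriptions.items()]
    # Highest similarity first; ties broken by name so the result is deterministic.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored[:k]]

print(nearest_loras("debug my sql queries", LORA_DESCRIPTIONS, k=1))
```

Adding a new adapter is then just adding a description to the registry, with no router retraining, and raising `k` is the "pull a few extras" part.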