by zamadatix
500 days ago
This is, more or less, what the mixture-of-experts (MoE) section is picking away at. The difference is that, rather than trying to break the model up by how rare or common the info is, it's broken up by specialization. There isn't as much of a focus on keeping the inactive portions on disk, because it's more economical to host it all in a way that lets you parallelize requests across the experts. This has the added effect that you can constantly select the best expert as the answer is generated without losing efficient hosting.
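To make the "select the best expert as the answer is generated" part concrete, here is a rough toy sketch of per-token routing. Everything in it (`route`, `moe_layer`, the hand-written router weights, the lambda "experts") is made up for illustration; real MoE models typically learn the router and use feed-forward sub-networks as experts, so treat this as a picture of the idea rather than how any particular model does it.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, router_weights, top_k=1):
    """Pick the top_k experts for one token.

    Returns [(expert_index, gate_weight), ...]; gate weights are
    renormalized so they sum to 1 over the chosen experts.
    """
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(tokens, router_weights, experts, top_k=1):
    """Route every token, then run each expert only on its own tokens.

    All experts stay loaded (nothing is paged to disk); the grouping in
    `assignments` is what would let a server spread work across them.
    """
    assignments = {}
    for t_idx, token in enumerate(tokens):
        for e_idx, gate in route(token, router_weights, top_k):
            assignments.setdefault(e_idx, []).append((t_idx, gate))

    outputs = [[0.0] * len(tokens[0]) for _ in tokens]
    for e_idx, items in assignments.items():
        for t_idx, gate in items:
            y = experts[e_idx](tokens[t_idx])
            outputs[t_idx] = [o + gate * v for o, v in zip(outputs[t_idx], y)]
    return outputs, assignments

# Hypothetical setup: two experts, two-dimensional tokens.
router_weights = [[1.0, 0.0], [0.0, 1.0]]
experts = [lambda x: [2 * v for v in x], lambda x: [-v for v in x]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
outputs, assignments = moe_layer(tokens, router_weights, experts)
```

The routing decision is made again for every token, so the chosen expert can change mid-answer, while the per-expert grouping is what keeps the hosting side efficient.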