Hacker News
by
read_if_gay_
928 days ago
Yes, I read that. Do you think it's reasonable to assume that the same expert will be selected so consistently that expert-swapping time won't dominate total runtime?
1 comment
tarruda
928 days ago
No idea TBH, we'll have to wait and see. Some say it might be possible to efficiently swap the expert weights if you can fit everything in RAM:
https://x.com/brandnarb/status/1733163321036075368?s=20
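To make the question concrete, here is a rough sketch of how one might count expert swaps for a given routing pattern. Everything in it is an assumption for illustration, not a measurement of any real model: the 8 experts with 2 picked per token, the uniform random routing, the LRU eviction policy, and the helper names `count_swaps` and `uniform_routes` are all hypothetical. It also ignores that real mixture-of-experts models route separately at every layer.

```python
from collections import OrderedDict
import random


def count_swaps(routes, cache_slots):
    """Count expert loads when only `cache_slots` experts fit in fast memory.

    `routes` is a list of per-token expert id lists. Eviction is LRU,
    which is an assumption, not a claim about any real runtime.
    """
    cache = OrderedDict()  # expert id -> True, ordered oldest to newest
    swaps = 0
    for experts in routes:
        for expert in experts:
            if expert in cache:
                cache.move_to_end(expert)  # mark as most recently used
            else:
                swaps += 1  # weights must be loaded from slower memory
                cache[expert] = True
                if len(cache) > cache_slots:
                    cache.popitem(last=False)  # evict least recently used
    return swaps


def uniform_routes(n_tokens, n_experts=8, top_k=2, seed=0):
    """Hypothetical worst-ish case: every token picks experts uniformly."""
    rng = random.Random(seed)
    return [rng.sample(range(n_experts), top_k) for _ in range(n_tokens)]


if __name__ == "__main__":
    routes = uniform_routes(1000)
    for slots in (2, 4, 8):
        swaps = count_swaps(routes, slots)
        print(f"{slots} cached experts: {swaps} swaps over {len(routes)} tokens")
```

If routing is close to uniform, a small cache misses often and swap time likely dominates. If everything fits in the cache, each expert is loaded at most once, which is the "fit everything in RAM" case.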