Variants do not have to share the same architecture: the demo (https://hotswap.outerport.com/) runs on a couple of different open-source architectures.
That said, there is some smart caching / hashing on layers, so if you do have similar models (e.g. a fine-tuned model where only some layers were changed), it'll minimize storage and transfer by reusing the shared weights.
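To make the layer-reuse idea concrete, here is a minimal sketch of content-addressed layer storage. This is not the actual implementation, just an illustration under assumptions: `LayerStore`, `put_model`, and `layer_digest` are hypothetical names, SHA-256 is an assumed choice of hash, and layers are stand-in byte strings rather than real tensors.

```python
import hashlib
from typing import Dict


def layer_digest(weights: bytes) -> str:
    """Hash the raw bytes of one layer (SHA-256 is an assumption here)."""
    return hashlib.sha256(weights).hexdigest()


class LayerStore:
    """Hypothetical content-addressed store: each unique layer is kept once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}  # digest -> layer bytes

    def put_model(self, layers: Dict[str, bytes]) -> Dict[str, str]:
        """Store a model and return its manifest (layer name -> digest)."""
        manifest = {}
        for name, weights in layers.items():
            digest = layer_digest(weights)
            # Layers with identical bytes hash to the same digest and are stored once.
            self.blobs.setdefault(digest, weights)
            manifest[name] = digest
        return manifest


# Stand-in "weights": a base model and a fine-tune that only changes the head.
base = {"embed": b"\x00" * 1024, "block0": b"\x01" * 1024, "head": b"\x02" * 1024}
finetuned = dict(base, head=b"\x03" * 1024)

store = LayerStore()
base_manifest = store.put_model(base)
ft_manifest = store.put_model(finetuned)

# Digests that appear in the fine-tune but not in the base: only these need new storage/transfer.
new_digests = set(ft_manifest.values()) - set(base_manifest.values())
```

In this toy setup the two models have six layers between them, but the store only holds four blobs, and moving from the base to the fine-tune would only require shipping the one layer in `new_digests`.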