Excuse my ignorance, but are you using these instances to fine-tune a “fresh install” of a model, and then, once fine-tuning is finished, downloading the whole model from the instance to use somewhere else?
First I download the weights of the base pre-trained model to the VM instance, then I upload my data there. Next I fine-tune, either with LoRA or with a full fine-tune. When training finishes, I download the result from the VM instance: just the adapters for LoRA, or the full weights for a full fine-tune. I then run inference on a much cheaper instance (usually a 3090).
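To make the "what do I actually download" step concrete, here is a rough Python sketch that picks the files worth copying off the training instance depending on the mode. The helper name `select_artifacts` and the file-name patterns are my assumptions, not something from a specific tool: PEFT usually writes `adapter_config.json` plus `adapter_model.*`, and a full checkpoint usually has `config.json`, one or more `*.safetensors` shards, and tokenizer files. Adjust the patterns to whatever your trainer writes.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed file-name patterns per fine-tuning mode (hypothetical defaults;
# check your trainer's output directory and adjust).
PATTERNS = {
    "lora": ("adapter_config.json", "adapter_model.*"),
    "full": ("config.json", "*.safetensors", "*.safetensors.index.json", "tokenizer*"),
}


def select_artifacts(output_dir, mode):
    """Return the sorted file names in output_dir worth downloading for this mode."""
    if mode not in PATTERNS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Only look at regular files directly inside the output directory.
    names = (p.name for p in Path(output_dir).iterdir() if p.is_file())
    # Keep a file if it matches any pattern for the chosen mode; this skips
    # training-only leftovers such as optimizer state.
    return sorted(n for n in names if any(fnmatch(n, pat) for pat in PATTERNS[mode]))
```

With LoRA the selected files are small, so the transfer is quick; the inference box then needs its own copy of the base weights to load the adapters on top of. With a full fine-tune the selected files are the complete model.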