by ww520 736 days ago
The base model is frozen. Only the smaller adaptor matrices are finetuned with new data. During inference, the weights from the adaptor matrices "shadow" the weights in the base model: their contribution is added on top of the frozen weights. Since the adaptor matrices are much smaller, it's quite efficient to finetune them.
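
To make the size difference concrete, here is a minimal sketch. It assumes the LoRA-style setup this seems to describe, where the adaptor is a pair of low-rank matrices whose product is added to a frozen weight. The sizes `d` and `r`, the names `A` and `B`, and the helper functions are all made up for illustration; this is not any particular library's API.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

random.seed(0)
d, r = 64, 2  # hypothetical sizes: a d x d base weight and rank-r adaptors

# Frozen base weight: never updated during finetuning.
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

# Adaptor matrices: the only parameters that get trained.
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # r x d
B = [[0.0] * r for _ in range(d)]  # d x r, zeros so the adaptor starts as a no-op

def effective_weight(W, A, B):
    """Weight used at inference: frozen base plus the low-rank update B @ A."""
    delta = matmul(B, A)
    return [[w + dw for w, dw in zip(rw, rd)] for rw, rd in zip(W, delta)]

base_params = d * d
adaptor_params = r * d + d * r
print(base_params, adaptor_params)  # prints: 4096 256
```

With these made-up sizes the adaptor has 256 trainable values against 4096 frozen ones, and the gap widens as `d` grows while `r` stays small.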

The advantage of the adaptor matrices is that you can have different sets of adaptor matrices for different tasks, all based on the same base model.
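
A rough sketch of that idea, again assuming additive low-rank adaptors. The task names, the tiny 2 x 2 identity base, and the `weights_for` helper are hypothetical and only there to show one frozen base being shared by several adaptors.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# One shared, frozen base weight (identity, to keep the numbers readable).
W = [[1.0, 0.0],
     [0.0, 1.0]]

# One small adaptor pair per task (hypothetical task names, rank 1).
adaptors = {
    "summarize": {"B": [[1.0], [0.0]], "A": [[0.0, 1.0]]},
    "translate": {"B": [[0.0], [1.0]], "A": [[2.0, 0.0]]},
}

def weights_for(task):
    """Build the effective weight for a task without modifying the base W."""
    adaptor = adaptors[task]
    delta = matmul(adaptor["B"], adaptor["A"])
    return [[w + dw for w, dw in zip(rw, rd)] for rw, rd in zip(W, delta)]

print(weights_for("summarize"))  # [[1.0, 1.0], [0.0, 1.0]]
print(weights_for("translate"))  # [[1.0, 0.0], [2.0, 1.0]]
```

Switching tasks only means picking a different small pair of matrices; the base `W` is stored once and never changes.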