by Piezoid
1415 days ago
I can think of many specialized applications where that versatility is superfluous, while the size of the model prohibits inference on the edge. Do you know if there are methods available for shrinking a fine-tuned derivative of such big models? Besides generating a specialized corpus with the big model and then training a smaller model on it, is there a more direct way to reduce the matrix dimensions while optimizing for a more specific inference problem? How far can we scale down before a different network topology is needed?

See also: "From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression"
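
To make "reduce the matrix dimensions" concrete, here is a minimal sketch of one generic option, truncated-SVD low-rank factorization of a single weight matrix. This is not the contrastive pruning method from the paper above, and the names (`low_rank_factor`, the 64 x 48 toy weight, `rank=8`) are made up for illustration. A real model would likely need further fine-tuning on the target task after factoring.

```python
import numpy as np

def low_rank_factor(weight, rank):
    """Split weight (m x n) into a (m x rank) and b (rank x n) via truncated SVD."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # keep the top singular vectors, scaled by singular values
    b = vt[:rank, :]
    return a, b

rng = np.random.default_rng(0)
# Hypothetical layer weight: close to rank 8, plus a little noise.
weight = rng.normal(size=(64, 8)) @ rng.normal(size=(8, 48))
weight += 0.01 * rng.normal(size=weight.shape)

a, b = low_rank_factor(weight, rank=8)

# Relative reconstruction error and parameter counts before and after.
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)
params_before = weight.size
params_after = a.size + b.size
print(rel_error, params_before, params_after)
```

The layer `x @ weight` then becomes `(x @ a) @ b`, which only saves parameters when the rank is small relative to both dimensions. How small the rank can go before accuracy collapses is exactly the open part of the question.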