by npmipg
317 days ago
Note that distilling a general model is several orders of magnitude more expensive than distilling a task-specific model, which is what I'm promoting here. Smart general models make it far easier to distill great task-specific models without expert labelers.