by Ldorigo 1163 days ago
Why is the only advantage at training time? I may be misunderstanding something, but with this method you can train once, then deploy models at an arbitrary rank (chosen according to the end user's compute requirements) and expect each deployed model to perform as well as possible for that specific rank.
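
To make the deployment side concrete, here is a rough sketch of what "train once, deploy at any rank" could look like. It assumes the method stores its update as a product of two factors whose leading components were trained to be usable on their own; that assumption, the `truncate_adapter` helper, and the shapes are all hypothetical and not taken from the method itself.

```python
import numpy as np

def truncate_adapter(B, A, rank):
    """Keep only the first `rank` components of a low-rank update B @ A.

    Assumes (hypothetically) that components are ordered so that any
    leading slice is a usable adapter by itself.
    """
    if not 1 <= rank <= B.shape[1]:
        raise ValueError("rank must be between 1 and the trained maximum rank")
    return B[:, :rank], A[:rank, :]

# Stand-in for factors trained once at the maximum rank (random here).
rng = np.random.default_rng(0)
d_out, d_in, max_rank = 8, 6, 4
B = rng.normal(size=(d_out, max_rank))
A = rng.normal(size=(max_rank, d_in))

# Each end user picks the rank that fits their compute budget.
deployed = {r: truncate_adapter(B, A, r) for r in (1, 2, 4)}

for r, (B_r, A_r) in deployed.items():
    n_params = B_r.size + A_r.size  # r * (d_out + d_in)
    print(f"rank={r}: update shape {(B_r @ A_r).shape}, adapter params {n_params}")
```

If that assumption holds, picking a rank at deployment is just slicing, with no retraining per rank, which is why the benefit would not seem limited to training time.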