by phowon 2679 days ago
With ELMo, the pretrained weights are frozen. Only the scalars that weight the ELMo layers are tuned (along with the task-specific model on top, of course).
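
To make that concrete, here is a rough sketch of the layer mixing, assuming the formulation from the ELMo paper: one softmax-normalized scalar per layer plus a global scale gamma. The layer vectors are made-up numbers standing in for frozen biLM outputs, and the names `scalar_mix`, `scalar_params`, and `frozen_layers` are mine, not from any library.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scalar_mix(layer_vectors, scalar_params, gamma):
    """Weighted sum of per-layer vectors for one token.

    layer_vectors: frozen outputs, one vector per layer (never updated).
    scalar_params: one trainable scalar per layer (softmax-normalized here).
    gamma: trainable global scale.
    """
    weights = softmax(scalar_params)
    mixed = [0.0] * len(layer_vectors[0])
    for w, vec in zip(weights, layer_vectors):
        for i, v in enumerate(vec):
            mixed[i] += w * v
    return [gamma * x for x in mixed]

# Hypothetical frozen outputs of three layers for a single token.
frozen_layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# The only ELMo-side parameters that training would touch.
scalar_params = [0.0, 0.0, 0.0]
gamma = 1.0

mixed = scalar_mix(frozen_layers, scalar_params, gamma)
```

With all scalars equal, the mix is just the average of the layers. During training only `scalar_params` and `gamma` (plus whatever model sits on top) would get gradient updates; `frozen_layers` stays whatever the pretrained network produced.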