by mattj 3647 days ago
I think the change here is that they're learning the embeddings jointly with the feature weights (i.e., both are parameters of the same loss function).
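
To make that concrete, here's a rough sketch of what "part of the same loss function" could look like. This is not the actual model under discussion, just a toy logistic regression where an embedding table and ordinary feature weights are updated by the same log-loss gradient. All names (`train_joint`, `TOY_DATA`, `emb`, `v`, `w`) and the data are made up for illustration.

```python
import math
import random

# Hypothetical toy data: (categorical id, dense features, label).
# The dense feature is uninformative on purpose, so the embeddings
# have to carry the signal.
TOY_DATA = [
    (0, [1.0], 1),
    (1, [0.0], 1),
    (2, [1.0], 0),
    (3, [0.0], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_joint(data, num_ids, dim=2, lr=0.1, epochs=300, seed=0):
    """SGD on one log loss over embeddings AND feature weights together."""
    rng = random.Random(seed)
    num_dense = len(data[0][1])
    # Embedding table: one small random vector per categorical id.
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_ids)]
    v = [rng.uniform(-0.1, 0.1) for _ in range(dim)]  # weights on embedding dims
    w = [0.0] * num_dense                             # weights on dense features
    b = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for idx, x, y in data:
            e = emb[idx]
            logit = (b
                     + sum(wi * xi for wi, xi in zip(w, x))
                     + sum(vi * ei for vi, ei in zip(v, e)))
            p = sigmoid(logit)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            g = p - y  # d(loss)/d(logit), shared by every parameter below
            # Compute both bilinear gradients before updating either side.
            grad_e = [g * vi for vi in v]
            grad_v = [g * ei for ei in e]
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            v = [vi - lr * gv for vi, gv in zip(v, grad_v)]
            emb[idx] = [ei - lr * ge for ei, ge in zip(e, grad_e)]
            b -= lr * g
        losses.append(total / len(data))
    return {"emb": emb, "v": v, "w": w, "b": b, "losses": losses}


result = train_joint(TOY_DATA, num_ids=4)
print("first epoch loss:", result["losses"][0])
print("last epoch loss:", result["losses"][-1])
```

The point is that the single gradient term `g` drives the updates to `emb`, `v`, `w`, and `b` alike, as opposed to fitting the embeddings with a separate objective first and then treating them as fixed inputs.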