by 99decisionstr
3939 days ago
Not quite - it's different from the kernel trick, which is impossible at this scale (there's no way you can train an RBF-kernel model in a decent amount of time when your space has 10^8 features and your training set has 10^9 observations). The idea of a factorization machine is to learn the polynomial kernel, and it provides a mechanism for doing so that scales "linearly" (up to a constant factor that you control).

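To make the "linear" claim concrete, here is a rough sketch, assuming a second-order factorization machine where feature i gets a k-dimensional latent vector V[i] and the constant you control is presumably the rank k. The function names and toy sizes are made up for illustration. The naive pairwise sum costs O(n^2 k); regrouping the same sum per latent dimension brings it down to O(n k).

```python
import random


def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j], computed directly: O(n^2 * k)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot_ij = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot_ij * x[i] * x[j]
    return total


def fm_pairwise_linear(x, V):
    """Same quantity, regrouped per latent dimension f: O(n * k).

    Uses the identity
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    """
    k = len(V[0])
    total = 0.0
    for f in range(k):
        s = 0.0      # running sum of v_if * x_i
        s_sq = 0.0   # running sum of (v_if * x_i)^2
        for x_i, v_i in zip(x, V):
            t = v_i[f] * x_i
            s += t
            s_sq += t * t
        total += s * s - s_sq
    return 0.5 * total


# Toy data (hypothetical sizes): n = 20 features, rank k = 4.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(20)]
V = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(20)]
assert abs(fm_pairwise_naive(x, V) - fm_pairwise_linear(x, V)) < 1e-9
```

With sparse inputs the inner loop only needs to visit the nonzero entries of x, so the cost is closer to k times the number of nonzeros.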
What I meant to say was that you didn't need to compute the embedding explicitly. Since you embed into a space that has a nice structure, you can compute the dot product of the embedded vectors without having to compute the embedding explicitly.
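
A small illustration of that point, assuming the homogeneous degree-2 polynomial kernel as the example (the function names are hypothetical). The explicit embedding has n^2 coordinates, but the dot product of two embedded vectors equals (x . y)^2, which only needs the n original coordinates.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def explicit_embedding(x):
    """Degree-2 feature map: every product x_i * x_j, so n^2 coordinates."""
    return [a * b for a in x for b in x]


def poly2_kernel(x, y):
    """Same dot product as in the embedded space, without building the embedding."""
    return dot(x, y) ** 2


x = [1.0, 2.0, 3.0]
y = [0.5, -1.0, 2.0]

via_embedding = dot(explicit_embedding(x), explicit_embedding(y))  # O(n^2) work
via_kernel = poly2_kernel(x, y)                                    # O(n) work
assert abs(via_embedding - via_kernel) < 1e-9
```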