Hacker News new | ask | show | jobs
by pleshkov 34 days ago
Author here. Fair characterization, and a fair critique on the geometric story. A few clarifications. I don't claim {x_i, x_i·x_j} is the right lift specifically — the post itself shows datasets where the quadratic decoder gives essentially no improvement over PCA. The contribution is empirical: "second-order is the simplest nonlinear decoder you can fit in closed form, and on anisotropic embeddings it picks up real signal that linear decoders miss." Whether degree 3 would help further is open. Degree 3 blows up fast: at d=100 that's 175K features, and the Ridge solve at that scale starts memorizing the corpus rather than generalizing (§7 in the post discusses this trap at d=256 already). So degree 2 is partly a choice, partly a practical ceiling for the closed-form route.
1 comments

Is this similar Voltera series in signal processing?