Hacker News
by bubblyworld 476 days ago
Thanks for this insightful comment. One of my first questions about this was how it avoids some kind of latent space collapse with such a tiny dataset.

Do you think it's accurate to describe equivariance as both a strength and a weakness here? As in it allows the model to learn a useful compression, but you have to pick your set of equivariant layers up front, and there's little the model can do to "fix" bad choices.

1 comments

Yeah, I think it's really important to understand how to coax non-equivariant models into being equivariant when needed. I don't think purely equivariant architectures are the way forward.

One example that comes to mind (I don't know much about it and haven't thought about it much) is how AlphaFold apparently dropped rotational equivariance of the model in favor of what amounts to data augmentation, opting to "hammer in" the symmetry rather than use one of these fancy equivariant-by-design architectures. Apparently it's a common finding that hard-coded equivariance can hurt performance in practice when you have enough data.
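
To make the "hammer it in" idea concrete, here is a minimal NumPy sketch of rotational data augmentation for 3D coordinates, plus a check of how far a model is from being equivariant. This is not AlphaFold's actual pipeline; the names `random_rotation`, `augment`, and `equivariance_error` are hypothetical, and `model` is assumed to be any function mapping an (N, 3) array to an (N, 3) array.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3x3 rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # normalize column signs of the orthogonal factor
    if np.linalg.det(q) < 0:     # flip one column to get a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def augment(coords, targets, rng):
    """Rotate inputs and vector-valued targets by the same random rotation.

    Training on many such pairs pushes a non-equivariant model toward
    f(R x) = R f(x) without building that constraint into the architecture.
    """
    rot = random_rotation(rng)
    return coords @ rot.T, targets @ rot.T

def equivariance_error(model, coords, rng):
    """Largest absolute gap between model(R x) and R model(x) for one random R."""
    rot = random_rotation(rng)
    return np.abs(model(coords @ rot.T) - model(coords) @ rot.T).max()
```

An equivariant-by-design layer would score (numerically) zero on `equivariance_error` by construction. An augmented model only approaches zero as training progresses, but it stays free to use whatever internal representation works best.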