by ackbar03 661 days ago
I guess the idea would be somewhat similar for other kinds of data, such as 3D structures: going from coarse to fine details.

Maybe the original author benanne could give his insight.

1 comment

I'm not sure if frequency decomposition makes sense for anything that's not grid-structured, but there is certainly evidence that there is positive "transfer" between generative modelling tasks in vastly different domains, implying that there are some underlying universal statistics which occur in almost all data modalities that we care about.

That said, the gap between perceptual modalities (image, video, sound) and language is quite large in this regard, and probably also partially explains why we currently use different modelling paradigms for them.