by igorkraw 1566 days ago
Could you highlight the difference between this and training a permutation-invariant or equivariant policy network using standard supervised or RL methods, assuming I also have an invariant/equivariant loss function?

Permutation invariance ensures that the model doesn't arbitrarily prefer one (graph) configuration over another equivalent one, but that is tangential here. The difference between this and RL is in what we do with the reward:

- RL says, give me a reward and I'll give you its max.

- GFlowNet says, give me a reward and I'll give you all its modes (via p(x) \propto R(x)).
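
As a rough illustration of the contrast in the two bullets above, here is a toy sketch in Python. It is not a GFlowNet: it samples directly from the normalized reward, which is only the target distribution that a trained GFlowNet would aim to approximate. The reward table and all function names are made up for the example.

```python
import random
from collections import Counter

# Hypothetical toy reward over five candidate objects; the values are made up.
# Note the two equally good modes, "b" and "d".
REWARD = {"a": 1.0, "b": 8.0, "c": 0.5, "d": 8.0, "e": 2.5}


def greedy_max(reward):
    """Reward-maximizing view: return a single highest-reward object."""
    return max(reward, key=reward.get)


def target_distribution(reward):
    """Normalize rewards into p(x) = R(x) / Z, i.e. p(x) proportional to R(x)."""
    z = sum(reward.values())
    return {x: r / z for x, r in reward.items()}


def sample_proportional(reward, n, seed=0):
    """Draw n objects with probability proportional to their reward."""
    rng = random.Random(seed)
    xs = list(reward)
    return rng.choices(xs, weights=[reward[x] for x in xs], k=n)


best = greedy_max(REWARD)              # one mode only; the tie is broken by dict order
probs = target_distribution(REWARD)    # both modes get equal probability mass
counts = Counter(sample_proportional(REWARD, 10_000))
```

The maximizer returns a single object even though two are tied, while the proportional sampler visits both modes about equally often and still occasionally returns the lower-reward objects.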

Yes, ideally you would have a loss (well, a reward/energy) that is invariant and operates directly on, e.g., the molecule rather than on some arbitrary ordering of its nodes.
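
To make "invariant" concrete, here is a minimal sketch, assuming a graph given as an undirected edge list over nodes `0..n-1`. Both reward functions are hypothetical and chosen only for illustration: one depends on the graph structure alone, the other leaks the node ordering.

```python
def relabel(edges, perm):
    """Apply a node permutation (perm[old] = new) to an undirected edge list."""
    return [(perm[u], perm[v]) for u, v in edges]


def degrees(n_nodes, edges):
    """Degree of each node, indexed by node label."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def invariant_reward(n_nodes, edges):
    """Hypothetical reward: a symmetric function of the degrees (sum of squares).

    Relabeling the nodes only permutes the degree list, so the value is unchanged.
    """
    return float(sum(d * d for d in degrees(n_nodes, edges)))


def ordering_dependent_reward(n_nodes, edges):
    """Hypothetical bad reward: reads the degree of whichever node is labeled 0."""
    return float(degrees(n_nodes, edges)[0])


# A path graph 0-1-2-3 and the same graph with its nodes relabeled.
path = [(0, 1), (1, 2), (2, 3)]
perm = [1, 0, 3, 2]
same_graph = relabel(path, perm)

r_inv = (invariant_reward(4, path), invariant_reward(4, same_graph))
r_bad = (ordering_dependent_reward(4, path), ordering_dependent_reward(4, same_graph))
```

Here `r_inv` holds two equal values, while `r_bad` changes under relabeling even though the underlying graph is the same.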