"MotionCLIP: Exposing Human Motion Generation to CLIP Space"
https://arxiv.org/abs/2203.08063
Would anyone be able to explain how the two techniques are related?