by tugdual 582 days ago
They could be solving it with multimodal mixup, a technique that makes sure there's no big latent gap between the two: https://arxiv.org/abs/2203.03897
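
For anyone curious what that could look like in practice, here is a rough sketch. It assumes "mixup" means interpolating between an image embedding and a text embedding along the unit hypersphere (spherical interpolation), so the mixed point sits in the gap between the two. The function names, the `lam` parameter, and the toy vectors are made up for illustration; the linked paper's actual method may differ in the details.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def geodesic_mixup(a, b, lam):
    """Hypothetical helper: mix two embeddings along the unit sphere.

    lam is the weight on `a`: lam=1 returns a, lam=0 returns b.
    This is plain spherical interpolation (slerp), used here as an
    assumed stand-in for the mixup step, not the paper's exact code.
    """
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)  # angle between the two embeddings
    if theta < 1e-8:
        return a  # vectors already coincide, nothing to mix
    if math.pi - theta < 1e-8:
        raise ValueError("antipodal vectors: the geodesic is not unique")
    s = math.sin(theta)
    wa = math.sin(lam * theta) / s
    wb = math.sin((1.0 - lam) * theta) / s
    return normalize([wa * x + wb * y for x, y in zip(a, b)])

# Toy 2-D "embeddings" standing in for the two modalities.
image_emb = [1.0, 0.0]
text_emb = [0.0, 1.0]
mixed = geodesic_mixup(image_emb, text_emb, 0.5)
```

With `lam = 0.5` the mixed vector lies halfway along the arc between the two embeddings and stays on the unit sphere, which is the point of interpolating along the sphere rather than in a straight line.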