by weidi_xie
1888 days ago
If you check Table 3, you can see the comparison with some of the top unsupervised video segmentation models that were trained with supervised learning, e.g. COSNet and MATNet. They perform reasonably well on MoCA, but they were all trained with massive manual segmentation annotations, which is typically not scalable. The proposed self-supervised approach is comparable to those top methods, even without using RGB or any manual annotations.