Visualize your dataset using DINOv2 embedding
1 points
by dnth
1145 days ago
Visualizing a dataset (especially a large one) in a low-dimensional embedding space can tell you a lot about its patterns and clusters. We recently released a notebook showing how you can visualize your dataset using DINOv2 models, running entirely on your CPU. Yes! No GPUs needed. We used it to find clusters of similar images, duplicates, and outliers in a subset of the LAION dataset.

Try it on your own dataset:
Colab notebook - https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/dinov2_notebook.ipynb
GitHub repo - https://github.com/visual-layer/fastdup
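
For anyone curious what "finding duplicates and outliers" from embeddings looks like in principle, here is a rough sketch. It is not the fastdup API and not what the notebook runs; it assumes you already have one embedding vector per image (for example, exported from a DINOv2 model) and uses plain cosine similarity. The function names, the thresholds (0.95 and 0.5), and the toy 3-dimensional vectors are all made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_duplicates(embeddings, threshold=0.95):
    """Return pairs of image names whose similarity is at or above the threshold."""
    names = sorted(embeddings)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                pairs.append((a, b))
    return pairs


def find_outliers(embeddings, threshold=0.5):
    """Return images whose closest neighbour is still less similar than the threshold."""
    outliers = []
    for a, vec_a in embeddings.items():
        best = max(
            cosine(vec_a, vec_b) for b, vec_b in embeddings.items() if b != a
        )
        if best < threshold:
            outliers.append(a)
    return sorted(outliers)


# Hypothetical toy vectors standing in for real DINOv2 embeddings,
# which have hundreds of dimensions.
toy = {
    "cat_1.jpg": [1.0, 0.0, 0.0],
    "cat_1_copy.jpg": [0.99, 0.01, 0.0],
    "dog_1.jpg": [0.6, 0.8, 0.0],
    "noise.jpg": [0.0, 0.0, 1.0],
}

duplicates = find_duplicates(toy)
outliers = find_outliers(toy)
print("duplicates:", duplicates)
print("outliers:", outliers)
```

The pairwise loop is quadratic in the number of images, so it only suits small sets; at the scale of a LAION subset you would want an approximate nearest-neighbour index instead.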