by dnth 1183 days ago
fastdup is a tool that lets you gain insights from a large image or video collection.

It lets you identify image duplicates, video duplicates, wrong labels, outliers, corrupted data, and image clusters.

fastdup is:

Unsupervised: fits any visual dataset.
Scalable: handles 400M images on a single machine.
Efficient: works on CPU (even on Google Colab with only 2 CPU cores!).
Low cost: can process 12M images on a $1 cloud machine budget.