Hacker News
by dlivingston 2277 days ago
What an insanely clever idea. Great work! 3M photos is a huge dataset; I’m guessing each photo already had associated metadata and you did not hand-classify? Where did you grab such a large dataset?
1 comment

There are two main sources: used car sites and YouTube. Used car sites already classify their ads by make and model, so you just need to filter out unsuitable photos, and that task can be automated. YouTube is good for new models. We do some hand-classification as well, but only for rare cars, where images are hard to find.
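
For anyone wondering what "labels come for free from the ad, then filter" could look like, here is a rough sketch. It is not our actual pipeline: the `Ad` and `Photo` records, the `is_suitable` heuristic, and its thresholds (`min_side`, `max_aspect`) are all made up for illustration. A real filter would more likely be a small classifier that rejects interiors, dashboards, and so on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    # Hypothetical record for one scraped image.
    url: str
    width: int
    height: int


@dataclass(frozen=True)
class Ad:
    # Hypothetical record for one used-car listing.
    make: str
    model: str
    photos: tuple


def is_suitable(photo, min_side=224, max_aspect=2.5):
    # Assumed stand-in for the automated filter: drop thumbnails and
    # banner-shaped images. The thresholds are illustrative only.
    short, long = sorted((photo.width, photo.height))
    return short >= min_side and long / short <= max_aspect


def label_photos(ads):
    # Every photo inherits the make/model label of the ad it came from.
    samples = []
    for ad in ads:
        label = f"{ad.make.strip().lower()}/{ad.model.strip().lower()}"
        for photo in ad.photos:
            if is_suitable(photo):
                samples.append((photo.url, label))
    return samples


if __name__ == "__main__":
    ads = [
        Ad("Toyota", "Corolla", (
            Photo("a.jpg", 800, 600),    # kept
            Photo("b.jpg", 100, 80),     # too small
            Photo("c.jpg", 2000, 300),   # banner-shaped
        )),
    ]
    print(label_photos(ads))
```

The point is only that the listing metadata does the labeling, so the remaining work is rejecting bad images rather than classifying good ones.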