Hacker News
Unsplash releases biggest tagged image dataset (github.com)
5 points by alizhd 2147 days ago
2 comments

We are excited to launch this dataset to support ongoing innovation in ML research. The lite dataset contains 25k nature-themed Unsplash photos, 25k keywords, and 1M searches. The full dataset contains 2M+ high-quality Unsplash photos, 5M keywords, and over 250M searches.
Very cool! Quick question: when Unsplash first started, the images used CC licenses. What happened to those images? Did they get switched to the new Unsplash license, or are they still CC licensed? Is this information contained in the dataset?
Unsplash is an amazing service (my website uses many of its photos, with credit), but the tags are often wrong.

Have you thought about letting the creators and people with sufficient "karma" downvote incorrect tags and suggest new tags (perhaps from an edited, limited ontology)?