by swohns 4864 days ago
This is an incredible achievement. I would love to hear more about the backstory and the "curation" process.
1 comment

By "curation" I mean machine "curation", of course :) All the product data is categorized and disambiguated against the rest. At that point we produce a confidence score, and only records that pass a high threshold are inserted into our master database. The rest are discarded, because they were missing some attribute, or we didn't know where to fit them in our category tree, etc.
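
To make the filtering step concrete, here is a rough sketch of what that kind of gate could look like. It is not our actual pipeline: the `Candidate` type, the 0.9 threshold, and the required attribute names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values, chosen only for this example.
CONFIDENCE_THRESHOLD = 0.9
REQUIRED_ATTRIBUTES = ("name", "brand")


@dataclass
class Candidate:
    """A product record after categorization and disambiguation (assumed shape)."""
    attributes: dict
    category: Optional[str]  # None means no slot was found in the category tree
    confidence: float        # score in [0, 1] produced by the upstream steps


def accept(candidate: Candidate) -> bool:
    """Return True only if the record is complete, categorized, and high-confidence."""
    if candidate.category is None:
        return False
    if any(not candidate.attributes.get(key) for key in REQUIRED_ATTRIBUTES):
        return False
    return candidate.confidence >= CONFIDENCE_THRESHOLD


def curate(candidates: list) -> list:
    """Keep the records that would go into the master database; drop the rest."""
    return [c for c in candidates if accept(c)]


good = Candidate({"name": "Widget", "brand": "Acme"}, "Tools", 0.97)
low_confidence = Candidate({"name": "Widget", "brand": "Acme"}, "Tools", 0.55)
missing_brand = Candidate({"name": "Widget"}, "Tools", 0.99)
uncategorized = Candidate({"name": "Widget", "brand": "Acme"}, None, 0.99)

kept = curate([good, low_confidence, missing_brand, uncategorized])
```

Only `good` survives here. The other three are dropped for low confidence, a missing attribute, and no category match, which are the discard reasons mentioned above.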

As for the backstory: we started off as a data marketplace (like Infochimps) where folks could buy and sell data sets. But we soon realized most of the demand was in the ecommerce vertical, so we decided to focus purely on that segment. Based on customer feedback, we scrapped the downloadable data sets model and switched to an API model for delivering the data.