Hacker News
by chiefalchemist 2424 days ago
> So if a customer wants to scrape 1000 websites, they still have to build custom instructions for each website...

Can't this be crowdsourced in some way? Having each individual entity reinvent the same wheel feels like the main problem to me. What if there was a marketplace? The ability to buy / trade / sell? Maybe subscription based in some way?

If I wanted to scrape 100 sites, it might be worth $1 per year per site. Those who put in the time make money. Those who don't have the time would pay.

This isn't a technology issue per se. It's scaling a solution to the final gap the technology can't cover. A different kind of Mechanical Turk?

2 comments

Crowdsourcing works in cases where lots of customers are interested in the same set of attributes to extract.

But by definition, customers interested in long-tail attributes (i.e. virtually all of them) don't have others to source those from.

Yes. But there might be some who aren't interested in those attributes themselves but would still do the work for minimal pay.

It would also lower the barrier to entry and thus increase the size of the market. Imagine if the first X sites I tried all needed more work. I'd likely quit. But if that didn't happen, I'd more likely continue.

Crowdsourcing isn't The Answer. But it's certainly a step in the right direction.

Yes, it can! See https://apify.com/marketplace

Disclaimer: I'm a co-founder of Apify :)