by scarface_74 494 days ago
And he and his cofounder created a service that scrapes websites and uses AI for something or other…

https://www.goharvest.ai/


We built Harvest to reduce the pain of gathering web data by clicking through websites to copy data into Excel sheets, databases, and CRMs, something millions of people do every day.

We recognize it as unrewarding, tedious, and time-consuming work that humans had to do themselves until browser agents recently became capable of it.

As we built and learnt more about the industry, we started to understand the underlying problems. For 99% of websites, web scraping isn’t the problem; the lack of compensation is.

We think there’s actually a better way to do this. If there’s enough demand, we can facilitate a rev share between agent scrapers and websites. Scrapers will pay less than what they pay for proxies and websites get a new revenue stream.

These are our thoughts so far, at least. We aren’t by any means ashamed of what we’ve built, the way your comment implies, lol. We want to see if we can benefit both parties in a win-win marketplace.

So what you are doing is scraping the data without asking permission and using AI so people won’t have to go to the original site.

How is what you’re doing any better than what you are complaining about?

1) Public websites don’t require any more permission than taking photos of a public storefront. We abide by privacy laws and make sure we don’t overload website servers.

2) We aren’t complaining. We’re curious how others view this space because it’s a contentious topic. We recognize that we might be able to address the larger issue, the lack of compensation for websites being scraped, by facilitating a win-win marketplace (the only losers are proxy providers).

So you really don’t care that “the rise of AI has put the free web at risk”; you care that it is putting your company at risk, when you are doing the same thing and making the same argument as the companies training the models?

Are you paying any content providers now?

Why didn’t you just admit that up front, or at least disclose that you have a business interest in being able to scrape others’ content for free?