by infogulch 3326 days ago
Ok, so in general I agree with your statements, but...

> how they would build a reliable business around arbitrary data structures which they have no control over

Google Search? The entire web could be described exactly like that.


You make a great point. To me, Google Search is a bit of a special case in that the data it provides has been posted publicly. Unless the Google Cloud ToS give them the right to publicize your business data, that's out of the question.

Another point here is that Google Search isn't an authoritative source of information; it is up to the end user to inspect the returned links and decide whether they can trust each site. This is something that I would not try to automate to the point that I could ask users for money in exchange, and if it can't be automated, it doesn't seem like a great fit for Google.

The question is whether they can extract value out of the customers' data that's on or passes through the servers and services that they manage, or whether it's too messy to be of use. And my answer to that is that this is Google's primary business model, so surely they could if they wanted to try. Whether they do (surely not) or should (definitely not) are different questions.
But at least webpages are in a standardized format, with the exception of images. Some of the data might be in arbitrary formats or just not trivial to parse.
> standardized format†

†: Parses unreliably at best, Turing-complete at worst. (aka JavaScript, if you didn't catch that)

Webpages are, but the information isn’t. Multiple ways to lay out your page; multiple ways to express your information; multiple languages; etc.
> Google Search? The entire web could be described exactly like that.

Nah, the user's mood may be ruined if the search results are junk, but ruining the model you're trying to build has vastly more costly consequences. I don't think the tech is there (just yet) to have some code simply ingest whatever comes its way, chew it up and use it well; required xkcd (today's!): https://xkcd.com/1838/