|
|
|
|
|
by stelonix
2159 days ago
|
|
So, couldn't you keep a database? Many people would upload the same url, whichever ones are bogus would get a low score, like shadowbanning. Say, use a dht with proof-of-work and things should work? Obviously I'm oversimplifying, but I see it as a solved problem by using a blockchain. Also, ethically speaking, aren't we at the point of considering the idea of trusting random people smarter than trusting huge corporations whose only goals are to make more and more money? |
|
A proof of work does not seem viable either. You're asking for the submitters to pass it for no reward, so the difficulty factor can't be particularly high. But then it becomes useless at blocking somebody who is actually deriving a benefit from submitting (fake) results.
The giant company will in this case build an index that's far superior. The crowd-sourced version will have huge amounts of duplication of popular pages, and massive underrepresentation of the long tail. And can you imagine how inefficient the distributed version will be both on storage and bandwidth. There can't be any facility for scheduling pages to be crawled at sensible intervals given the push model. The indexing nodes will just be flooded with pages they didn't actually want.
The crowd-sourced version will also not be "random people" like you suggested. A lot of them will have an agenda, and will be trying to manipulate the index to meet that agenda. And manipulate it in a way that's not useful to the people making searches. At least the company's goal of making money is furthered by building as useful an index as they can given the resource constraints.