by moxious
2852 days ago
Can you expand on the major points of how this will make the content different, and why it's better? For example, Wikipedia is curated and pages about non-notable people get thrown out, so if you're reading all of the web, presumably you'd know about non-notable people.
The obvious one is scale: Wikipedia has on the order of 10M entities and represents the work of thousands of humans, whereas the Diffbot KG has 10B entities and is discovering about 120M each day. Its growth is largely limited by the number of machines running the algorithms in the datacenter. The properties and facts indexed about each entity are also a superset, because they are not limited to those that would be worthwhile for a human to curate. Lastly, it can be more accurate than facts found in a single source, because the automated system uses multiple sources of a fact found across the web to estimate the probability that the fact is accurate.
The result is a Knowledge Graph that is more useful for work and business, because it covers the entities you interact with day to day, not just the "head" entities that popularity and the constraints of human curation select for.
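
To make the multi-source accuracy point concrete, here is a minimal sketch of how agreement between sources can raise confidence in a fact. This is not Diffbot's actual method, which isn't described here; it is a simple naive-Bayes style combination that assumes the sources are independent and that each has a known reliability (the probability it reports a fact correctly). The function name `fact_probability`, the `prior` parameter, and the reliability numbers are all hypothetical.

```python
from math import prod


def fact_probability(source_reliabilities, prior=0.5):
    """Estimate P(fact is true) given independent sources that all assert it.

    Assumed model (illustrative only): a source with reliability r asserts
    a true fact with probability r and a false one with probability 1 - r.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    for r in source_reliabilities:
        if not 0.0 < r < 1.0:
            raise ValueError("each reliability must be strictly between 0 and 1")

    # Likelihood of seeing every source assert the fact, if it is true / false.
    p_true = prior * prod(source_reliabilities)
    p_false = (1.0 - prior) * prod(1.0 - r for r in source_reliabilities)

    # Bayes' rule: normalize over the two hypotheses.
    return p_true / (p_true + p_false)


if __name__ == "__main__":
    # One moderately reliable source versus three that agree.
    print(fact_probability([0.7]))
    print(fact_probability([0.7, 0.7, 0.7]))
```

Under these assumptions, a single source with reliability 0.7 leaves the estimate at 0.7, while three such sources agreeing push it above 0.9. Real web sources often copy each other, so the independence assumption overstates confidence in practice.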