Founder here. The result differs in many ways when an automated system builds a Knowledge Graph rather than humans.

The obvious one is scale. Wikipedia has on the order of 10M entities and represents the work of thousands of humans, whereas the Diffbot KG has 10B entities and is discovering about 120M more each day; its growth is largely limited by the number of machines running the algorithms in the datacenter. The properties and facts indexed about each entity are also a superset, because they are not limited to those that would be worthwhile for a human to curate. Lastly, it can be more accurate than facts found in a single source, because the automated system uses multiple occurrences of a fact found across the web to estimate the probability that the fact is accurate.
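To make the multi-source point concrete, here is a minimal sketch of one way agreeing sources could be combined into a probability. It is an illustration only, not a description of the production algorithm: the function name `fact_probability`, the per-source reliability scores, the 0.5 prior, and the assumption that sources are independent are all hypothetical choices made for the example.

```python
import math

def fact_probability(source_reliabilities, prior=0.5):
    """Estimate P(fact is true) given several sources that all assert it.

    Assumptions (illustrative only): sources are independent, and a source
    with reliability r asserts a true fact with probability r and a false
    one with probability 1 - r. Each r must lie strictly between 0 and 1.
    """
    # Start from the prior, expressed as log-odds.
    log_odds = math.log(prior / (1 - prior))
    # Each agreeing source adds its own log-likelihood ratio.
    for r in source_reliabilities:
        log_odds += math.log(r / (1 - r))
    # Convert log-odds back to a probability.
    return 1 / (1 + math.exp(-log_odds))

single = fact_probability([0.7])               # one moderately reliable page
several = fact_probability([0.7, 0.7, 0.7])    # three pages that agree
print(f"{single:.3f} {several:.3f}")
```

Under these assumptions, three mediocre sources that agree end up more trustworthy than any one of them alone, which is the intuition behind preferring many web sources over a single curated one. Real pages copy from each other, so the independence assumption is the weakest part of this sketch.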

The result is a Knowledge Graph that is more useful for work and business, because it covers the entities you interact with day to day, not the "head" entities favored by popularity and the constraints of human curation.