by M2Ys4U 2828 days ago
IIRC, Common Crawl exposes the semantic data from the sites it crawls. One could build (or at least bootstrap) one's own knowledge graph from that and other available data sources (DBpedia, Wikidata, etc.).
1 comment

That's not sufficient: the "private" knowledge graphs of e.g. Google aren't "crawlable". They aren't public, and they don't rely solely on crawled sites. DBpedia, Wikidata, and all the other open data sources combined are not enough for a knowledge graph that can compete (in coverage, thoroughness, and recency of updates) with what the megacorps can afford to maintain behind closed doors.