Oh, interesting. I've used Diffbot and never thought it relied on AI. Could you elaborate? I thought it was a simple crawling and parsing task, but I might be naive about this.
All identification and extraction in our APIs is based on our ML models, which have been fed hundreds of thousands of data-point examples from annotated web pages. Basically: our back end has reviewed millions of web pages to learn what the various components of a page are -- and even what "type" of page a page is -- and uses that to make judgments about pages submitted via the API.
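To make the idea concrete, here is a toy sketch of learning a page "type" from annotated examples. It is not Diffbot's actual model or code: the `PageTypeClassifier` class, the `tokenize` helper, the four sample pages, and the `article`/`product` labels are all made up for illustration, and the technique (a small naive Bayes classifier over tag names and words) is just one simple way to do this.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(html):
    # Crude features: lowercase alphabetic runs (tag names and words alike).
    return re.findall(r"[a-z]+", html.lower())


class PageTypeClassifier:
    """Hypothetical multinomial naive Bayes over page tokens."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token frequencies
        self.label_counts = Counter()             # label -> number of pages seen
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (html, label) pairs, i.e. annotated pages.
        for html, label in examples:
            tokens = tokenize(html)
            self.token_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)

    def predict(self, html):
        tokens = tokenize(html)
        total_pages = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_pages in self.label_counts.items():
            counts = self.token_counts[label]
            # Laplace smoothing so unseen tokens do not zero out a label.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_pages / total_pages)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up annotated pages; a real training set would be vastly larger.
ANNOTATED = [
    ("<article><h1>Election results</h1><p>By Jane Doe, published today</p></article>", "article"),
    ("<article><h1>Storm hits coast</h1><p>By staff reporter, published Monday</p></article>", "article"),
    ("<div><h1>Running shoes</h1><span>price $59.99</span><button>add to cart</button></div>", "product"),
    ("<div><h1>Coffee maker</h1><span>price $24.00</span><button>add to cart</button></div>", "product"),
]

clf = PageTypeClassifier()
clf.train(ANNOTATED)
```

Calling `clf.predict(...)` on a page the classifier has never seen returns whichever label its tokens most resemble. That is the difference from hand-written parsing rules: nothing in the code says "a price and a cart button means product"; the association comes from the annotated examples.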