| HN Mirror



	by Oras 379 days ago
	How does it work when documents are uploaded later? Let's say I upload a batch of 50 documents, then a week later I upload another 20. How does it keep them correlated and preserve context across the two batches?


ezhil 379 days ago

Extraction is based on a schema of nodes and edges. So, let's say you have 2 docs that contain related data on a company: they will both be connected. We use entity resolution to combine them.
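To make the idea concrete, here is a minimal sketch of how entity resolution can connect documents uploaded at different times. It is not the product's actual implementation: the `Graph` class, the `normalize` matching rule, and the sample triples are all hypothetical, and real systems typically use fuzzier matching than a normalized name.

```python
from dataclasses import dataclass, field


def normalize(name: str) -> str:
    """Hypothetical matching key: lowercase, drop punctuation and common company suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    tokens = [t for t in cleaned.split() if t not in {"inc", "ltd", "llc", "corp"}]
    return " ".join(tokens)


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # (type, key) -> {"names", "docs"}
    edges: set = field(default_factory=set)    # (src_key, relation, dst_key)

    def upsert(self, node_type: str, name: str, doc_id: str) -> tuple:
        # Entities that normalize to the same key are merged into one node.
        key = (node_type, normalize(name))
        node = self.nodes.setdefault(key, {"names": set(), "docs": set()})
        node["names"].add(name)
        node["docs"].add(doc_id)
        return key

    def ingest(self, doc_id: str, triples: list) -> None:
        # Each triple is ((type, name), relation, (type, name)) extracted per the schema.
        for (src_type, src), relation, (dst_type, dst) in triples:
            s = self.upsert(src_type, src, doc_id)
            d = self.upsert(dst_type, dst, doc_id)
            self.edges.add((s, relation, d))


graph = Graph()
# First batch, then a later upload that mentions the same company differently.
graph.ingest("doc_1", [(("Company", "Acme Inc."), "EMPLOYS", ("Person", "Jane Doe"))])
graph.ingest("doc_2", [(("Company", "ACME"), "LOCATED_IN", ("City", "Berlin"))])

acme = graph.nodes[("Company", "acme")]
print(sorted(acme["docs"]))
```

Because the later upload is resolved against the nodes that already exist, the second document attaches to the same company node instead of creating a new one, which is what links the two batches.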


Oras 379 days ago

So you need a predefined schema? And then, when uploading, you assign which schema to use?


ezhil 379 days ago

Yes, it's manually assigned for now. Future versions will detect the schema from the docs.
