Hacker News
by Oras 379 days ago
How does it work when uploading documents later? Let's say I upload a batch of 50 documents, then a week later I upload another 20. How does it keep the correlation and context between the two batches?
1 comments

The extraction happens based on a schema of nodes and edges. So, let's say you have 2 docs with related data on the same company: both will be connected to that company in the graph, whenever they were uploaded. We use entity resolution to combine them.
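To make the idea concrete, here is a minimal sketch of how entity resolution can link a later upload to an earlier one. It is not the actual implementation: the `Graph` class, the `normalize` key function, the suffix list, and the document IDs and triples are all hypothetical, and real entity resolution usually relies on fuzzier matching than a normalized name.

```python
from dataclasses import dataclass, field


def normalize(name: str) -> str:
    """Hypothetical resolution key: lowercase, strip punctuation and common company suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    words = [w for w in cleaned.split() if w not in {"inc", "ltd", "llc", "corp"}]
    return " ".join(words)


@dataclass
class Graph:
    # (node_type, resolution_key) -> {"names": surface forms seen, "docs": source doc ids}
    nodes: dict = field(default_factory=dict)
    # (source_node_id, relation, target_node_id)
    edges: set = field(default_factory=set)

    def upsert(self, node_type: str, name: str, doc_id: str) -> tuple:
        """Reuse the existing node if the resolution key matches, otherwise create one."""
        node_id = (node_type, normalize(name))
        node = self.nodes.setdefault(node_id, {"names": set(), "docs": set()})
        node["names"].add(name)
        node["docs"].add(doc_id)
        return node_id

    def ingest(self, doc_id: str, triples: list) -> None:
        """Add the (source, relation, target) triples extracted from one document."""
        for (src_type, src), relation, (dst_type, dst) in triples:
            source_id = self.upsert(src_type, src, doc_id)
            target_id = self.upsert(dst_type, dst, doc_id)
            self.edges.add((source_id, relation, target_id))


graph = Graph()

# First batch (assumed extraction output for one document).
graph.ingest("doc-01", [(("Company", "Acme Inc."), "HAS_CEO", ("Person", "Jane Doe"))])

# A week later: a different document mentions the same company under another spelling.
graph.ingest("doc-51", [(("Company", "ACME"), "ACQUIRED", ("Company", "Widget Co"))])

acme = graph.nodes[("Company", "acme")]
print(sorted(acme["docs"]))  # ['doc-01', 'doc-51']
```

Because both spellings resolve to the same key, the second upload attaches its edges to the node created by the first, so upload order and timing don't matter in this sketch.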
So you need a predefined schema? And then when uploading, you assign which schema to use?
Yes, it's manually assigned for now. Future versions will detect the schema based on the docs.