by stonerri
1190 days ago
The second layer is hard. I tried something in this space in mid-2018. Full text extraction and sentence segmentation tech was adequate, but extracting the discourse tree and building the graph was a bit of a struggle (trying to repurpose a collection of academic/open tools to get something useful). Never published or released the code.

If interested, a few rabbit holes to explore (no affiliations):

https://scite.ai -> best option for citation mapping, but same issues you described above
https://www.semanticscholar.org and AI2 -> the best group working on tooling in this space
https://www.weave.bio -> early startup trying to build this out

The hardest challenge in my view is solving the intermediate representation issue. You have to establish a DSL/nomenclature that provides the range required to represent a complete scholastic discourse while also being computable.
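
To make the intermediate representation point concrete, here is a rough Python sketch of about the smallest thing that could count as one: typed claims plus typed relations between them, stored as a graph you can query. Every name in it (Claim, Relation, DiscourseGraph, the three relation types) is hypothetical and chosen for illustration; none of it comes from the tools linked above, and a real nomenclature would need far more range than this.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relation(Enum):
    # Hypothetical relation vocabulary; a real DSL would need many more.
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    EXTENDS = "extends"


@dataclass(frozen=True)
class Claim:
    claim_id: str
    paper: str  # hypothetical paper identifier
    text: str   # the sentence as extracted from the full text


@dataclass
class DiscourseGraph:
    claims: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, Relation, target_id)

    def add_claim(self, claim):
        self.claims[claim.claim_id] = claim

    def relate(self, source_id, relation, target_id):
        # Refuse edges that point at claims the graph does not know about.
        if source_id not in self.claims or target_id not in self.claims:
            raise KeyError("both claims must be added before relating them")
        self.edges.append((source_id, relation, target_id))

    def incoming(self, target_id, relation):
        # All claims that stand in `relation` to the target claim.
        return [
            self.claims[source]
            for source, rel, target in self.edges
            if target == target_id and rel is relation
        ]


graph = DiscourseGraph()
graph.add_claim(Claim("c1", "paper-A", "Treatment X reduces symptom Y."))
graph.add_claim(Claim("c2", "paper-B", "We replicate the effect of X on Y."))
graph.add_claim(Claim("c3", "paper-C", "We find no effect of X on Y."))
graph.relate("c2", Relation.SUPPORTS, "c1")
graph.relate("c3", Relation.CONTRADICTS, "c1")

supporters = graph.incoming("c1", Relation.SUPPORTS)
```

The "computable" half is the easy part here (the query is one list comprehension). The hard part is the other half: deciding which relation types exist and mapping free-text sentences onto them.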
Right, you'd basically be writing an interpreter for English