by dmezzetti 891 days ago
If you're interested in graphs + RAG and want an alternate approach, txtai has a semantic graph component.

https://neuml.hashnode.dev/introducing-the-semantic-graph

https://github.com/neuml/txtai

Disclaimer: I'm the primary author of txtai

2 comments

Note for those who aren't aware: a "semantic graph" is a knowledge graph built by using a "sentence (pooled) transformer" language model to draw edges between the vertices (text data at whatever granularity the user decides) according to semantic similarity.
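
To make that definition concrete, here is a minimal sketch of the edge-drawing step. It is not txtai's implementation: the vectors are hand-written toy values standing in for real sentence-transformer embeddings, and `build_semantic_graph` and `min_score` are hypothetical names chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_semantic_graph(vectors, min_score=0.8):
    """Link every pair of nodes whose cosine similarity is at least min_score.

    Returns an adjacency dict of the form {node: {neighbor: score}}.
    """
    graph = {node: {} for node in vectors}
    nodes = list(vectors)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score >= min_score:
                graph[a][b] = score
                graph[b][a] = score
    return graph


# Assumption: toy 3-d vectors in place of embeddings from a real model.
toy_vectors = {
    "cats purr": [0.9, 0.1, 0.0],
    "kittens meow": [0.8, 0.2, 0.1],
    "stocks fell": [0.0, 0.1, 0.9],
}

graph = build_semantic_graph(toy_vectors, min_score=0.8)
```

With these toy values the two cat-related texts end up connected and the finance text stays isolated. A real system would likely replace the all-pairs loop with an approximate nearest-neighbor search.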

What's awesome about them is that, in my mind, they essentially form the "extractive" analogue to the "generative" nature of LLMs.

Semantic graphs give every single graph theory algorithm a unique epistemological twist for any particular dataset. In my case, I've built and released pre-trained semantic graphs for my debate evidence. There, I observe that path traversals form "debate cases", and that graph centrality finds the most "generic/universally applicable" evidence. Given a different dataset, the same algorithms will have different interpretations.
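
As a rough illustration of those two algorithms, the sketch below runs degree centrality and a breadth-first shortest path over a tiny hand-made graph. The node labels and edges are hypothetical, and degree centrality is only one of several centrality measures; the comment above does not say which one was used.

```python
from collections import deque

# Assumption: a toy undirected graph; in a semantic graph each node would be
# a piece of text and each edge a high-similarity link.
toy_graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    others = len(graph) - 1
    return {node: len(neighbors) / others for node, neighbors in graph.items()}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


scores = degree_centrality(toy_graph)
most_central = max(scores, key=scores.get)   # the most broadly connected node
path = shortest_path(toy_graph, "B", "E")    # a chain of related nodes
```

On a debate-evidence graph, `most_central` would correspond to the most widely applicable piece of evidence and `path` to a chain of related evidence linking two topics.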

What makes txtai so awesome is that it creates a synchronized interface between an underlying vector DB, a SQL DB, and a semantic knowledge graph. The flexibility and power this offers compared to other vector DB solutions is simply unparalleled. I have seen zero meaningful competition from a vector DB industry that is flooded with money despite little product differentiation among its players.

Disclaimer: I wrote an NLP paper with dmezzetti as my co-author about semantic graphs: https://aclanthology.org/2023.newsum-1.10.pdf

Thank you for taking the time to share these excellent additional details!

This is really cool, I'm surprised I never heard of this project before. The examples look really clean.

Most RAG tools seem to start with the LLM and add vector building and retrieval around it, while this tool seems like it started with vector / graph building and retrieval, then added LLM support later.

Thanks, that's an accurate assessment. The main reason for this approach is that txtai has been around since 2020, before the LLM era.