| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by zozbot234 94 days ago
	A standard DB ala Postgres will be a perfectly functional graph database unless you're doing very specialized network analysis queries, which is not what most of these "knowledge graph" databases are being used for. It's only querying and data modeling that's a bit fiddly (expressing the "graph" structure using SQL) and that's being improved by the new Property Graph Query (PGQ) in the latest SQL standards.

3 comments

ffsm8 94 days ago

It'd be great if PG came with a serverless/embeddable mode, that'd be the main missing thing in comparison to this tool.

I know pglite, and while it's great someone made that, it's definitely not the same

link

adsharma 94 days ago

I maintain a fork of pgserver (pglite with native code). It's called pgembed. Comes with many vector and BM25 extensions.

Just in case folks here were wondering if I'm some type of a graphdb bigot.

link

adsharma 94 days ago

This is the same topic I had an intense argument with my coworkers at the company formerly called FB a decade ago. There is a belief that most joins are 1-2 deep. And that many hop queries with reasoning are rare and non-existent.

I wonder how you reconcile the demand for LLMs with multihop reasoning with the statement above.

I think a lot what is stated here is how things work today and where established companies operate.

The contradictions in their positions are plain and simple.

link

zozbot234 94 days ago

There are worst-case optimal algorithms for multi-way and multi-hop joins. This does not require giving up the relational model.

link

adsharma 94 days ago

I maintain LadybugDB which implements WCOJ (inherited from the KuzuDB days). So I don't disagree with the idea. Just that it's a graph database with relational internals and some internal warts that makes it hard to compose queries. Working on fixing them.

https://github.com/LadybugDB/ladybug/discussions/204#discuss...

link

adsharma 94 days ago

Also an important test is the check on whether it's WCOJ on top of relational storage or is the compressed sparse row (CSR) actually persisted to disk. The PGQ implementations don't.

There are second order optimizations that LLMs logically implement that CSR implementing DBs don't. With sufficient funding, we'll be able to pursue those as well.

link

zozbot234 93 days ago

CSR is an array-based trie hence very costly to update. It can serve as an index for parts of the graph that basically will almost never change, but not otherwise.

link

adsharma 93 days ago

Makes it a good match for columnar databases which already operate on the read-only, read-mostly part of the spectrum.

Perhaps people can invent LSM like structures on top of them.

But at least establish that CSR on disk is a basic requirement before you claim that you're a legit graph database.

link

gdotv 94 days ago

That's coming to Postgres 19 this year, had a brief exchange with a committer earlier this week and it's actually available in the Postgres repo to try (need to run your own build of course). Very exciting development!

link