| What an awesome article. Kudos to the author! On the core observation "there are too many implementation choices", that is not quite right. True, the author mentions 4, and there are further variations. In practice, a library can: 1. Implement all suitable graph representations. 2. Implement algorithms tailored to the representation(s) that offer the highest performance. 3. Provide transformations from one representation to another. This is O(#representations), trivial to implement and trivial to use. Quite fair workload for both maintainers and users. 4. Bonus, provide import / export transformations from / to common standard library datatypes and idioms. Memory and transformations are cheap, 99% of use-cases would likely find the overhead of transforming data, both in RAM and CPU, negligible. Edit: "the harsh truth of working at Google is that in the end you are moving protobufs from one place to another." -- https://news.ycombinator.com/item?id=20132880 |
We always end up reimplementing graphs because:
- Performance matters, and no off the shelf graph library I’ve seen can take advantage of many of the regularities in our particular data set. (We have an append-only DAG which we can internally run-length encode because almost all nodes just have an edge pointing to the last added item).
- I haven’t seen any generic graph library which supports the specific queries I need to make on my graphs. The big one is a subgraph diffing function.
- Writing something custom just isn’t much work anyway! Graphs are way simpler to reimplement than btrees. You can have a simple graph implementation in tens of lines. Our highly optimised library - with all the supporting algorithms - is still only a few hundred lines of code.
I think it would be handy to have ways to export the data into some standard format. But eh. I think pulling a library in for our use case would add more problems than it would solve.