> it's a bit difficult to wrap my mind around how Arrow will help in distributed systems

The easiest way to see it is probably to compare Arrow's role with Protobuf's. There's a good FAQ entry [0] which concludes: "Arrow and Protobuf complement each other well. For example, Arrow Flight uses gRPC and Protobuf to serialize its commands, while data is serialized using the binary Arrow IPC protocol".
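To make that split concrete, here's a toy sketch in plain Python. To be clear, this is not Arrow's IPC format or Flight's API: the framing, the JSON header (standing in for a Protobuf command) and the names `encode_message` / `decode_message` are all made up for illustration. The point is only that the small command gets parsed as structured data, while the bulk column travels as one contiguous buffer the receiver can view in place rather than decoding value by value.

```python
import json
import struct
from array import array


def encode_message(command: dict, column: array) -> bytes:
    """Hypothetical frame: 4-byte header length, JSON header, raw column bytes."""
    header = dict(command, typecode=column.typecode, length=len(column))
    header_bytes = json.dumps(header).encode("utf-8")
    # The column is appended as-is, in native byte order (a toy assumption;
    # a real format such as Arrow IPC pins down layout and endianness).
    return struct.pack("<I", len(header_bytes)) + header_bytes + column.tobytes()


def decode_message(frame: bytes):
    """Parse the small header, then view the column without copying it."""
    (header_len,) = struct.unpack_from("<I", frame, 0)
    header = json.loads(frame[4:4 + header_len].decode("utf-8"))
    # memoryview slicing and cast() reinterpret the existing bytes in place;
    # no per-value deserialization step happens here.
    values = memoryview(frame)[4 + header_len:].cast(header["typecode"])
    return header, values


if __name__ == "__main__":
    prices = array("d", [1.5, 2.5, 4.0])
    frame = encode_message({"action": "put", "name": "prices"}, prices)
    header, values = decode_message(frame)
    print(header["action"], header["name"], list(values))
```

With a row-oriented message format the receiver would typically walk and decode every field. Here the cost of "decoding" the data is a header parse plus a pointer into the buffer, which is roughly the property Arrow is designed to give you across processes and machines.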

This will become increasingly significant because of how hardware trends in networking and memory (and ultimately storage too) compare with those in CPUs. I posted about that in a comment a few days ago [1], but it's worth sharing again:

> here’s a chart comparing the throughputs of typical memory, I/O and networking technologies used in servers in 2020 against those technologies in 2023

> Everything got faster, but the relative ratios also completely flipped

> memory located remotely across a network link can now be accessed with no penalty in throughput

The graphs demonstrate it very clearly: https://blog.enfabrica.net/the-next-step-in-high-performance...

> would be great to read a bit more about some optimizations that would help improving Postgres leaving out pure analytical use cases

Unfortunately I don't have a good reference on that to hand, but I'll take a look around and reply again soon.

[0] https://arrow.apache.org/faq/#how-does-arrow-relate-to-proto...

[1] https://news.ycombinator.com/item?id=37365816

[2] https://www.singlestore.com/comparisons/postgresql/