
This comment is mixing up a few concerns.

1. When transferring large amounts of data, the checksum for the full transfer can't be verified until all the data is received. If you want to (for example) download an Ubuntu ISO and verify its checksum before installing it, you'll have to buffer the data somewhere until the download finishes.

2. When transferring small amounts of data, such as individual chunks, data integrity is (or should be) verified automatically by the encryption layer[0] of the underlying transport. There's no point in putting a shasum into each chunk: if a bit gets flipped in transit, the transport rejects the corrupted data and that chunk never even reaches your message handler.

3. In gRPC, chunking large data transfers is mandatory because the library will reject Protobuf messages larger than a few megabytes[1]. As the chunks of data arrive, they can be passed into a hash function at the same time as you're buffering them to disk.
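To make points 1 and 3 concrete, here is a minimal sketch of hashing chunks as they arrive while buffering them to disk, then checking the full-transfer checksum at the end. The function name `receive_and_verify` and its parameters are made up for illustration, and `chunks` stands in for whatever the streaming RPC yields; nothing here is gRPC's own API.

```python
import hashlib
import os
import tempfile
from typing import Iterable


def receive_and_verify(chunks: Iterable[bytes], expected_sha256: str, dest_path: str) -> bool:
    """Buffer chunks to a temp file while hashing them.

    The file is moved to dest_path only if the SHA-256 of the whole
    transfer matches expected_sha256 (a lowercase hex digest).
    """
    hasher = hashlib.sha256()
    # Hypothetical layout: stage the temp file next to the destination
    # so the final rename stays on one filesystem.
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                hasher.update(chunk)  # incremental hash, no second pass over the data
                tmp.write(chunk)      # buffer to disk until the transfer completes
        if hasher.hexdigest() != expected_sha256:
            return False
        os.replace(tmp_path, dest_path)
        return True
    finally:
        # Runs on mismatch or error; after a successful rename the temp path is gone.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

The checksum still can't be trusted until the last chunk is in, but hashing incrementally means you don't have to re-read the buffered file afterwards.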

[0] gRPC supports running without encryption for local development, but obviously for real workloads you'd do end-to-end TLS.

[1] IIRC the default maximum message size in the C++ and Go implementations of gRPC is 4 MiB, which can be overridden when the client/server is being initialized. For bulk data transfer there's also the Protobuf hard limit of 2 GB[2] on variable-length data.

[2] https://developers.google.com/protocol-buffers/docs/encoding