Hacker News
by jchw 325 days ago
Oh, this is pretty interesting. One thing that's also worth noting is that the Azure Blob Storage version of GitHub Actions Cache is effectively a V2, although internally it is just a brand new service that is versioned as V1. The old service was a REST-ish service that abstracted the storage backend, and it is still used by GitHub Enterprise. The new service is a Twirp-based system: the Twirp side hands out signed URLs, and you store things directly into Azure using them.

I reverse engineered this to implement support for the new cache API in Determinate Systems' Magic Nix Cache, which abruptly stopped working earlier this year when GitHub disabled the old API on GitHub.com. One thing that's annoying is that GitHub seems to continue to tacitly allow people to use the cache internals, but stops short of providing useful things like the protobuf files used to generate the Twirp clients. I wound up reverse engineering them from the actions/cache action's gencode, tweaking the reconstructed protobuf files until I was able to get a byte-for-byte match with the generated code.
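For readers who have not met Twirp before, here is a rough sketch of what a Twirp JSON call looks like on the wire: a plain HTTP POST to `/twirp/<package>.<Service>/<Method>` with the message as the body. The host, the service name `example.cache.v1.CacheService`, the method name, the message fields, and the bearer token are all hypothetical placeholders, not the real GitHub definitions (which, as noted above, are not published). The request is only built, never sent.

```python
import json
import urllib.request


def twirp_request(base_url, service, method, message, token=None):
    """Build (but do not send) a Twirp request using the JSON encoding.

    Twirp routes every RPC to POST <prefix>/<package>.<Service>/<Method>,
    where the default prefix is "/twirp".
    """
    url = f"{base_url.rstrip('/')}/twirp/{service}/{method}"
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the service expects a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical service, method, and fields, for illustration only.
req = twirp_request(
    "https://results.example.invalid",
    "example.cache.v1.CacheService",
    "CreateCacheEntry",
    {"key": "nix-store-abc", "version": "1"},
    token="hypothetical-token",
)
print(req.get_method(), req.full_url)
```

In a flow like the one described, the response to such a call would carry the signed upload URL, and the actual bytes would then go straight to blob storage rather than through the Twirp service.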

On the flip side, I did something that might break Blacksmith: I used append blobs instead of block blobs. Why? ... Because it was simpler. For block blobs you have to construct this silly XML payload with the block list or whatever. With append blobs you can just keep appending chunks of data and then seal the blob when you're done. I have always wondered whether being responsible for some of GitHub Actions Cache using append blobs would ever come back to bite me, but as far as I can tell it makes very little difference from the Azure PoV; pricing seems the same, at least. Either way, they probably need to support append blobs now. Sorry :)
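To make the "simpler" part concrete, here is a minimal sketch of the two upload sequences against the Azure Blob Storage REST API, written as request plans rather than real network calls. It is not the Magic Nix Cache code (that is Rust); the helper names, the example URL, and the chunk size are made up for illustration, and it assumes the signed URL already carries a SAS query string with enough permissions.

```python
import base64
from urllib.parse import quote, urlsplit, urlunsplit

API_VERSION = "2019-12-12"  # sealing an append blob needs at least this service version


def with_params(signed_url, extra):
    """Add query parameters to a URL that already has a SAS query string."""
    parts = urlsplit(signed_url)
    query = "&".join(p for p in (parts.query, extra) if p)
    return urlunsplit(parts._replace(query=query))


def chunked(data, size):
    return [data[i:i + size] for i in range(0, len(data), size)]


def append_blob_plan(signed_url, data, chunk_size):
    """Create an empty append blob, append each chunk in order, then seal it."""
    base = {"x-ms-version": API_VERSION}
    # Creating an append blob is a Put Blob with an empty body.
    plan = [("PUT", signed_url, {**base, "x-ms-blob-type": "AppendBlob"}, b"")]
    for chunk in chunked(data, chunk_size):
        plan.append(("PUT", with_params(signed_url, "comp=appendblock"), base, chunk))
    plan.append(("PUT", with_params(signed_url, "comp=seal"), base, b""))
    return plan


def block_blob_plan(signed_url, data, chunk_size):
    """Stage each chunk as a block, then commit them with an XML block list."""
    base = {"x-ms-version": API_VERSION}
    plan, block_ids = [], []
    for i, chunk in enumerate(chunked(data, chunk_size)):
        # Block IDs must be base64 strings of equal length within one blob.
        block_id = base64.b64encode(f"{i:08d}".encode()).decode()
        block_ids.append(block_id)
        params = "comp=block&blockid=" + quote(block_id, safe="")
        plan.append(("PUT", with_params(signed_url, params), base, chunk))
    xml = (
        '<?xml version="1.0" encoding="utf-8"?><BlockList>'
        + "".join(f"<Latest>{b}</Latest>" for b in block_ids)
        + "</BlockList>"
    )
    plan.append(("PUT", with_params(signed_url, "comp=blocklist"), base, xml.encode()))
    return plan


# Hypothetical signed URL and payload.
URL = "https://acct.blob.core.windows.net/container/blob?sv=x&sig=y"
append_plan = append_blob_plan(URL, b"abcdefghij", 4)
block_plan = block_blob_plan(URL, b"abcdefghij", 4)
```

The trade-off visible here is that append blocks have to arrive in order, while staged blocks can be uploaded in any order (or in parallel) because the final block list fixes the sequence.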

(If you are wondering why not use the Rust Azure SDK, as far as I can tell the official Rust Azure SDK does not support using signed URLs for uploading. And frankly, it would've brought a lot of dependencies and been rather complex to integrate for other Rust reasons.)

(It would also be possible, by setting env variables a certain way, to get virtually all workflows to behave as if they're running under GitHub Enterprise and get the old REST API. However, the Azure SDK with its concurrency features probably yields better performance.)