| HN Mirror

	by solatic 810 days ago
	As someone who spent way too much time chasing this rabbit, the real answer is Just Don't. GitHub Actions is a CI system that makes it easy to get started with simple CI needs but runs into hard problems as soon as you have more advanced needs. Docker caching is one of those advanced needs. If you have non-trivial Docker builds then you simply need on-disk local caching, period. Either use Depot or switch to self-hosted runners with large disks.

12 comments

Arbortheus 810 days ago

GitHub really need to invest in their CI. It is a second-class feature in the platform, but should be the beating heart of every SaaS team.

GitLab CI is leaps and bounds ahead.

zer00eyz 810 days ago

LinkedIn is playing with Twitch-like video.

Zoom is adding in email.

Years ago I worked for a bank. You know what happens if you set up bill pay with a bank? You're unlikely to end that relationship. Because who the fuck wants to do all that work to move.

Your labor, your suffering (cause setting up bill pay sucks) is an egress fee.

If you have GitHub acting as anything other than your public facing code repo you're locking yourself into the platform. Bug tracking, code review, CI pipelines, GitHub features that are going to keep you from moving quickly if you need to change providers.

CSSer 810 days ago

The funny thing about this is that as far as most software engineers are concerned these things are generic competencies. As long as the price isn’t egregious and the feature-set is rich, we really don’t and shouldn’t care if we’re locked in for this. Some tools do belong together, and most people’s job in this sector shouldn’t be to spend half their time fiddling with devOps/project management tools, it should be to make/fix software. If you don’t believe me, consider that even in the scenario that you describe, any VCS platform is ultimately going to require a robust API to support integrations with other tools anyway, which will be orders of magnitude more difficult to accomplish than decent, built-in reasonable ops/pm features. This is coming from the person who typically agrees with you about lock-in. I’m afraid in this case your approach gets you JIRA and https://ifuckinghatejira.com/

unshavedyak 810 days ago

Tangent, boy I love that site's design. Simple, elegant, and the animations feel like they layer on top of the primary UX (i.e. they add to the text, rather than the text being delayed for the purpose of showing some fancy animation).

flatline 810 days ago

I’ve migrated between devops platforms multiple times on multiple projects. The barrier is not really that high, and the cost of losing some data is relatively low. You can script most of it or pay a small fee for a user friendly plugin. There are lots of roughly equivalent options, some of them free. It’s nothing like, say, migrating between cloud providers.

LtWorf 810 days ago

Well before github had a CI everybody used travis for free from it. Then they killed the free tier and people just started to switch.

It's trivial to switch from github to codeberg for example… So I don't think it's that bad to be honest.

_cenw 809 days ago

Always love to shock more people with the random fact that GitHub Actions is Azure DevOps Pipelines in a trenchcoat (and Azure Pipelines is seemingly abandoned / in maintenance mode now).

The runner code is on GitHub, and it's not pretty. In fact last time I ran it it had trouble generating stable exit codes.

codethief 809 days ago

Wait, really? We have been using Gitlab CI for a few years and it's awful. I run into bugs and surprises (missing features etc.) on the daily.

Now I don't even want to imagine what Github Actions are like…

deepsun 809 days ago

GitLab were the first to introduce built-in CI. GitHub followed their lead once it became a decision point for many.

gigatexal 810 days ago

Yeah that’s my thought as well: this is something GitHub is supposed to do. Keep it simple on the users and leave the hard stuff to the creators/runners of the tool

matsemann 810 days ago

> as soon as you have more advanced needs

If there's one thing I've learned over the years, it's that we really seldom have advanced needs. Mostly we just want things to work a certain way, and will fight systems to make them behave so. It's easier to just leave it be. Like maven vs gradle; yes, gradle can do everything, but if you need that it's worth taking a step back and assessing why the normal maven flow won't work. What's so special about our app compared to the millions working just fine out of the box?

stackskipton 809 days ago

I'm sad that as a DevOps Engineer I only have one upvote to give. YAGNI needs to be every team's motto.

We tried caching at several companies. Outside node builds, it was never worth it. Hooray, our .NET builds took 15 seconds instead of 4 minutes. Eventually we realized no one cared, since we averaged a deployment every 4 days outside of outages and the build time supposedly being burned just wasn't there.
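
The arithmetic behind that comment can be sketched roughly as follows. The 4-minute and 15-second build times and the 4-day deployment interval come from the comment; the number of builds per deployment and the 30-day month are assumptions added purely for illustration.

```python
UNCACHED_S = 4 * 60        # uncached build time quoted in the comment
CACHED_S = 15              # cached build time quoted in the comment
DAYS_BETWEEN_DEPLOYS = 4   # average deployment interval quoted in the comment


def monthly_saving_minutes(builds_per_deploy: int, days: int = 30) -> float:
    """Minutes of build time saved per month by caching.

    builds_per_deploy is a hypothetical knob: the comment only gives the
    deployment frequency, not how many CI builds happen per deployment.
    """
    deploys = days / DAYS_BETWEEN_DEPLOYS
    saved_per_build_s = UNCACHED_S - CACHED_S
    return deploys * builds_per_deploy * saved_per_build_s / 60


print(monthly_saving_minutes(builds_per_deploy=1))
print(monthly_saving_minutes(builds_per_deploy=10))
```

With one build per deployment this comes to under half an hour a month, which fits the "no one cared" conclusion; the picture only changes if many builds run per deployment.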

maccard 806 days ago

Pulling my entire repository clean from source control takes longer than your entire uncached pipeline.

A clean build on a 16 core machine with an SSD and a GB network is about 4 hours including checkout.

Our cached builds are 15 minutes including deployment.

kbolino 810 days ago

It has been a few years, but last I recall, the key advantage of Gradle over Maven was not power so much as brevity. Doing many things in Maven required a dozen nested XML tags, while doing the same thing in Gradle was often a one-liner.

kylegalbraith 810 days ago

Thanks for the shout-out regarding Depot, really appreciate it. We came to the same conclusion regarding Docker layer cache, which is why we created Depot in the first place. The limitations and performance of the GitHub Actions cache leave a lot to be desired.

fireflash38 810 days ago

A quick glance suggested not, but is there no purely local option for Depot? Is it all cloud-based?

crote 809 days ago

> GitHub Actions is a CI system that makes it easy to get started

It's not even that! Coming from GitLab I was quite surprised at how poor the "getting started" experience was. Rather than a simple "on push, run command X" you first have to do a deep dive into actions/events/workflows/jobs/runs, and then figure out what kind of weird tooling is used for trivial things like checking out your code, or storing artifacts.

And then you try to unify your pipeline across several projects because that's what Github is heavily promoting with the whole "uses: actions/checkout" reuse thing - but it turns out to be a huge hassle to get it working because nothing works the way you'd expect it to work.

In the end I did get GHA to do what I was already doing in GitLab, but it took me ten times as long as it did originally setting it up. I believe GHA is flexible and powerful enough to be well-suited for medium-sized companies, but it's neither easy enough for small companies, nor powerful enough for large companies. It's one of the few GitHub features I genuinely dislike using.

cqqxo4zV46cp 810 days ago

I got it working, with intermediate layers, too. All to find that I didn’t see a material performance benefit after taking into account how long it takes to pull from and push to the cache.

cpuguy83 810 days ago

You might want to try the actions cache "--cache-to type=gha --cache-from type=gha". It still needs to pull that stuff down, but locality is likely better here.

There's also an action out there "GitHub cache dance" that will stick your whole buildkit state dir into the gha cache.

yeswecatan 809 days ago

how large is your cache and how long does the pull/push take?

mhitza 810 days ago

On one project that was a bit more involved, I pulled the latest image I've built from the registry before starting the build. That worked well enough for caching in my case.

adityamaru 810 days ago

totally agree, github actions has done an excellent job at this lowest layer of the build pipeline today but is woefully inadequate the minute your org hits north of 50 engineers

candiddevmike 810 days ago

Additionally, docker build refuses to cause any side effects to the host system. This makes any kind of caching difficult by design. IMO, if possible, consider doing your build outside of docker and just copying it into a scratch container...

cpuguy83 810 days ago

I'm not sure what you mean here. "RUN --mount=type=cache,target=/foo" is exactly for keeping a persistent cache on the host.

aayushshah15 810 days ago

Did you consider using a local (in the same VPC) docker registry mirror perhaps? https://docs.docker.com/docker-hub/mirror/

solatic 810 days ago

It's not the pulls that are the problem, it's caching intermediate layers from the build that is the problem. As soon as you introduce a networked registry, the time it takes to pull layers from the registry cache and push them back is frequently not much better than simply rebuilding the layers, not to mention the additional compute/storage cost of running the registry cache itself.

It's just a problem that requires big, local disks to solve.
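
That trade-off can be put into a rough back-of-the-envelope model. This is only a sketch: the function names, the layer size, the link speeds, and the rebuild time are made-up assumptions for illustration, not measurements from this thread.

```python
def transfer_seconds(size_gb: float, bandwidth_gbit_per_s: float) -> float:
    """Seconds to move size_gb gigabytes over a link of the given speed."""
    return size_gb * 8 / bandwidth_gbit_per_s


def cache_saving_seconds(rebuild_s: float, size_gb: float,
                         bandwidth_gbit_per_s: float) -> float:
    """Seconds saved by restoring cached layers instead of rebuilding them.

    Assumes the job pulls the cached layers once and pushes them back once.
    A negative result means rebuilding from scratch would have been faster.
    """
    pull_s = transfer_seconds(size_gb, bandwidth_gbit_per_s)
    push_s = transfer_seconds(size_gb, bandwidth_gbit_per_s)
    return rebuild_s - (pull_s + push_s)


# Assumed numbers: 5 GB of layers, a 90 s rebuild, 1 Gbit/s to the registry.
networked = cache_saving_seconds(rebuild_s=90, size_gb=5, bandwidth_gbit_per_s=1)
# The same layers on a local disk, crudely modelled as a 20 Gbit/s "link".
local = cache_saving_seconds(rebuild_s=90, size_gb=5, bandwidth_gbit_per_s=20)

print(f"networked cache saves {networked:.0f} s, local disk saves {local:.0f} s")
```

Under these assumed numbers the networked cache barely beats a rebuild, while the local disk recovers almost the whole rebuild time.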

DanielHB 809 days ago

Yeah I had the exact same problem and came to the same conclusion.

bushbaba 810 days ago

Can’t you use s3 + mountpoint for most distributed CI cache needs?

pas 810 days ago

if you want fast builds it's worth spinning up a buildkit server on a beefy dedicated server.

docker/nerdctl only transfers the context, everything else is cached on the builder. it's very useful for monorepos (where you usually want to build and tag images for every tested commit)

and the builder directly pushes the images/tags/layers to the registry. (which can be just a new tag for already existing layer.)

a noop job is about 2 sec on GitLab CI this way.

yeswecatan 809 days ago

i haven't looked into setting up a buildkit server. would it be easier to just attach an ebs volume?

pas 809 days ago

you mean run an EC2 instance with EBS as buildkit server storage dir?

sure, it should work nicely. (I just prefer the local disk, it's just a cache after all.)

lispisok 808 days ago

I use self-hosted runners. It wasn't even because we could have large disks for caching. GitHub's pricing for their runners is so bad that it was a no-brainer to host our own.

ithkuil 810 days ago

Huge shout-out to depot! It works really well!
