by bob1029 325 days ago
I am struggling with justification for CI/CD pipelines that are so complex this kind of additional tooling becomes necessary.

There are ways to refactor your technology so that you don't have to suffer so much at integration and deployment time. For example, using containers and hosted SQL where neither is required can instantly make deploying your software 10x or more complex.

The last few B2B/SaaS projects I worked on had CI/CD built into the actual product. Writing a simple console app that polls SCM for commits, runs dotnet build and then performs a filesystem operation is approximately all we've ever needed. The only additional enhancement was zipping the artifacts to an S3 bucket so that we could email the link out to the customer's IT team for install in their secure on-prem instances.

I would propose a canary: if your proposed CI/CD process is so complicated that you couldn't write a script by hand to replicate it in an afternoon or two, you should seriously question bringing the rest of the team into that coal mine.
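For a sense of scale, here is a minimal sketch of the kind of poller described above: pull from SCM, run `dotnet build` if HEAD moved, then copy the output somewhere. It is not the original console app. It assumes git as the SCM, a project at the repository root, and build output under `bin/Release`; the names `poll_once`, `watch`, and `artifacts` are made up for illustration.

```python
import shutil
import subprocess
import time
from pathlib import Path
from typing import Optional


def head_commit(repo: Path, run=subprocess.run) -> str:
    """Return the hash of the commit currently checked out in `repo`."""
    result = run(
        ["git", "-C", str(repo), "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def poll_once(repo: Path, out_dir: Path, last_built: Optional[str],
              artifacts: Path = Path("bin/Release"),
              run=subprocess.run) -> str:
    """Pull; if HEAD moved, build and copy the output. Returns the HEAD commit."""
    run(["git", "-C", str(repo), "pull", "--ff-only"], check=True)
    commit = head_commit(repo, run)
    if commit == last_built:
        return commit  # nothing new since the last build
    # Assumption: the project or solution file sits at the repo root.
    run(["dotnet", "build", "--configuration", "Release"], check=True, cwd=repo)
    # The "filesystem operation": copy build output into a per-commit folder.
    shutil.copytree(repo / artifacts, out_dir / commit, dirs_exist_ok=True)
    return commit


def watch(repo: Path, out_dir: Path, interval_seconds: float = 60.0) -> None:
    """Poll forever. A failed pull or build raises and stops the loop."""
    last_built: Optional[str] = None
    while True:
        last_built = poll_once(repo, out_dir, last_built)
        time.sleep(interval_seconds)
```

Calling `watch(Path("/srv/checkout"), Path("/srv/builds"))` (both paths hypothetical) would run it. The `run` parameter exists only so the subprocess calls can be swapped out in a test.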

3 comments

Here is my cynical take on CI. Firstly, testing is almost never valued by management, which would rather close a deal on half-finished promises than actually build a polished, reliable product (they can always scapegoat the eng team if things go wrong with the customer anyway).

So, to begin with, testing is rarely prioritized. But most developer orgs eventually realize that centralized testing is necessary or else everyone is stuck in permanent "works on my machine!" mode. When deciding to switch to automated CI, eng management is left with the build vs. buy decision. Buy is very attractive for something that is not seriously valued anyway and that is often given away for free. There is also industry consensus pressure, which has converged on GitHub (even though GitHub is objectively bad on almost every metric besides popularity; to be fair, the other larger players are also generally bad in similar ways).

This is when the lock-in begins. What starts as a simple build file expands outward. Well-intentioned developers will want to do things idiomatically for the CI tool and will start putting logic in the CI tool's DSL. The more they do this, the more invested they become and the more costly switching becomes. The CI vendor is rarely incentivized to make things truly better once you are captive. Indeed, that would threaten their business model, in which they typically sell you one of two things or both: support or CPU time. Given that business model, it is clear that they are incentivized to make their system as inefficient and difficult to use (particularly at scale) as possible while still retaining just enough customers to remain profitable.

The industry has convinced many people that it is too costly/inefficient to build your own test infrastructure even while burning countless man and cpu hours on the awful solutions presented by industry.

Companies like Blacksmith are smart to address the clear shortcomings in the market, though personally I find life too short to spend on GitHub Actions in any capacity.

> they typically are going to sell you one of two things or both: support or cpu time

At what point does the line between CPU time in GH Actions and CPU time in the actual production environment lose all meaning? Why even bother moving to production? You could just create a new GH action called "Production" that gets invoked at the end of the pipeline and runs perpetually.

I think I may have identified a better canary here. If the CI/CD process takes so much CPU time that we are consciously aware of the resulting bill, there is definitely something going wrong.

> I think I may have identified a better canary here. If the CI/CD process takes so much CPU time that we are consciously aware of the resulting bill, there is definitely something going wrong.

CPU time is cheaper than an engineer's time, so you should be offloading formatting/linting/testing checks to CI on PRs. That adds up when multiplied by hundreds or thousands of PRs, though, so the size of the bill isn't a good canary.

> The last few B2B/SaaS projects I worked on had CI/CD built into the actual product. Writing a simple console app that polls SCM for commits, runs dotnet build and then performs a filesystem operation is approximately all we've ever needed. The only additional enhancement was zipping the artifacts to an S3 bucket so that we could email the link out to the customer's IT team for install in their secure on-prem instances.

That sounds like the biggest yikes.

I invested a few days at the start of building my B2B SaaS company so that on every deploy we automatically branch our actual staging/production databases with Neon, spin up a complete preview environment with all the real dependencies, apply all migrations to the real data, and run end-to-end tests on that environment. Once it was set up, I almost never needed to touch it (it's documented, but my engineers don't need to touch it either).
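The ordering is the important part of that flow, and it can be sketched as below. This is not the actual setup: the step names come from the description above, the step bodies are placeholders supplied by the caller, no Neon API calls are shown, and `run_preview_pipeline` is a hypothetical name.

```python
from typing import Callable, List, Tuple

# A step is a label plus a callable that raises on failure.
Step = Tuple[str, Callable[[], None]]


def run_preview_pipeline(steps: List[Step],
                         teardown: Callable[[], None]) -> List[str]:
    """Run steps in order, stop at the first failure, always tear down.

    Returns the names of the completed steps when every step succeeds.
    The first exception propagates to the caller after teardown has run.
    """
    completed: List[str] = []
    try:
        for name, action in steps:
            action()  # an exception here skips every later step
            completed.append(name)
    finally:
        teardown()  # e.g. delete the database branch and preview environment
    return completed
```

A caller would pass something like `[("branch_database", ...), ("start_preview_env", ...), ("apply_migrations", ...), ("run_e2e_tests", ...)]` and only promote the deploy if the call returns without raising.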

In the few years that setup has been in place, we have never had a single incident, a stark contrast to the volatile on-call experiences I've had at past startups, though that is only partially attributable to this CI/CD process. For me, never being woken up in the night or needing to stress out while reverting deploys that broke things for our users was well worth it.