by jillesvangurp 1541 days ago
The problem with staging environments is that replicating the functionality is easy, but replicating the data, interactions, and behavior of people in a real environment is not. It's better to think in terms of early access releases and some kind of controlled rollout of new software, so you catch bugs and issues before they impact most of your users.

I've seen many projects where the staging environment is a bad joke and most real testing happens in production anyway. These days the alternative is to be more clever about how you roll out software to your production environments. There are various ways of doing this, but it always boils down to running the old and the new software in the same environment and controlling who gets to see what using feature flags, DNS, routing, etc. This is also what you need if you run any kind of A/B tests. I've seen some companies do that, but for most it is more aspirational than actual.
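To make the "controlling who gets to see what" part concrete, here is a minimal sketch of one common way to do it: hash each user into a stable bucket and send a fixed percentage to the new version. The names `bucket`, `pick_backend`, and the salt string are made up for illustration and are not taken from any particular flag or routing product.

```python
import hashlib


def bucket(user_id: str, salt: str = "rollout-v2") -> int:
    """Map a user to a stable bucket in [0, 100).

    The salt is a hypothetical per-rollout label; changing it reshuffles
    which users land in which bucket.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def pick_backend(user_id: str, percent_new: int) -> str:
    """Return "new" for roughly percent_new% of users, "old" for the rest.

    The same user always gets the same answer for a given percentage, and
    raising the percentage only ever moves users from "old" to "new".
    """
    return "new" if bucket(user_id) < percent_new else "old"


if __name__ == "__main__":
    for uid in ("alice", "bob", "carol"):
        print(uid, pick_backend(uid, percent_new=20))
```

Because the bucket depends only on the user ID and the salt, a user does not flip back and forth between versions from one request to the next, which matters if old and new software share the same data.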

For the SaaS company I'm CTO of, I stumbled on a nice mechanism when I realized that our customers' desire for dedicated setups led us to a natural state where we update those last, which makes our multi-tenant environment a natural place to test and provide early access. Likewise, our web app rolls out immediately from our master branch, but we package it up for Android/iOS less frequently because of the release bureaucracy Apple and Google impose. So the branch we package from is effectively our stable release. We have a matching web server for that branch as well, which updates only when we merge to our production branch. The other server uses the same infrastructure (database, Redis, etc.) but updates straight from our master branch. So our staging server is part of our production environment: it serves the same data, is exposed to the same user behavior, etc.

That also makes it easier to verify that both old and new client software work with our latest server as well as with the production servers for our dedicated setups.

1 comment

You need both. In my experience working in SaaS, enterprises expect reliable and stable platforms. A staging environment is the extra safety net that can help prevent shipping a completely broken product. In a staging env you can turn experiments and feature flags on and off before doing that in production.

That said, you should also build the product so that you can run experiments and turn on a feature for only a small subset of production customers, usually the free tier, and then gradually roll it out to everybody else.
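A rough sketch of what that tiered rollout could look like in code, assuming a flag that carries both a set of eligible tiers and a percentage. `Flag`, `is_enabled`, the tier names, and the `new-editor` flag are all hypothetical, not from a specific feature-flag service.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    name: str
    tiers: frozenset  # customer tiers eligible for the feature
    percent: int      # share of eligible customers that get it, 0-100


def is_enabled(flag: Flag, customer_id: str, tier: str) -> bool:
    """Decide whether this customer sees the feature behind `flag`."""
    if tier not in flag.tiers:
        return False
    # Stable per-flag bucket so a customer keeps the same answer over time.
    customer_bucket = zlib.crc32(f"{flag.name}:{customer_id}".encode()) % 100
    return customer_bucket < flag.percent


# Hypothetical rollout plan: a slice of the free tier first, then all of
# the free tier, then everybody else.
STAGES = [
    Flag("new-editor", frozenset({"free"}), 10),
    Flag("new-editor", frozenset({"free"}), 100),
    Flag("new-editor", frozenset({"free", "paid", "enterprise"}), 100),
]

if __name__ == "__main__":
    for stage in STAGES:
        print(stage.percent, sorted(stage.tiers),
              is_enabled(stage, "customer-42", "enterprise"))
```

Moving from one stage to the next is then a config change rather than a deploy, and the same flags can be flipped in staging first.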

Last, a staging env should be considered a production-grade env, so if it breaks there should be an SRE/dev on call ready to jump in and fix it.