I'm researching DevOps and what the dev team can do to improve the Ops team's efficiency and productivity. I already have some feedback from local teams, but I would like to hear what your pain points are.
I work mostly with deployments and updates to our environments, and the most helpful thing my developers can do for me is provide validation steps to confirm the state after each change.
We've caught issues ranging from expired passwords, missing files, files with incorrect permissions and write failures to sync issues and other obscure gotchas.
Catching a failed step during deployment can sometimes prevent a huge rollback effort and save a lot of time.
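To make that concrete, here is a minimal sketch of what scripted validation steps could look like, in Python. The function names (`check_file_exists`, `check_mode`, `check_writable`, `run_validation`) are hypothetical and made up for illustration, and the checks only cover the file-related failures mentioned above, not passwords or sync.

```python
import os
import stat
import tempfile


def check_file_exists(path):
    """The file must exist after the deploy."""
    if os.path.isfile(path):
        return True, f"file exists: {path}"
    return False, f"missing file: {path}"


def check_mode(path, expected_mode):
    """The file must have exactly these permission bits, e.g. 0o640."""
    try:
        actual = stat.S_IMODE(os.stat(path).st_mode)
    except OSError as exc:
        return False, f"cannot stat {path}: {exc}"
    if actual == expected_mode:
        return True, f"mode {oct(actual)} on {path}"
    return False, f"mode {oct(actual)} on {path}, expected {oct(expected_mode)}"


def check_writable(directory):
    """Actually write a temp file, so real write failures are caught."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError as exc:
        return False, f"cannot write to {directory}: {exc}"
    return True, f"directory is writable: {directory}"


def run_validation(checks):
    """Run every (function, args) pair and return the failure messages."""
    failures = []
    for check, args in checks:
        ok, message = check(*args)
        print(("PASS " if ok else "FAIL ") + message)
        if not ok:
            failures.append(message)
    return failures


if __name__ == "__main__":
    # Demo against a throwaway directory; a real run would list deployed paths.
    demo_dir = tempfile.mkdtemp()
    demo_file = os.path.join(demo_dir, "app.conf")
    open(demo_file, "w").close()
    os.chmod(demo_file, 0o640)
    run_validation([
        (check_file_exists, (demo_file,)),
        (check_mode, (demo_file, 0o640)),
        (check_writable, (demo_dir,)),
    ])
```

Running something like this after each deployment step, and stopping as soon as `run_validation` returns a non-empty list, is one way to catch a failed step before it turns into a rollback.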
I always feel that being scared of deploying on a Friday is a massive red flag that the deployment process does not have good automated testing, doesn't use canaries and does not have seamless rollbacks.
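For anyone unfamiliar with the idea, a canary rollout with automatic rollback can be sketched roughly like this. This is only an illustration under assumptions: `set_traffic_percent` and `get_error_rate` are hypothetical callables you would wire to your own load balancer and monitoring, and the step sizes and threshold are arbitrary.

```python
def canary_rollout(set_traffic_percent, get_error_rate,
                   steps=(1, 10, 50, 100), max_error_rate=0.01):
    """Shift traffic to the new version step by step.

    Rolls back (0% traffic to the new version) on the first error-rate
    reading above max_error_rate. Returns True if the rollout completed.
    """
    for percent in steps:
        set_traffic_percent(percent)
        # A real system would wait here and let metrics accumulate.
        if get_error_rate() > max_error_rate:
            set_traffic_percent(0)  # rollback: old version serves everything
            return False
    return True
```

The point is that only a small share of users ever sees a bad release, and the rollback is a single cheap operation rather than a manual effort.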
Even now, we are still scared of deploying on Fridays.
What are we missing to lose this fear? Isn't DevOps mature enough? How are big enterprises like Google, Facebook and Amazon handling it?
We deploy on Fridays because our app is used only from Monday-Friday and if anything goes wrong we have the whole weekend to fix it, but we've never actually had to do that.
Containers are almost always worth using for local dev environments. I don't recommend containers in production unless you're going all in on a container scheduler (like Kubernetes). Without running multiple containers on the same virtual machine (and an orchestrator performing the work for you), you're adding an additional layer of abstraction with no benefit.
Thanks for the reply, but can you explain (or provide some reference for) why I shouldn't use containers alone in production?
I know it adds a layer of abstraction. However, from a dev perspective I get some benefits, such as a byte-for-byte compatible test/prod environment, so I don't have to worry about dependencies, for example.
Just things like not requiring complex setup steps or wiring up many different services with complicated configurations, and keeping things stateless (as much as possible).
> Also, are you talking about stopping the CI/CD flow? Why not provide a way to keep that flow in production?
Not sure what you mean, but slow tests can be a major problem. Most of the time this is because no thought went into avoiding the same setup steps being repeated zillions of times. As new tests are constantly added, it gets harder and harder to keep builds/CI fast.
"How do we develop/maintain complex software and at the same time keep it simple to set up, without a huge bottleneck in the Ops team?"
If state(ful) is a problem, how about KILLING stateful?
Maybe we need to (re)think how we are handling microservices? If so, that's a good hypothesis to validate from a dev perspective.
Thanks for the tips. However...
We know that specific requirements are unrealistic for startup-like and small enterprises, right? So do you have any suggestions for techniques we can apply to handle dynamic requirements? Maybe the real question is: what software architectures/patterns make this easier for Ops?
I was half-kidding with my reply, because a company should realize when there is a bottleneck and address it as long as the resources are available.
But this isn't always the case, so here's what a dev can do:
- Know basic server administration. Nothing too crazy:
  - User roles and permissions setup.
  - Database access and setup.
  - Webserver access and setup.
- Be the master of your dev environment. If the organization uses virtualization for everything, learn how to manage those virtual machines, or at least know where to find the config files and the documentation in case you need to change them. Read through the config files so you at least feel confident navigating them.
- Know what happens when code is deployed. Where does it go? Which servers? How do you log into those servers to debug issues? Where are the logs? Ask what you'd need to know if you were suddenly put in charge of the project as the sole developer-webmaster type person.
I think if a developer is able to do the above, then their devops team will think more of them and maybe be more willing to help when things aren't working out.