by aprdm 3137 days ago
You need a better system in place to prevent bugs from happening.

- Separation between development / staging / production environments.

- Integration tests.

- Service / System Metrics.

- Central logging.

- High availability.

- Alerts.

When you have a solid deployment pipeline, things don't usually break. Errors and regressions are caught in the staging part of the pipeline, and errors in production can be rolled back automatically (and then you add an integration test for the regression!).
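To make the staging gate and automatic rollback concrete, here is a rough sketch in Python. It is not how our pipeline is actually implemented; the `Pipeline` class and the `run_integration_tests` / `production_healthy` callbacks are made-up names standing in for whatever your CI job and monitoring checks really do.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical pipeline: promote a build only if staging passes."""
    run_integration_tests: Callable[[str], bool]  # assumed: runs against staging
    production_healthy: Callable[[str], bool]     # assumed: post-deploy health check
    live_version: str                             # version currently in production
    log: List[str] = field(default_factory=list)

    def deploy(self, version: str) -> bool:
        # Stage 1: deploy to staging and run the integration tests there.
        self.log.append(f"staging: deploy {version}")
        if not self.run_integration_tests(version):
            self.log.append(f"staging: tests failed, {version} not promoted")
            return False

        # Stage 2: promote to production, remembering what to roll back to.
        previous = self.live_version
        self.live_version = version
        self.log.append(f"production: deploy {version}")

        # Stage 3: if production looks unhealthy, roll back automatically.
        if not self.production_healthy(version):
            self.live_version = previous
            self.log.append(f"production: unhealthy, rolled back to {previous}")
            return False
        return True
```

In a real setup the two callbacks would be something like a Jenkins job result and a monitoring query; the point is only that a bad build either never leaves staging or gets reverted without anyone being paged.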

All this devopsy work at my company is done by software engineers with advice from systems engineers. And we do it because neither group wants to get called on the weekends :) It has been working really well. Last year we had 0 calls. Before we had this in place, things would break on a weekly basis.

You can build all of what I mentioned with OSS like:

- Ansible (deployment)

- Jenkins (CI)

- ELK stack (metrics / logging)

- Zabbix (system metrics)

This system has been serving us, on premises, without much maintenance.