| - Monitor (with Bugsnag) and fix all errors/exceptions like its your religion. No new feature dev if there's an open bug. Write tests. - Use Heroku. Monitor metrics to ensure you don't have major performance issues - Use Datadog. Datadog can monitor and fix many things (web request queue too big -> trigger lambda function to scale up Heroku dynos, Worker queue latency too high -> same thing, scale up worker dynos, memory swapping -> restart dyno). - Spend a lot of time fine tuning your logging, and custom metrics in Datadog. Makes investigating much more pleasurable. - Any issues or exception notifications route to a #devops channel in my slack. Other slack channels include signups, business metrics, daily revenue reports, etc - If something ever happens where you had to intervene to fix it, do a real post-mortem with yourself and try to come up with a way for that to never be a problem again. I also do a lot of remote camping & off-roading without internet. I'm working on a simple little app where I can get paged on my satellite messenger (Garmin Inreach) if something is wrong, and key clients can also ping me. Only trusted contacts can SMS the Garmin Inreach, so I would use Twilio as the communication pipe.
And I've pre-ordered Starlink. My off road truck has an elaborate electrical system (Lithium battery, solar, etc) and I plan to find a way to run the Starlink dish off 12v. Currently working on my home backup plan, which includes hot-standy Mac mini, time machine and cloud backups, home battery backup generator (Ecoflow Delta), Starlink, portable generator, etc. |