by john_cogs 1603 days ago

Your impression is correct. We use GitLab.com and notice these incidents as they happen.

The delay in updating the status page is a result of our Incident Management process [0]. We have a Communications Manager on Call (CMOC) who leads communication throughout an incident, and one of their responsibilities is updating the status page. The slight delay between noticing an issue and updating the page is the time it takes for the CMOC to get alerted, assess the situation, and write the communication that is shared there.

I'm not sure how the "updated a few seconds ago" messages are generated, but I'll try to find out once the incident has been resolved.

0 - https://about.gitlab.com/handbook/engineering/infrastructure...

linuxdeveloper 1603 days ago

Why are "Active Incident" and "System Wide Outage" shown on the status page with a green background? Why not red?

At first glance it looks like everything is operational with no issues.

john_cogs 1603 days ago

The color is green because GitLab.com is accessible again.

"Active Incident" remains because our team is still working towards full recovery.

"System Wide Outage" is the description of the incident at its onset.

caioariede 1603 days ago

I noticed that too. Pretty confusing to read "Operational" with a green background and "System Wide Outage" on the left side.

rasz 1603 days ago

Not a "status page" then, but merely "a page where the Communications Manager posts messages after assessing the situation and consulting/getting permission from management".

Lutger 1599 days ago

Why? Because there's a human decision involved?
