| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by kronin 2990 days ago
	Seems that metrics providing visibility into the "network connectivity was flaky", like looking at response times (particularly 95/99 percentile) and digging into the pod, which gives you the node, would have isolated the problem pretty quickly to a single node. If a problem is isolated to a node, first thing to look at would be node logs. Would that pattern not have worked in this case?