by Matumio 4137 days ago
How do network cards fail? Simply as if you cut the link? Or can you see error counters increasing and sporadic frame loss that gets worse over time?

Depends on which bit fails, but increases in packet loss are a common early symptom of small components no longer acting within their specs.
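One way to watch for that early symptom on Linux is to sample the interface error counters and check whether the error ratio keeps climbing. The following is a minimal sketch, assuming the standard counters under /sys/class/net/<iface>/statistics/; the helper names (`read_counters`, `error_ratio`, `is_degrading`) and the three-sample window are made up for illustration.

```python
from pathlib import Path

# Counter files Linux exposes under /sys/class/net/<iface>/statistics/.
COUNTERS = ("rx_packets", "rx_errors", "rx_crc_errors", "rx_dropped")


def read_counters(iface, root="/sys/class/net"):
    """Read the current counter values for one interface (Linux only)."""
    stats = Path(root) / iface / "statistics"
    return {name: int((stats / name).read_text()) for name in COUNTERS}


def error_ratio(before, after):
    """Share of frames seen between two snapshots that were counted as errors.

    Assumption: rx_packets counts good frames only, so errored frames are
    added to the denominator. Individual drivers may count differently.
    """
    packets = after["rx_packets"] - before["rx_packets"]
    errors = after["rx_errors"] - before["rx_errors"]
    total = packets + errors
    return errors / total if total > 0 else 0.0


def is_degrading(ratios, min_samples=3):
    """True if the error ratio rose strictly over the last min_samples samples."""
    recent = ratios[-min_samples:]
    if len(recent) < min_samples:
        return False
    return all(a < b for a, b in zip(recent, recent[1:]))
```

In practice you would call `read_counters("eth0")` (the interface name is a placeholder) every few minutes, feed consecutive snapshots to `error_ratio`, and alert when `is_degrading` turns true. A steady nonzero ratio may just mean a marginal cable; a rising one is closer to the pattern described above.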

Network cards are exposed to lots of signal phenomena that are rare inside the chassis. Long cables make pretty good antennas for certain types of RF signals, so the card has to tolerate all kinds of electrical noise, induced power spikes and other miscellaneous garbage. Well-shielded cables can help protect the card, but it's definitely one interface that takes a bit more electrical abuse than the rest.

Components that have been stressed beyond their tolerances a few times can degrade. Signal filters, for example, can end up with a lower noise threshold, which makes it harder for the card to pick the signal out of the noise, which leads to more packet loss. After enough abuse, the threshold drops below a usable level and communication halts.
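To put rough numbers on why a slightly noisier receiver shows up as packet loss: a frame is dropped if any single bit in it is corrupted, since the checksum then fails. The sketch below assumes bit errors are independent, which real noise bursts are not, so treat it as an order-of-magnitude illustration. The function name and the 1518-byte default (a maximum-size untagged Ethernet frame) are my choices, not anything from the card's spec.

```python
import math


def frame_loss_probability(bit_error_rate, frame_bytes=1518):
    """Probability that at least one bit of a frame is corrupted.

    Assumes independent bit errors: P = 1 - (1 - p)^n for n bits.
    """
    if not 0.0 <= bit_error_rate < 1.0:
        raise ValueError("bit_error_rate must be in [0, 1)")
    if bit_error_rate == 0.0:
        return 0.0
    bits = frame_bytes * 8
    # log1p/expm1 keep the result accurate when the error rate is tiny.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    for ber in (1e-12, 1e-9, 1e-6, 1e-4):
        print(f"BER {ber:g}: frame loss {frame_loss_probability(ber):.3g}")
```

The point is the amplification: with roughly twelve thousand bits per full-size frame, the frame loss rate is about four orders of magnitude higher than the bit error rate until it starts to saturate.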

There are lots of factors involved, such as shielding, proximity to radiators, bend radius in cables, cable length, temperature, etc. Whenever I delve into this world, I'm amazed that anything works at all.

> How do network cards fail?

All ways they can. I've found them with blown transistors, dead rats attached, no physical imperfections, etc.

Usually for me it's been some kind of hard failure, e.g. completely dead.

failure modes are all over the map. sometimes they just start dropping more and more packets, sometimes it "looks like it's working" but there's no layer 1 link light, sometimes it's incredibly high latency, sometimes the entire card just disappears from view.
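Two of those modes, no link and the card vanishing entirely, can be told apart from userspace on Linux. This is a rough sketch, assuming the sysfs layout under /sys/class/net/; the function name and the returned strings are invented for illustration. It says nothing about the high-latency or gradual packet-drop cases.

```python
from pathlib import Path


def link_status(iface, root="/sys/class/net"):
    """Classify an interface as missing, unreadable, link up, or no link."""
    dev = Path(root) / iface
    if not dev.exists():
        # The kernel no longer lists the device: the card has "disappeared".
        return "missing"
    try:
        # operstate holds values such as "up", "down" or "unknown".
        state = (dev / "operstate").read_text().strip()
    except OSError:
        return "unreadable"
    return "link up" if state == "up" else f"no link ({state})"
```

Calling `link_status("eth0")` from a cron job and logging the result over time (again, `eth0` is only a placeholder) gives a crude record of when an interface flaps or drops off the bus.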

this mostly happens with the on-board controllers. discrete nics don't fail as often, but we do use high-end nics (intel 10g and 4x 1g)

High-end consumer motherboards often include 2 integrated NICs. Over the last decade I've owned four and had one of the NICs fail after 2-3 years on every single motherboard. Glad to know it's endemic, and Danpat's explanation is fascinating.