by mitchs
2023 days ago
Unfortunately TCP checksums are hot garbage given switch ASIC design. They are a 16-bit one's complement sum over a packet. If you get two bit flips at the same offset % 16, one going 0 to 1 and the other 1 to 0, the changes cancel and the packet passes the checksum. The problem is that routers slow down the high-speed serial signals from fiber by splitting the bits over a large number of slower signals internally. Often those wider buses are a multiple of 16 bits. For example, one ASIC I know of moves things around in 204-byte chunks. (Might have been 208, it's been a while.) Anyway, if there is a defect in one of those parallel elements, it will always flip bits at the same offset mod 204 bytes, which is the same position mod 16 bits. If the hardware is degraded enough, it can end up flipping two bits in the same position, and that has a fairly good chance of passing the checksum.

Ethernet has proper CRCs on packets, which are a lot less vulnerable to shenanigans like this, but unfortunately those can end up being checked on the way in, discarded, and then regenerated on the way out of a router. If anything is corrupted in the middle of the switch ASIC, nothing notices and the packet gets passed along.

I once helped troubleshoot an issue in our network where a BGP packet was corrupted in this way. The flipped bits ended up causing a more specific route to be generated, and we had the world's weirdest BGP route hijack within the bounds of our own data center.
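To make the cancellation concrete, here is a rough Python sketch of the failure mode. It assumes the 204-byte stride mentioned above and uses a plain RFC 1071 style checksum over a made-up payload. The helper names (`internet_checksum`, `flip_bit`) and the payload contents are purely illustrative, and the real TCP checksum also covers a pseudo-header, which is left out here.

```python
STRIDE_BYTES = 204               # chunk width from the comment (assumed, might be 208)
STRIDE_BITS = STRIDE_BYTES * 8   # 1632 bits, a multiple of 16


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy with one bit flipped; bit 0 is the MSB of byte 0."""
    out = bytearray(data)
    out[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(out)


# Hypothetical 408-byte payload spanning two 204-byte chunks.
buf = bytearray(i % 251 for i in range(2 * STRIDE_BYTES))
buf[0] |= 0x10               # bit 3 of chunk 0 is set
buf[STRIDE_BYTES] &= 0xEF    # bit 3 of chunk 1 is clear
payload = bytes(buf)

# A stuck lane flips bit 3 in both chunks: one 1->0 and one 0->1.
corrupted = flip_bit(flip_bit(payload, 3), 3 + STRIDE_BITS)
undetected = internet_checksum(corrupted) == internet_checksum(payload)

# Same two positions, but both bits start at 0, so both flips go 0->1.
buf[0] &= 0xEF
payload_same = bytes(buf)
corrupted_same = flip_bit(flip_bit(payload_same, 3), 3 + STRIDE_BITS)
detected = internet_checksum(corrupted_same) != internet_checksum(payload_same)

print("opposite-direction flips slip through:", undetected)
print("same-direction flips are caught:", detected)
```

The opposite-direction case goes unnoticed because one flip adds 0x1000 to the sum and the other subtracts it. When both bits flip the same way, the sum moves by 0x2000 and the checksum catches it, which is roughly why "fairly good chance" works out to about a coin flip for random data.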