|
|
|
|
|
by hliyan
1672 days ago
|
|
That just reminded me of the most mysterious scaling issue I ever faced. We had a message to disseminate market data for multiple markets (e.g. IBM: 100/100.12 @ NYSE, 101/102 @ NASDAQ etc.). The system performed admirably under load testing (think 50,000 messages per second). One day we onboarded a single new regional exchange and the whole market data load test collapsed. We searched high and low for days without success, until someone figured out that the new market addition had caused the market data message to exceed the Ethernet frame size for the first time. Problem was not at the application layer or the transport, it was data link layer fragmentation! Figuring that out felt like solving a murder mystery (I wasn't the one who figured it out though). |
|