by kamaradclimber
2110 days ago
I encountered that issue on my company's Mesos cluster. Here are some details. We migrated our largest application from bare metal to Mesos (https://medium.com/criteo-labs/migrating-arbitrage-to-apache...) and observed that performance was not as good as expected (especially 99th-percentile latency).
Other applications were showing similar behavior. We eventually traced the issue to the CFS bandwidth cgroup, considered several alternatives, and moved to cpusets instead. cpusets give us:
- a better mental model (it's far easier to reason about "dedicated CPUs")
- a net performance gain (5% to 10% lower CPU consumption)
- more consistent latency (if nothing else runs on the same CPUs as your app, you benefit from good scheduling and possibly avoid CPU cache issues)

When the fixed kernel was released, we decided to upgrade to it and keep our new model of CPU isolation.
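
For readers less familiar with the two mechanisms, here is a minimal sketch of what each one configures and how throttling can be spotted. It assumes the cgroup v1 file names (`cpu.cfs_period_us`, `cpu.cfs_quota_us`, `cpuset.cpus`, `cpu.stat`); the helper names are hypothetical, and this is not how Mesos itself applies the settings.

```python
from typing import Dict, Iterable


def cfs_bandwidth_files(cpus: float, period_us: int = 100_000) -> Dict[str, str]:
    """CFS bandwidth: a CPU-time budget per period, usable on any core.

    A limit of `cpus` CPUs means the group may consume cpus * period_us
    microseconds of CPU time in each period before it is throttled.
    """
    return {
        "cpu.cfs_period_us": str(period_us),
        "cpu.cfs_quota_us": str(round(cpus * period_us)),
    }


def cpuset_list(cpu_ids: Iterable[int]) -> str:
    """cpusets: a fixed set of cores, written in the kernel's list syntax.

    Consecutive ids are collapsed into ranges, e.g. [2, 3, 4, 8] -> "2-4,8".
    """
    ranges = []
    start = prev = None
    for cpu in sorted(set(cpu_ids)):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def throttled_ratio(cpu_stat_text: str) -> float:
    """Fraction of CFS periods in which the group was throttled.

    Expects the text of a cgroup v1 `cpu.stat` file, which contains
    `nr_periods` and `nr_throttled` counters, one "name value" pair per line.
    """
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0
```

With these helpers, a 2-CPU container would get a quota of 200000 µs per 100000 µs period under CFS bandwidth, or a list such as "2-3" under cpusets. A throttled ratio that stays high while CPU usage is well below the quota is one symptom worth checking when tail latency looks off.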