| > multi-threading This is one of those places where the details matter a ton. E.g. low contention hazard pointers are nearly free (under 10 clock cycles per read), and if writes are less common than reads (seems likely? you present 10-100 reads for available equipment before somebody chooses 1 or 0 of the available options) you'll barely notice the overhead. Synchronization only gets expensive when you have contention. Examining the problem of allocating equipment, how often do you actually have contended writes on the same piece of equipment? If it's very often, presumably the top few matches are nearly interchangeable (or, even if they aren't, milliseconds separate who wins the non-interchangeable equipment, and from a user's perspective if you're not running a stock exchange they don't know if they didn't win because the equipment wasn't available or because you have a few milliseconds of fuzz behind the scenes). Can you tweak the business requirements and uniformly choose from the top k matches when displaying results to readers to artificially reduce contention? Adding onto one of the previous ideas, even if tasks are prioritized, if outside parties don't communicate those priorities there isn't much practical difference over the span of a few milliseconds between low and high priority tasks. Sure, if a high priority task has been waiting longer than a few milliseconds you can prioritize it over low priority tasks, but you have a few milliseconds of slop available to quickly execute a "maybe wrong" solution and increase system throughput. > caching doesn't work That happens. It's almost always possible to construct malicious workloads against any caching strategy. If your real workload isn't very cachable, that's annoying, but it's fine. Did you happen to measure the relative performance of a cache hit vs a cache miss? Your perf budget is fairly tight, and I'd be curious to know if the lack of improvement is because the caches aren't optimized or because the misses reduce the gains. |