|
|
|
|
|
by grovesNL
698 days ago
|
|
All great points, thank you! I'll need to think about this some more. For context, queries are trying to find the next available equipment for tens of thousands of tasks, so hundreds of thousands of queries isn't too bad relative the to the amount of tasks. Equipment candidates are matched based on their capabilities (some equipment might support A or B, while others support B or C) and tasks might require certain combinations. This might complicate the multiset slightly but it seems like a good idea. The ordered map I'm currently using handles adjacent available intervals by automatically coalescing them together, so I'm definitely taking advantage of that. |
|
> tens of thousands of tasks
Am I misinterpreting, or are you saying that you have tens of thousands of queries and a much lower equipment count? That enables all kinds of solutions. Even on a single machine, multi-thread it. For deletions and insertions, use something like bounded-wait hazard pointers [0] with a bias toward cheap reads. If you have concurrent tasks, horizontal scaling is the name of the game. 10 microseconds is an easy baseline for intra-datacenter networking, easily allowing 10 millisecond reads and mutations even with a distributed solution.
Related, simply caching results might get you the performance you want on a single core if you only have hundreds of pieces of equipment you're searching each time. For each capability, just record the last time you mutated it. You can re-use any query result whose associated results don't have potential mutations. Especially if writes are much less frequent than reads, and especially if you have a bias toward not handing out equipment with all possible capabilities when something cheaper would suffice, most of the time you never have to hit the raw data structure, allowing a single core to handle a much higher query load even with worse data structures.
[0] a decent walkthrough: https://pvk.ca/Blog/2020/07/07/flatter-wait-free-hazard-poin...