I don't have a dedicated benchmark for these primitives, but we use them in a database that processes petabytes of data [1] and we haven't found any specific bottlenecks.
Most of the performance comes down to the sync Mutex in use. I can imagine that switching between the std Mutex, parking_lot's Mutex, and perhaps a spin lock in some scenarios could yield better performance. Mea has an abstraction (src/internal/mutex.rs) for this switch, but I haven't implemented a feature flag for it since the current performance is acceptable for our use case.
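To make the idea of a swappable lock concrete, here is a minimal sketch of what such an abstraction could look like. It is not Mea's actual src/internal/mutex.rs: the `imp` module, the wrapper's shape, and the `parking_lot` feature name are all assumptions for illustration.

```rust
// Hypothetical sketch: one wrapper type, two interchangeable backends.
// The `parking_lot` feature name is an assumption, not an existing Mea flag.

#[cfg(not(feature = "parking_lot"))]
mod imp {
    use std::sync::{Mutex as StdMutex, MutexGuard};

    /// Wrapper over the standard library mutex.
    pub struct Mutex<T>(StdMutex<T>);

    impl<T> Mutex<T> {
        pub fn new(value: T) -> Self {
            Mutex(StdMutex::new(value))
        }

        /// Locks and ignores poisoning, so both backends expose the same
        /// infallible `lock()` (parking_lot's mutex has no poisoning).
        pub fn lock(&self) -> MutexGuard<'_, T> {
            self.0.lock().unwrap_or_else(|e| e.into_inner())
        }
    }
}

#[cfg(feature = "parking_lot")]
mod imp {
    /// Wrapper over parking_lot's mutex (requires the external crate).
    pub struct Mutex<T>(parking_lot::Mutex<T>);

    impl<T> Mutex<T> {
        pub fn new(value: T) -> Self {
            Mutex(parking_lot::Mutex::new(value))
        }

        pub fn lock(&self) -> parking_lot::MutexGuard<'_, T> {
            self.0.lock()
        }
    }
}

// Callers only ever name this type, so the backend can change in one place.
pub use imp::Mutex;
```

With this shape the rest of the crate calls `Mutex::new(..)` and `.lock()` without knowing which backend was compiled in, which is what would make benchmarking the alternatives cheap.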
The internal semaphore's implementation could also be improved. Currently, to keep the code safe, I implement the linked list with `Slab<Node>` (see src/internal/waitlist.rs for details). Using an intrusive linked list like [2] may help, but that's not always a net win and would take much more time to get right.
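For readers unfamiliar with the slab-backed approach, the following is a rough sketch of a doubly linked wait list that stores indices instead of pointers, which is what keeps it free of `unsafe`. It is not the code in src/internal/waitlist.rs: the `WaitList` type and its methods are hypothetical, and a small `Vec`-based slot table stands in for the slab crate's `Slab<Node>` so the example depends only on std.

```rust
/// A node links to its neighbours by slot index rather than by pointer.
struct Node<T> {
    value: T,
    prev: Option<usize>,
    next: Option<usize>,
}

/// Hypothetical index-linked list; `slots` + `free` stand in for `Slab<Node<T>>`.
pub struct WaitList<T> {
    slots: Vec<Option<Node<T>>>,
    free: Vec<usize>,
    head: Option<usize>,
    tail: Option<usize>,
}

impl<T> WaitList<T> {
    pub fn new() -> Self {
        WaitList { slots: Vec::new(), free: Vec::new(), head: None, tail: None }
    }

    /// Appends a waiter and returns its key, reusing a vacant slot if any.
    pub fn push_back(&mut self, value: T) -> usize {
        let node = Node { value, prev: self.tail, next: None };
        let key = match self.free.pop() {
            Some(k) => {
                self.slots[k] = Some(node);
                k
            }
            None => {
                self.slots.push(Some(node));
                self.slots.len() - 1
            }
        };
        match self.tail {
            Some(t) => self.slots[t].as_mut().unwrap().next = Some(key),
            None => self.head = Some(key),
        }
        self.tail = Some(key);
        key
    }

    /// Unlinks a waiter by key in O(1), e.g. when its future is dropped.
    /// Returns `None` if the key is vacant.
    pub fn remove(&mut self, key: usize) -> Option<T> {
        let node = self.slots.get_mut(key)?.take()?;
        match node.prev {
            Some(p) => self.slots[p].as_mut().unwrap().next = node.next,
            None => self.head = node.next,
        }
        match node.next {
            Some(n) => self.slots[n].as_mut().unwrap().prev = node.prev,
            None => self.tail = node.prev,
        }
        self.free.push(key);
        Some(node.value)
    }

    /// Removes the longest-waiting entry (FIFO order).
    pub fn pop_front(&mut self) -> Option<T> {
        let head = self.head?;
        self.remove(head)
    }
}
```

The trade-off is the one described above: every link traversal goes through a bounds-checked index into a separately allocated table, whereas an intrusive list embeds the links in the waiter itself at the cost of `unsafe` code and pinning concerns.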
Interesting. Thanks! I've been experimenting a bit with my keepcalm library. I have some experimental async concurrency primitives in there but I'd like to compare with what you've got here to potentially replace them.
[1] https://www.scopedb.io/blog/manage-observability-data-in-pet...
[2] https://github.com/Amanieu/intrusive-rs