|
|
|
|
|
by naasking
3517 days ago
|
|
> For all N:1 mappings, since there is only a single process, there is no need for synchronization, also there is no need for any scheduler as a simple queue suffice. Wouldn't an M:N model simply amount to work stealing among N kernel-thread-local queues? This seems like it should be a pretty straightforward extension to one of the user-level C thread packages. Or are you doing something more elaborate for your research? |
|
Usually these libraries either provide their own queue or rely on underlying event system e.g. epoll/kqueue to manage the state of the light weight threads. Also, the other main part is IO multiplexing, which can be done using select/poll/epoll/kqueue ...
Lets say if they are using some sort of queue, since there is no other thread in the system, thus no synchronization required and its as simple as pushing to the tail of queue and pulling from the head. Now, if I add more threads, now I have to think how I want to synchronize among threads. The straight forward way of doing this is using the same queue and multiplex among kernel threads and use Mutex and CV. However, Mutex has high overhead and is not very scalable (pthread_mutex does not scale well under contention in Linux). What about N-kernel-thread-local queues along with Mutex? Well, if you see the documentation this approach does better but has high overhead as well. What is next? remove the queue and add some sort of lock-free queue, either MPMC, MPSC, or SPSC. This requires some work to determine which one has lower overhead, in my case I have not tested SPSC, and for now settled with MPSC. So the queue part is totally gone, since they probably did not care about all this and used a simple queue.
Next, comes the IO multiplexing. Relying on an epoll instance per kernel-thread is absolutely fine specially for cases where uThreads stick to the same kernel thread, but as soon as I introduce the notion of migration then moving from on kernel-thread to another kenrel-thread means issuing at least two system calls (in case of epoll, deregister from current epoll instance and register to the new one). This has high overhead, so I need to provide some sort of common poller that do multiplexing over multiple kernel threads, but due to having more than one kernel thread, it means connections should be properly synchronized as more than one thread might try to access a connection. Also, it has to be done in a way with low overhead as migrations for my research require to have very low overhead. Thus, the IO multiplexing part should be replaced.
So the main parts are required to be changed, and I believe the effort required to make fundamental changes to an existing system might be more than the efforts required for writing it from scratch. Also for each part implemented, I did performance optimisations and build on top of that which helped to keep the performance at an acceptable level, it would be hard to do the same with an existing system as it requires to isolate various parts and optimise each part, which requires additional effort.
I hope it makes it more clear :)