by ggm 598 days ago
You may be right. I personally think this work is net beneficial. I never expected to be doing MP or threads, but I now do a lot of DNS work (processing end-of-day logs of 300m records per day, farming them out over public DNS resolvers and doing multiple RR checks per FQDN), and I find that MP is less efficient than threads because of this serialisation cost. So, improving threading has shown me I could be 4-5x faster in this solution space, IFF I learn how to use threading.Lock to gatekeep updates on the shared structures.
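
For reference, a minimal sketch of that lock-gated update pattern, using only the standard library. The names `fake_rr_check` and `tally` are hypothetical, and the lookup is a placeholder rather than a real resolver query; the point is only that the slow work runs outside the lock and the shared dict is touched inside it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fake_rr_check(fqdn):
    """Hypothetical placeholder for a real resolver query (assumption)."""
    return "NOERROR" if "." in fqdn else "NXDOMAIN"


def tally(fqdns, workers=8):
    """Count response codes across threads, guarding the shared dict."""
    counts = {}
    counts_lock = threading.Lock()

    def check(fqdn):
        rcode = fake_rr_check(fqdn)  # slow part runs outside the lock
        with counts_lock:            # only the read-modify-write is serialised
            counts[rcode] = counts.get(rcode, 0) + 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(check, fqdns))  # drain the iterator so errors surface
    return counts
```

Holding the lock only around the dict update keeps contention low when the lookups themselves dominate the run time.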

My alternative is to serialise in heavy processes and then incur a post-process unification pass, because the cost of serialise, send/receive, deserialise to unify this stuff is too high. If somebody showed me how to use shared memory (shm) models so that the cost came back down to that of threading.Lock, I'd do the IPC over a shared-memory dict, but I can't find examples and now suspect multiprocessing in Python 3 just doesn't do that (happy, delighted even, to be proved wrong).
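
One possible direction, offered as a sketch rather than a drop-in answer: since Python 3.8 the standard library has `multiprocessing.shared_memory`, which provides raw byte blocks (and a fixed-size `ShareableList`), not a dict. `Manager().dict()` does exist, but it proxies every access through a server process, so it still pays the pickling cost. The sketch below assumes results can be reduced to fixed-size integer counters indexed by slot; `add_to_slot`, `read_slots` and `tally_in_processes` are hypothetical names.

```python
import struct
from multiprocessing import Lock, Process
from multiprocessing.shared_memory import SharedMemory

SLOT = struct.calcsize("q")  # one signed 64-bit counter per slot


def add_to_slot(shm_name, lock, index, amount):
    """Attach to the named block and add `amount` to counter `index`."""
    shm = SharedMemory(name=shm_name)  # attach to an existing block
    try:
        with lock:  # guard the read-modify-write across processes
            (current,) = struct.unpack_from("q", shm.buf, index * SLOT)
            struct.pack_into("q", shm.buf, index * SLOT, current + amount)
    finally:
        shm.close()  # detach only; the creator is responsible for unlink


def read_slots(shm, count):
    """Return the first `count` counters as a plain list."""
    return [struct.unpack_from("q", shm.buf, i * SLOT)[0] for i in range(count)]


def tally_in_processes(jobs, slots):
    """Run one process per (index, amount) job; illustrative, not tuned."""
    shm = SharedMemory(create=True, size=slots * SLOT)
    try:
        shm.buf[:slots * SLOT] = bytes(slots * SLOT)  # zero the counters
        lock = Lock()
        procs = [Process(target=add_to_slot, args=(shm.name, lock, i, n))
                 for i, n in jobs]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return read_slots(shm, slots)
    finally:
        shm.close()
        shm.unlink()
```

Calling `tally_in_processes([(0, 5), (1, 7), (0, 3)], slots=2)` from under an `if __name__ == "__main__":` guard should return one total per slot. Only the block name and the lock cross the process boundary; the counters themselves are never pickled. Mapping FQDNs to slot indices (the "dict" part) is left out and would need its own scheme.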