by csense 4862 days ago
> significantly better speed than a non-threaded memcpy()

Really? I always thought a single core could saturate the available memory bandwidth (unless you're on an unusual architecture like NUMA). If you're seeing a multithreaded memcpy that performs better, maybe you're just taking memory bandwidth away from other processes (since AFAIK memory bandwidth is probably allocated on a per-thread basis), or maybe you're getting more CPU cache because you're running on multiple cores?

This would be interesting to investigate.

1 comments

I'm not sure why, but yes, I'm seeing these speedups. You can see them too at http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks if you pay attention to the results for a compression ratio of 1 (compression disabled). If you find a good explanation for why this is happening, I'm all ears.
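
For anyone who wants to check the single-threaded vs. multithreaded copy question on their own machine, here is a rough sketch of a benchmark. It is not how Blosc does it; the names chunked_copy and bandwidth_gb_s are made up for illustration, and it assumes CPython, where ctypes releases the GIL during the memmove call so the chunks can actually be copied in parallel. The 64 MiB buffer size is also an assumption, chosen to be larger than typical CPU caches.

```python
import ctypes
import threading
import time


def chunked_copy(dst, src, nbytes, nthreads):
    """Copy nbytes from src to dst, splitting the range across threads."""
    dst_addr = ctypes.addressof(dst)
    src_addr = ctypes.addressof(src)
    # Ceiling division so the chunks cover the whole buffer.
    chunk = max(1, (nbytes + nthreads - 1) // nthreads)

    def worker(offset):
        size = min(chunk, nbytes - offset)
        # ctypes drops the GIL for the duration of this foreign call,
        # so several workers can be inside memmove at the same time.
        ctypes.memmove(dst_addr + offset, src_addr + offset, size)

    threads = [threading.Thread(target=worker, args=(off,))
               for off in range(0, nbytes, chunk)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def bandwidth_gb_s(nbytes, nthreads, repeats=5):
    """Return the best observed copy rate in GB/s (bytes copied / time)."""
    src = (ctypes.c_char * nbytes)()
    dst = (ctypes.c_char * nbytes)()
    # Touch every page up front so page faults are not part of the timing.
    ctypes.memset(src, 1, nbytes)
    ctypes.memset(dst, 0, nbytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        chunked_copy(dst, src, nbytes, nthreads)
        best = min(best, time.perf_counter() - start)
    return nbytes / best / 1e9


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # assumed: big enough to exceed the caches
    for n in (1, 2, 4):
        print(f"{n} thread(s): {bandwidth_gb_s(size, n):.2f} GB/s")
```

If the rate stays flat as the thread count goes up, one core is already saturating the memory bus on that machine. If it rises, that matches the speedups reported above, though the numbers include thread startup cost and do not say which of the explanations (cache, bandwidth allocation, or something else) is responsible.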