by aldanor
3058 days ago
@ the OP - not to sound hostile, but you write code (like in the example here [1]) that is bound to be slow, just from a glance at it: vstacking, munging pandas indices (and using pandas in general), etc. For it to be fast, you want pure numpy, with as few allocations as possible. I help my coworkers “make things faster” with snippets like this all the time. If you provide me with a self-contained code example (with the data required to run it) that is “too slow”, I’d be willing to try and optimise it to support my point above. Also, have you tried Numba? It may be a matter of just applying a “@jit” decorator and restructuring your code a bit, in which case it may get magically boosted a few hundred times in speed.
[1] https://git.embl.de/costea/metaSNV/blob/master/metaSNV_post....
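To illustrate the allocation point, here is a minimal sketch. It is not taken from metaSNV; the function names and the toy input are hypothetical. It contrasts growing an array with repeated `np.vstack` calls against filling a preallocated array, which should give the same result with a single allocation.

```python
import numpy as np


def stack_rows_vstack(rows):
    """Grow the result one row at a time (hypothetical 'slow' pattern).

    Every np.vstack call allocates a new array and copies all rows
    accumulated so far, so total copying grows quadratically.
    """
    out = np.empty((0, len(rows[0])))
    for row in rows:
        out = np.vstack([out, row])
    return out


def stack_rows_prealloc(rows):
    """Allocate the result once, then write each row in place."""
    out = np.empty((len(rows), len(rows[0])))
    for i, row in enumerate(rows):
        out[i] = row
    return out


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    rows = [np.arange(4) * i for i in range(5)]
    a = stack_rows_vstack(rows)
    b = stack_rows_prealloc(rows)
    print(np.array_equal(a, b))
```

How much this matters depends on the number of rows and their width, so it is worth timing both variants on data shaped like the real input before restructuring anything.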
|
Here is an earlier version (intermediate speed): https://git.embl.de/costea/metaSNV/commit/ff44942f5f4e7c4d0e...
It's not so easy to post the data needed to reproduce a real use case, as it's a few terabytes :)
Here's a simple piece of code that is incredibly slow in Python:
This is not unlike a lot of code I write, actually.