|
> “You have to ensure that it falls back to C code that uses Int64 for indexing the arrays. I am sure you are not saying that you do sparse multiplications of this size in pure python.” For csc and csr matrices at least, these operations typically iterate the underlying indices, indptr and data arrays, and csc `nonzero` uses len(indices), which both relies on (eventually) the C-level call to malloc that defined `indices` (and so uses the systems address space precision, and would never report number of elements in a lower precision int than what the platform supports for memory addressing), and returns this as an infinite precision Python int. Afterwards it only uses arrays of indices, not integers holding sizes. Long story short is that at least for csc matrices, the issue you describe wouldn’t be possible internally to scipy’s C operations, as you’d always be dealing with an integer type large enough for any possible contiguous array length that can be requested on that platform (and the nonzero items are stored in contiguous arrays under the hood). On my team we are not doing pure Python ops on the sparse matrices, rather we needed customized weighted operations (for a sparse search engine representation that weights bigrams, trigrams, trending elements, etc., in customized ways) and some set operations to filter rows out of sparse matrices. So we basically rip the internal representation (data, indices, and indptr) out of csc matrices and pass them into a toolkit of numba functions that we have spent time optimizing. |
The code that will get called for a multiply is this https://github.com/scipy/scipy/blob/master/scipy/sparse/spar... and https://github.com/scipy/scipy/blob/master/scipy/sparse/spar...
It's important that decisions at the python level trickles down to the correct choice when it comes down to this level.
On a 64 bit architecture one would expect that using 64bit int arrays for indices and indptr would ensure that. But thats not the way it works. We regularly encountered cases where it would call the code corresponding to int32. I know why and have special checks and jump hoops to prevent this.
Thats besides the point, with static types I wouldn't need to do this, the compiler would take care of it.
I appreciate your effort to dig through the logic. You have spent time speaking at length in the comment above but unfortunately said little. Malloc has nothing to do with it. Your third paragraph is manifestly false. Why do I say so ? Because I deal with this everyday and have counterexamples.
I didnt mean to ask you to find out. Apologies if I wasted your time. I already know why the type mismatch happens. My point was to demonstrate that a lot of manual wading is needed to ensure that it finally bottoms out by calling native code with correct type.