by xab31
1748 days ago
One of our graduate students is thrilled about this paper, although he tends to react that way to any new CS advance that seems sensational and that we barely understand (we do bioinformatics). He said it stood to reason that if it can mmult at 100GB/s/core, then we could matrix multiply 12TB in a minute! Could you translate into practitioner-level language what the practical limitations of this method are? Specifically, what error rates would the approximation induce under some practical scenarios, when would it make sense and not make sense to use it, etc.? There is a complex equation in the paper describing the theoretical error bounds, but I have no idea whether, in some practical scenario such as multiplying normally distributed variables, that would mean a 0.1%, 1%, 5%, or 10% error. Personally, I think it only makes sense to use this kind of method in a real-time algorithm where speed is of the essence and the downstream results of the mmult are themselves used in some other approximation (like many ML applications), and emphatically not to make the process of drawing biological conclusions from painstakingly derived data a few minutes faster for the analyst. I fear that you have handed an impressive, but dangerous, tool to people who don't know what they're doing.
|
To simplify Section 1.1, we help when:
1) You need to perform a matrix product more quickly and can tolerate approximation error
2) You have a training set for the larger matrix
3) The smaller matrix is either a) fixed or b) skinny relative to how tall the larger matrix is.
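One way to turn "can tolerate approximation error" into an actual percentage is to measure it on your own data rather than read it off a bound. The sketch below is only an illustration of that workflow: it does NOT implement the method from the paper. As a stand-in it uses plain uniform column-row sampling, and every name in it (`sampled_matmul`, `relative_error`, `error_sweep`) is hypothetical. It uses only the Python standard library, so it is suitable for small matrices only.

```python
import math
import random


def matmul(a, b):
    """Exact product of a (n x d) and b (d x m), both given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def sampled_matmul(a, b, num_samples, rng):
    """Stand-in approximation (assumption, not the paper's algorithm).

    Draws `num_samples` inner-dimension indices uniformly with replacement,
    sums the matching column-row outer products, and rescales by
    d / num_samples so the estimate is unbiased.
    """
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    d = len(b)
    m = len(b[0])
    idx = [rng.randrange(d) for _ in range(num_samples)]
    scale = d / num_samples
    out = []
    for row in a:
        acc = [0.0] * m
        for k in idx:
            a_ik = row[k]
            b_k = b[k]
            for j in range(m):
                acc[j] += a_ik * b_k[j]
        out.append([scale * v for v in acc])
    return out


def relative_error(exact, approx):
    """Frobenius norm of (exact - approx) divided by Frobenius norm of exact."""
    num = sum((e - p) ** 2 for re, rp in zip(exact, approx) for e, p in zip(re, rp))
    den = sum(e ** 2 for re in exact for e in re)
    return math.sqrt(num / den)


def gaussian_matrix(rows, cols, rng):
    """Matrix of independent standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def error_sweep(n, d, m, sample_counts, seed=0):
    """Return [(num_samples, relative_error), ...] for one random Gaussian pair."""
    rng = random.Random(seed)
    a = gaussian_matrix(n, d, rng)
    b = gaussian_matrix(d, m, rng)
    exact = matmul(a, b)
    return [(s, relative_error(exact, sampled_matmul(a, b, s, rng)))
            for s in sample_counts]
```

For example, `error_sweep(20, 64, 5, [8, 32, 128])` gives one relative error per sample count for a tall Gaussian matrix times a skinny one. The numbers depend heavily on the data and on the approximation being tested, so the useful version of this check swaps in the real method and a held-out sample of your own matrices.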
Re: "an impressive, but dangerous, tool to people who don't know what they're doing."
I believe you are overestimating the usability of my code :). But more seriously, I suspect that people attempting to use our method in contexts it wasn't designed for will quickly discover that they either can't actually call the API the way they wanted to, or that the method is no faster for their purposes. We also characterize our method at least as thoroughly as any approximate matrix multiplication algorithm I'm aware of, and have a variety of (admittedly loose) theoretical guarantees. So I hope that at least those who thoroughly read the paper will have a clear idea of what it can do. Overall, I guess my current thinking is that 1) I'm not sure how to introduce a method any more responsibly, but 2) if I can be of help in ensuring that it gets used well, feel free to reach out.