

	by dayeye2006 1036 days ago
	Yes. But some algorithms cannot benefit that much from the GPU. In my field, mathematical optimization, lots of algorithms rely on sparse matrix operations and take many iterations to converge.


modeless 1036 days ago

Would this help? https://jax.readthedocs.io/en/latest/jax.experimental.sparse...


marmaduke 1036 days ago

Nope, it's super slow for large sparse matrices. It's even faster to implement some operations with generic scatter/gather than to use the built-in sparse module.


londons_explore 1036 days ago

Have you investigated why? I know that many projects have an "implement first, optimize later" approach, and the lesser-used functions might be far from optimal.

Back in the tensorflow days, I had this issue and submitted a patch that gave a ~50x speedup for my use case. It's always better to optimize the base function rather than have 100 people all manually working around the same performance issue.


marmaduke 1036 days ago

Because they use a funny format (BCOO). I'm not mocking, it must be a solid choice for some reasons, like sparsification or other fancy stuff. But for large matrices, even with batches (i.e. multiplying with a tall dense matrix), it doesn't match an equivalent scatter (x.at[idx].add(vals)), which is itself several times slower than the equivalent OpenCL (on an A40).
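For anyone wondering what the scatter version looks like, here is a minimal sketch, not the exact code I benchmarked. It assumes the matrix is stored as plain COO triplets; the helper name `coo_matmul` and the tiny 3x4 example are made up for illustration. It multiplies a sparse matrix with a dense one using a gather plus `.at[].add`, and builds the same matrix as a BCOO to check that the results agree.

```python
import jax.numpy as jnp
from jax.experimental import sparse


def coo_matmul(rows, cols, vals, dense, n_rows):
    """Compute A @ dense for a COO matrix A using gather + scatter-add.

    Hypothetical helper for illustration; rows/cols/vals are the COO triplets.
    """
    # Gather the rows of `dense` addressed by each nonzero's column index
    # and scale them by the nonzero values.
    contrib = vals[:, None] * dense[cols]
    # Scatter-add every contribution into its output row; repeated row
    # indices accumulate.
    out = jnp.zeros((n_rows, dense.shape[1]), dtype=dense.dtype)
    return out.at[rows].add(contrib)


# Small made-up 3x4 matrix with four nonzeros.
rows = jnp.array([0, 0, 1, 2])
cols = jnp.array([0, 3, 1, 2])
vals = jnp.array([1.0, 2.0, 3.0, 4.0])
dense = jnp.arange(8.0).reshape(4, 2)

y_scatter = coo_matmul(rows, cols, vals, dense, n_rows=3)

# Same product through the built-in BCOO format, for comparison.
A = sparse.BCOO((vals, jnp.stack([rows, cols], axis=1)), shape=(3, 4))
y_bcoo = A @ dense
```

This only shows that the two give the same answer; it says nothing about speed, which you would have to time on your own matrices and hardware. If you wrap `coo_matmul` in `jax.jit`, `n_rows` needs to be marked static.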
