| HN Mirror


	by nuisance-bear 1895 days ago
	Tools to make GPU development easier are sorely needed. I foolishly built an options pricing engine on top of PyTorch, thinking "oooh, it's a fast array library that supports CUDA transparently". Only to find out that array indexing is 100x slower than numpy.

3 comments

eslaught 1894 days ago

You might be interested in Legate [1]. It supports the NumPy interface as a drop-in replacement, supports GPUs and also distributed machines. And you can see for yourself their performance results; they're not far off from hand-tuned MPI.

[1]: https://github.com/nv-legate/legate.numpy

Disclaimer: I work on the library Legate uses for distributed computing, but otherwise have no connection.

sideshowb 1894 days ago

Interesting find about the indexing. I just had the opposite experience, swapped from numpy to torch in a project and got 2000x speedup on some indexing and basic maths wrapped in autodiff. And I haven't moved it onto cuda yet.

nuisance-bear 1894 days ago

Here's an example that illustrates the phenomenon. If memory serves me right, index latency is superlinear in dimension count.

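   import time
   from itertools import product

   import torch

   N = 100

   ten = torch.randn(N, N, N)
   arr = ten.numpy()  # NumPy view sharing the tensor's memory

   def indexTimer(val):
       # Time N**3 single-element lookups, one Python-level index each.
       start = time.perf_counter()
       for i, j, k in product(range(N), repeat=3):
           x = val[i, j, k]
       end = time.perf_counter()
       print('{:.2f}'.format(end - start))

   indexTimer(ten)  # torch.Tensor
   indexTimer(arr)  # numpy.ndarray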
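
For comparison, here is a minimal sketch of the batched alternative. It assumes the cost above is mostly a fixed per-call overhead rather than the memory access itself, which I have not verified. The names `M`, `lookup_loop` and `lookup_batched` are made up for illustration. It fetches the same elements once with a Python loop and once with a single advanced-indexing call, then checks that the results match.

```python
import time

import torch

N = 100
M = 10_000  # number of random lookups; arbitrary value for illustration

torch.manual_seed(0)
ten = torch.randn(N, N, N)
idx = torch.randint(0, N, (M, 3))  # each row is one (i, j, k) triple


def lookup_loop(t, triples):
    # One Python-level index per element; each lookup yields a 0-dim tensor.
    return torch.stack([t[i, j, k] for i, j, k in triples])


def lookup_batched(t, index):
    # A single advanced-indexing call that gathers all M elements at once.
    return t[index[:, 0], index[:, 1], index[:, 2]]


start = time.perf_counter()
slow = lookup_loop(ten, idx.tolist())
t_loop = time.perf_counter() - start

start = time.perf_counter()
fast = lookup_batched(ten, idx)
t_batched = time.perf_counter() - start

# Both approaches must return exactly the same values.
assert torch.equal(slow, fast)
print('loop: {:.4f}s  batched: {:.4f}s'.format(t_loop, t_batched))
```

Timings will vary by machine and PyTorch version, so treat the printed numbers as a rough indication only.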

TuringNYC 1894 days ago

> built an options pricing engine on top of PyTorch

I'd love to hear more about this! Do you have any posts or write-ups on this?
