Hacker News new | ask | show | jobs
by antognini 399 days ago
Once you have the matrix implementation in Step 2 (Implementation 3) it's rather straightforward to extend your N-body simulator to run on a GPU with Jax --- you can just add `import jax.numpy as jnp` and replace all the `np.`s with `jnp`s.

For a few-body system (e.g., the Solar System) this probably won't provide any speedup. But once you get to ~100 bodies you should start to see substantial speedups by running the simulator on a GPU.

2 comments

> rather straightforward

For the programmer, yes it is easy enough.

But there is a lot of complexity hidden behind that change. If you care about how your tools work that might be a problem (I'm not judging).

Oh for sure there is a ton that is going on behind the scenes when you substitute `np` --> `jnp`. But as someone who worked on gravitational dynamics a decade ago, it's really incredible that so much work has been done to make running numerical calculations on a GPU so straightforward for the programmer.

Obviously if you want maximum performance you'll probably still have to roll up your sleeves and write CUDA yourself, but there are a lot of situations where you can still get most of the benefit of using a GPU and never have to worry yourself over how to write CUDA.

People think high-level libraries are magic fairy dust - "abstractions! abstractions! abstractions!". But there's no such thing as an abstraction in real life, only heuristics.
Not true for regular GPUs like RTX5090. They have atrocious float64 performance compared to CPU. You need a special GPU designed for scientific computations (many float64 cores)
This is confusing: GPUs seem great for scientific computations. You often want f64 for scientific computation. GPUs aren't good with f64.

I'm trying to evaluate if I can get away with f32 for GPU use for my molecular docking software. Might be OK, but I've hit cases broadly where f64 is fine, but f32 is not.

I suppose this is because the dominant uses games and AI/ML use f32, or for AI even less‽

You need cards like the A100 / H100 / H200 / B200. GPUs aren’t fundamentally worse at f64. Nvidia just makes it worse in many cards for market segmentation.