Hacker News new | ask | show | jobs
by sltkr 1842 days ago
Both of these approaches lose to a CPU. The state of the art algorithm is Hashlife [1], which compresses both time and space, and can evaluate billions of generations on a grid of trillions cells in milliseconds.

The GPA approach is really efficient at what it does but ultimately it doesn't scale. For one, it needs 1 bit per cell in the 2D torus, but FPGA have kilobytes or low-megabytes amounts of memory. That makes it hard to simulate a 10,000 x 10,000 grid, let alone a 1,000,000 x 1,000,000 grid. For two, the FPGA explicitly calculates each iteration one-by-one. This is pretty fast in the beginning, and it means you can use it to calculate a billion iterations in a few seconds or a trillion iterations in a few hours, but you can't scale past that.

Hashlife can probably be sped up by GPUs a bit, but it processes a symbolic representation and consequently is quite suited to CPUs. It spends a lot of its time doing hash table lookups (hence the name) which is not a good fit for GPUs and a terrible fit for FPGAs.

1. https://en.wikipedia.org/wiki/Hashlife

1 comments

This reminds me of how I was fascinated by N-body simulations and fractals in high school, and then later found out there are much better ways of calculating both gravity and the Mandelbrot set than the obvious ones.

(i.e. tree methods like Barnes-Hut for gravity and perturbation theory for Mandelbrot)