|
|
|
|
|
by sltkr
1842 days ago
|
|
Both of these approaches lose to a CPU. The state of the art algorithm is Hashlife [1], which compresses both time and space, and can evaluate billions of generations on a grid of trillions cells in milliseconds. The GPA approach is really efficient at what it does but ultimately it doesn't scale. For one, it needs 1 bit per cell in the 2D torus, but FPGA have kilobytes or low-megabytes amounts of memory. That makes it hard to simulate a 10,000 x 10,000 grid, let alone a 1,000,000 x 1,000,000 grid. For two, the FPGA explicitly calculates each iteration one-by-one. This is pretty fast in the beginning, and it means you can use it to calculate a billion iterations in a few seconds or a trillion iterations in a few hours, but you can't scale past that. Hashlife can probably be sped up by GPUs a bit, but it processes a symbolic representation and consequently is quite suited to CPUs. It spends a lot of its time doing hash table lookups (hence the name) which is not a good fit for GPUs and a terrible fit for FPGAs. 1. https://en.wikipedia.org/wiki/Hashlife |
|
(i.e. tree methods like Barnes-Hut for gravity and perturbation theory for Mandelbrot)