Hacker News new | ask | show | jobs
by michelpp 819 days ago
This is very cool and a real interesting read! For those in the comments confused about how this is better, the paper is talking about synthesizing matrix multiplication pipelines in hardware, like an FPGA or ASIC. On a CPU or GPU you won't notice because adds and multiplications take the same amount of time generally, but multiplication units takes up many more transistors, so if you can reduce the circuit complexity you can increase the speed and parallel throughput and reduce power and routing complexity. This approach could be particularly useful for efficient sparse matrix multiplication accelerators.

Another cool way to eliminate multiplication in matrix multiplication is to use different semirings [1]. The Tropical Semiring [2] for example substitutes addition for multiplication and min (or max) for addition. It's still matrix multiplication but with substituted binary operations. The research in this relatively new field of Tropical Algebra [3] is quite active and rich right now, being used for all kinds of optimization problems and in research for optimizing neural networks [4] . This approach also lends itself to hardware synthesis since most FPGA configurable logical blocks can add/min/max in one clock cycle, whereas efficient multiplication requires fixed dedicated on-chip hardware multipliers.

Another way to efficiently remove multiplications with a different but related semiring is to use a Log Semiring [5]. If you have to multiply chains of probabilities (like Markov chains) then the numbers quickly become very small and floating point loses its accuracy to represent the numbers. By scaling the numbers first by taking the log, multiplication becomes addition and addition becomes x + log1p(exp(y - x)).

[1] https://en.wikipedia.org/wiki/Semiring

[2] https://en.wikipedia.org/wiki/Tropical_semiring

[3] https://en.wikipedia.org/wiki/Tropical_geometry

[4] https://proceedings.mlr.press/v80/zhang18i/zhang18i.pdf

[5] https://en.wikipedia.org/wiki/Log_semiring

5 comments

The paper in [4] is absolutely fascinating - I'm very much a neophyte here, but I believe it shows that practically any ReLU network can be represented as a tropical ratio of two tropical polynomials, and thus can be analyzed with geometric principles including visualizations of surfaces. It's been cited in more recent work: https://scholar.google.com/scholar?cites=1003719112553620451... - does anyone know if there's been any significant advancements here?
Whoah, this is what Unified Algebra is all about!

http://www.cs.toronto.edu/~hehner/UA.pdf

So what's the difference between this unified algebra, and universal algebra[0]?

[0] https://en.wikipedia.org/wiki/Universal_algebra

This is what HN is about :)

I understood a fraction but instantly wanted to dive into the topic just after reading such a knowledgeable comment.

> By scaling the numbers first by taking the log, multiplication becomes addition and addition becomes x + log1p(exp(y - x)).

Addition/subtraction in a logarithmic number system is way more expensive than what you would spend on multiplication, especially if you care about correctly rounded results, as the (hardware) LUTs required are rather big.

> By scaling the numbers first by taking the log, multiplication becomes addition and addition becomes x + log1p(exp(y - x)).

Isn't this the same approach as in GF(2^x), which has been in use for decades? The only limitation that comes to mind is the field size.

Finite field log/antilog lookup tables are used for efficient-ish multiplication, similar to addition/subtraction tables used for logarithmic number systems.
somewhat related is the number theoretic transform

https://ieeexplore.ieee.org/abstract/document/1451721