|
|
|
|
|
by nlewycky
606 days ago
|
|
Set e-graphs apart from e-matchers. An e-graph is a vector-of-sets, where each set contains equal representations of the same thing. What makes it space efficient is that your representation refers to things by index number in the vector, so if you have "3 * (x + 1)" as an expression, you store "x + 1" in vector #1 and "3 * #1". That way if "x + 1" is also known to equivalent to "load ptr" then you add that to #1 and don't need to add anything to #0. In e-graphs, #0 and #1 are e-classes, the sets holding equivalence classes. There's no limit to what transforms you can use, and let the e-graph hold them efficiently. E-matchers are an algorithm over e-graphs to efficiently search for a pattern in an e-graph, for instance if you want to find "(load ptr) * (load ptr)" then you have to find "#x * #x" for all x in all e-classes, then check whether "load ptr" is a member of e-class #x. This is where you get limits on what you can match. "Pattern matching" style transforms are easy and fast using e-matchers, things like "x * 2" -> "x << 1", but beyond that they don't help. There's an optimizer problem where you have "load ptr" and you solve ptr and figure out it's pointing to a constant value and you replace the load instruction with the constant value. Later, you get to code emission and you realize that you can't encode your constant in the CPU instruction, there's not enough bits. You now need to take the constant and stuff it into a constant pool and emit a load for it. If you had stored them in an e-graph, you could have chosen to use the load you already had. Suppose you wanted to do sparse conditional constant/constant-range propagation but your IR uses an e-graph. You could analyze each expression in the e-class, intersect them, and annotate the resulting constant-range on the whole e-class. Then do SCCP as normal looking up the e-class for each instruction as you go. |
|