Hacker News new | ask | show | jobs
by mickronome 3218 days ago
I've always thought of compilers as usually lossy graph rewriters with input and output data structures usually being 'flatter' in some sense. Maybe a both simplistic and vague model, but it has served me well enough the few times I needed to build one.

This isn't meant to validate or invalidate any other view or definition, but I'm curious if there are any good counter examples or theoretical reasons for characterising compilers differently ?

Quite naturally, all compilers might not be implemented with explicit graphs and their subsequent rewrites, but implicitly both input and output represent a set of statements encoding a specific set of truths and actions, all of them which is intrinsically related to their various contexts.

In this light there is really no distinction between high level and low level targets, but only between various levels of information loss and how explicit the actions and truths are expressed in the input and output.

It might be worth noting that most of the perceived information loss, except for names, is only due to human perception and limitations. State of the art decompilers can in many cases recover a surprising amount of the original types and code structures, albeit at considerable computational costs.

2 comments

Graph rewriting turns out to be a rather difficult problem, and many optimizations are better recast in other frameworks. For example, instruction scheduling (on superscalar processors) is pretty much a textbook example of the job scheduling problem. Loop optimizations can be most easily expressed in the polyhedral loop optimization model. Decisions like inlining can be framed as high-dimension, highly non-linear, multi-goal optimization (in the mathematics sense) problems.
I can't think of any reason to contradict that as a abstract description of a compiler, although the standard - Practical? - definition of a compiler is probably the more useful of the two.

Graph rewriting is probably a little bit niche, e.g. AFAIK most compilers don't describe what they do as graph rewriting. Perhaps because many compilers do a lot of their work on linear data structures (which are part of a larger graph) like Basic Blocks.