
by JnRouvignac 4035 days ago

Interesting question. What would prevent this from happening, indeed?

One reason I can think of is that for GCC all input languages are converted to the same intermediate representation (IR). From there, it's easy to share the same optimisations, code generation, etc.

For refactoring tools, I have never heard of such an IR. The benefits of going for an IR are not obvious either. The drawbacks are huge: a lot more code, the difficulty of building an IR rich enough to represent the semantics of all input languages, and an extra mapping to get back to the original code while preserving indentation and formatting.

For GCC the case is more obvious: you need a representation other than the AST to compile to assembly, optimisations are run on the IR, and there is no need to preserve the original source formatting.

Languages also have very subtle differences in meaning/behaviour, which makes it harder to reuse refactorings across languages.

Maybe it's possible to go with your idea, but I think it would require a big team working on it. Who's willing to spend the money or the time when the current approach seems satisfactory?


TheLoneWolfling 4035 days ago

I always thought that refactoring tools should be compilers.

That way you can reuse all of the parsing code / DCE code / flow analysis / etc / etc.

The extra mapping is something a "normal" (i.e. single-language) refactorer has to do anyway. Assuming it's a half-decent refactoring tool, it's not just doing a find-replace.
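To make the "extra mapping" concrete, here is a minimal sketch of a rename that edits the original text through token positions instead of a blind find-replace. It uses Python's standard `tokenize` module. The function name `rename_identifier` and the sample source are made up for illustration, and the rename is purely lexical: a real tool would also need scope analysis.

```python
import io
import tokenize


def rename_identifier(source: str, old: str, new: str) -> str:
    """Rename NAME tokens equal to `old`, leaving comments, strings
    and all whitespace in the original text untouched."""
    # Character offset at which each line starts, so that a
    # (row, col) token position can be mapped back to a flat offset.
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))

    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            row, col = tok.start  # rows are 1-based, columns 0-based
            start = line_starts[row - 1] + col
            edits.append((start, start + len(old)))

    # Apply edits from the end so earlier offsets stay valid.
    for start, end in reversed(edits):
        source = source[:start] + new + source[end:]
    return source


# Hypothetical input: the comment and the string literal also contain
# the word "total", which a plain find-replace would clobber.
sample = "total = 1  # total so far\nprint('total', total)\n"
renamed = rename_identifier(sample, "total", "count")
```

Only the two identifier occurrences change; the comment, the string and the double space before the comment survive as written.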

The subtle differences / drawbacks you mention are things a compiler also has to deal with. Actually, everything you mention is something a compiler also has to deal with. You could say the same about compilers: there's no actual need for an IR in a compiler, for instance; you can operate on the AST directly. And you don't need to build an IR rich enough to keep the semantics of all languages unless you're supporting multiple languages. Etc. And yet we have GCC.


JnRouvignac 4035 days ago

I agree that the drawbacks apply equally to GCC, and that a simple compiler does not need an IR. Some compiler optimisations, though, cannot exist without one.

However, I was wondering where a refactoring tool would need an IR. I found one: a CFG. For this, I need a three-address code representation. As a hypothesis, let's suppose I can build one that is independent of the input language. Once I have found how to refactor it, I then need to describe which code must be rewritten to what, in the input language. This is the part I find very hard. I have the impression the idea will fall short here.
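As a rough sketch of that hypothesis, here is a language-independent three-address code where every instruction carries the span of original source it came from, and a CFG is built by splitting at leaders. All names (`Instr`, `build_cfg`, `block_span`, the opcodes, the spans) are invented for illustration and are not taken from any existing tool. It only shows that the way back to the source can be kept; it does not solve the hard part of describing the rewrite in the input language.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str      # "assign", "label", "branch", "jump" or "return"
    args: tuple  # operands; a jump/branch target is a label name
    span: tuple  # (start, end) character offsets in the original source


def build_cfg(instrs):
    """Split three-address code into basic blocks and connect them."""
    if not instrs:
        return [], {}
    # A leader starts a block: the first instruction, every label,
    # and every instruction that follows a jump, branch or return.
    leaders = {0}
    for i, ins in enumerate(instrs):
        if ins.op == "label":
            leaders.add(i)
        if ins.op in ("jump", "branch", "return") and i + 1 < len(instrs):
            leaders.add(i + 1)
    starts = sorted(leaders)
    blocks = [instrs[s:e] for s, e in zip(starts, starts[1:] + [len(instrs)])]

    label_block = {b[0].args[0]: i for i, b in enumerate(blocks)
                   if b[0].op == "label"}
    edges = {i: [] for i in range(len(blocks))}
    for i, block in enumerate(blocks):
        last = block[-1]
        has_next = i + 1 < len(blocks)
        if last.op == "jump":
            edges[i].append(label_block[last.args[0]])
        elif last.op == "branch":          # args: (condition, target label)
            edges[i].append(label_block[last.args[1]])
            if has_next:
                edges[i].append(i + 1)     # fall through when not taken
        elif last.op != "return" and has_next:
            edges[i].append(i + 1)         # plain fall through
    return blocks, edges


def block_span(block):
    """Smallest source range covering every instruction in the block."""
    return (min(i.span[0] for i in block), max(i.span[1] for i in block))


# Hypothetical lowering of: x = 0; while (x < 10) x = x + 1; return x;
program = [
    Instr("assign", ("x", "0"), (0, 6)),
    Instr("label", ("loop",), (7, 21)),
    Instr("branch", ("x >= 10", "end"), (7, 21)),
    Instr("assign", ("x", "x + 1"), (22, 32)),
    Instr("jump", ("loop",), (7, 32)),
    Instr("label", ("end",), (33, 42)),
    Instr("return", ("x",), (33, 42)),
]
blocks, edges = build_cfg(program)
```

With spans attached, a transformation found on the CFG can at least point at the region of source it affects, for example `block_span(blocks[2])` for the loop body. Turning the transformed IR back into idiomatic code in each input language is the part left open.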

That would be an interesting experiment :)
