by ozinenko 2308 days ago

(Early disclaimer: I am an author of the paper and of MLIR, but this is a personal opinion)

I am somewhat surprised by the reaction, but at the same time I expected as much. We chose to make MLIR open source early in its design and development, maybe half a year after the first line was written, because we value open source and believe that a community can help build significantly better things that serve a broader audience. I find it unreasonable to expect a new infrastructure to have the same amount of frontends/optimizers/backends/utilities/bells-and-whistles as a project that has been around for 17 years and has received contributions from hundreds of developers employed by dozens of companies. 15 years ago, LLVM only had a fraction of what it has today... By making MLIR open from the start, we let everybody add the whistles they need, or collaborate on them with other people who also need them. That is sort of the point of an open-source community. Personally, I am happy to work with anybody who shares my needs, but I won't do somebody else's job for them. In a sense, MLIR does not intend to solve any specific compiler problem, but rather to give people tools that help them focus on the problem at hand instead of IR infrastructure, storage, serialization etc.

The alternative would have been to develop it internally for a long time, driven exclusively by internal needs, and then just throw it over the fence into the public. It would have had significantly more "features". I am certain some people would have complained about that process as well.

As for the need for MLIR when LLVM already exists: yes, you could express a lot of things by (ab)using LLVM IR, but sometimes it's worth considering whether you should. Loop optimizations are the canonical example of LLVM IR's limitations [1,2]. Peculiarities of the GPU execution model are another [3,4]. People spent years trying to fit those into LLVM's constraints. Generic passes like DCE, CSE and inlining are (re)implemented on many levels of IRs, which seems more like time-consuming wheel reinvention than building an infrastructure that can be reused for them.
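To make the point about reusable generic passes concrete, here is a minimal sketch of a dead code elimination pass that never inspects what an operation means. It is an illustration under stated assumptions, not MLIR's API: the `Op` record, its fields, and the op names (`hl.*`, `ll.*`) are all invented here, and the pass only handles a single straight-line block in SSA form.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy operation record (hypothetical, not MLIR's data structure)."""
    name: str                                      # opaque to the pass
    results: list = field(default_factory=list)    # names of values it defines
    operands: list = field(default_factory=list)   # names of values it uses
    has_side_effects: bool = False                 # the only property the pass queries


def dead_code_elimination(ops):
    """Drop ops with no side effects whose results are never used.

    Walks the block backwards, so a chain of dead ops is removed in one sweep.
    Assumes a single straight-line block where definitions precede uses.
    """
    live = set()
    kept = []
    for op in reversed(ops):
        if op.has_side_effects or any(r in live for r in op.results):
            live.update(op.operands)
            kept.append(op)
    kept.reverse()
    return kept


# Ops from two made-up "dialects" share one block; the pass treats them alike.
block = [
    Op("hl.const", ["a"]),
    Op("hl.const", ["b"]),
    Op("hl.add", ["c"], ["a", "b"]),
    Op("ll.load", ["d"], ["a"]),        # dead: "d" only feeds a dead op
    Op("ll.mul", ["e"], ["d", "b"]),    # dead: "e" is never used
    Op("hl.return", [], ["c"], has_side_effects=True),
]
optimized = dead_code_elimination(block)
```

The pass depends only on use-def information and a side-effect flag, which is the kind of property a shared infrastructure can expose once instead of every IR level reimplementing the walk.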

TL;DR: if you want a compiler that gives faster code for language X today, MLIR is probably not for you, at least not today; if LLVM already does everything you need, MLIR is probably not for you either. There are plenty of other cases around.

[1] http://polly.llvm.org
[2] http://lists.llvm.org/pipermail/llvm-dev/2020-January/137909...
[3] http://nhaehnle.blogspot.com/2016/10/compiling-shaders-dynam...
[4] http://lists.llvm.org/pipermail/llvm-dev/2019-October/135929...

pizlonator 2308 days ago

Nice things first: I think that MLIR is a great solution to the problem of a reusable IR scaffold (infrastructure, storage, serialization). If you believe that this is a problem, and you don’t do anything that MLIR can’t express at all or can’t express well (OSR at scale, non-SSA forms), and you don’t have a plan to change the scaffold to fit your use case, then I think that MLIR is a pretty nifty achievement.

I just don’t buy the premise because IR scaffolds are something I’m used to building quickly.

Joky 2307 days ago

You are making a good point, but I think you're missing some aspects, and the title hints at it: we are also trying to address today's heterogeneity. With a flexible infrastructure and (hopefully) an ecosystem that lowers the cost of interaction, you can more easily re-assemble custom compilers for specific use cases.

This does not make MLIR the best infrastructure for building an industrial embedded JavaScript compiler, for instance (just like you wrote B3 to move away from LLVM), but I am not convinced that between these two, MLIR is the niche ;-) Time will tell!

pizlonator 2307 days ago

But MLIR is so limited in what it can do - a specific style of SSA, a specific module and procedure structure, etc. Even the features that make it general (regions) represent a specific choice of how to do it.

Great IRs represent an ideal fit between data structures, data flow style, control style, and the domain. LLVM is successful because it fits C so darn well - it’s like SSAified C. I’m not sure what MLIR is an ideal fit for. It just feels like another Phoenix.
