by sillycross 1745 days ago
I'm not an LLVM expert, but I don't agree with some of your arguments.

> My side of the code generator had to recognize when a variable had already been defined and keep track of its pointer

For a human, it's natural to write code that references each variable by its name. For a compiler, however, referencing a variable by its string name is error-prone and inefficient (think about shadowing, for example). The natural way to reference an entity is by its object pointer, which is what LLVM does. This is especially true considering that LLVM is designed to perform various complex transformations.
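To make the shadowing point concrete, here is a minimal sketch in Python. It is a toy model, not LLVM's API: the `Value` and `Scopes` classes are hypothetical names invented for illustration. The front end resolves a name to an object once, and from then on only the object identity matters.

```python
class Value:
    """A toy IR value. Its identity, not its name, is what distinguishes it."""
    def __init__(self, name):
        self.name = name  # kept only as a debugging label


class Scopes:
    """A front-end symbol table: a stack of name -> Value dictionaries."""
    def __init__(self):
        self.stack = [{}]

    def push(self):
        self.stack.append({})

    def pop(self):
        self.stack.pop()

    def define(self, name):
        value = Value(name)
        self.stack[-1][name] = value
        return value

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.stack):
            if name in scope:
                return scope[name]
        raise KeyError(name)


scopes = Scopes()
outer_x = scopes.define("x")
use_before = scopes.lookup("x")   # resolves to the outer x

scopes.push()
inner_x = scopes.define("x")      # shadows the outer x
use_inside = scopes.lookup("x")   # resolves to the inner x
scopes.pop()

use_after = scopes.lookup("x")    # the outer x is visible again

# Both variables carry the label "x", yet they are different objects.
print(outer_x.name == inner_x.name, outer_x is inner_x)  # prints: True False
```

Once the lookups are done, the three uses hold direct references, so a later transformation never has to ask which "x" was meant.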

> There is a pass called mem2reg that will convert to SSA, but it needs you to allocate and store variables in memory (instead of in registers).

The purpose of mem2reg is to make your job easier. It's weird to say that it "needs" you to create allocas for your variables: that is what it allows you to do, for your own convenience. If you prefer to generate PHI nodes directly, you can just do so.
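As a rough illustration of what that convenience buys, the sketch below promotes stack slots to plain values for a single straight-line block. It is a toy written in Python under stated assumptions: the tuple instruction format and the `promote_single_block` function are hypothetical, and this is not how LLVM implements mem2reg. The real pass also handles multiple blocks and inserts PHI nodes where control flow merges, which this sketch does not attempt.

```python
def promote_single_block(instructions):
    """Replace each load from a stack slot with the value last stored to it.

    Instructions are tuples (toy format, assumed for this sketch):
      ("alloca", slot)
      ("store", value, slot)
      ("load", result, slot)
      (opcode, result, *operands)   for everything else
    """
    current = {}    # slot -> value most recently stored there
    replaced = {}   # load result -> value it stands for
    out = []
    for inst in instructions:
        op = inst[0]
        if op == "alloca":
            current[inst[1]] = None          # slot exists, nothing stored yet
        elif op == "store":
            _, value, slot = inst
            current[slot] = replaced.get(value, value)
        elif op == "load":
            _, result, slot = inst
            replaced[result] = current[slot]
        else:
            op, result, *operands = inst
            out.append((op, result, *[replaced.get(o, o) for o in operands]))
    return out


# Front-end output for:  x = a; x = x + 1; return x
program = [
    ("alloca", "x.addr"),
    ("store", "a", "x.addr"),
    ("load", "t0", "x.addr"),
    ("add", "t1", "t0", "1"),
    ("store", "t1", "x.addr"),
    ("load", "t2", "x.addr"),
    ("ret", None, "t2"),
]

promoted = promote_single_block(program)
for inst in promoted:
    print(inst)
```

The front end only had to emit a store for every assignment and a load for every use; the memory traffic disappears afterwards, leaving just the `add` and the `ret`.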

> LLVM IR has opinions about variable scope

Not sure what you are referring to. LLVM only has 'alloca', which knows nothing about "scope". A value must be defined before it is referenced -- but this is true for everything in SSA.


I've also gone down the lex/parse/LLVM rabbit hole. The OP didn't write that LLVM is clueless or unstructured; she wrote that LLVM could have a better user manual. C++ is my meal ticket; LLVM can certainly do nice stuff.
I definitely agree that it would be better if LLVM had a gentler learning curve and a more accessible manual.

I'm only pointing out that many "problems" listed in the post are intentional design choices for good reasons. They are not downsides that should be improved.

> I'm only pointing out that many "problems" listed in the post are intentional design choices for good reasons. They are not downsides that should be improved.

That's not really less of a problem. If you can't tell why the designers made a choice and what the purpose and intention was, and there's no documentation about that, then your project has failed pretty catastrophically on communication. That's not really a less severe failure or an easier to fix problem than failing technically. Although, I suspect a lot of developers would reflexively disagree.

I agree that it's very important to clearly communicate the design to the users. However, I feel that it's hard in practice for many reasons. For example, it's hard to argue in the documentation why something is not done (e.g. why entities are referenced by pointer instead of by string name). Also keep in mind that the documentation is read by people with various degrees of expertise. It's hard to make all of them happy.

In fact, when I first started using LLVM, I created a basic block and put everything into it. LLVM complained, and I learnt that a basic block must be a list of straight-line operations ending in a branch (or another terminator such as a return). At that moment I felt much like the author of this post: "Why is there such a bizarre rule that makes it harder for me to write my program?" But after I was forced to rewrite my code to conform to this rule, I was surprised to find my program logic much easier to understand. And as I gradually learnt a bit more about LLVM, I understood that the basic block rule is there for other very good reasons as well.
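The rule itself is small enough to state as a check. The following is a minimal sketch in Python, not LLVM's verifier: the `check_block` function is hypothetical, blocks are modelled as plain lists of opcode names, and the `TERMINATORS` set deliberately covers only branches and returns.

```python
TERMINATORS = {"br", "ret"}  # toy subset, assumed for this sketch


def check_block(opcodes):
    """Return None if the block is well formed, else a description of the problem."""
    if not opcodes:
        return "empty block"
    # No terminator may appear before the last instruction.
    for op in opcodes[:-1]:
        if op in TERMINATORS:
            return f"terminator '{op}' in the middle of the block"
    # The last instruction must be a terminator.
    if opcodes[-1] not in TERMINATORS:
        return "block does not end with a terminator"
    return None


# Everything in one block, with a branch in the middle: rejected.
everything_in_one = ["load", "icmp", "br", "add", "store", "ret"]
print(check_block(everything_in_one))

# The same code split at the branch: both blocks pass.
entry = ["load", "icmp", "br"]
then_block = ["add", "store", "ret"]
print(check_block(entry), check_block(then_block))
```

Splitting at every branch is exactly the rewrite described above: each block becomes a straight run of operations with a single exit at the end.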

So is this good design? Clearly it is. This design decision not only helps with the library itself, but also forces the users to write code in a less error-prone way.

However, is it possible to justify this in the documentation, so that another user won't have to go through my initial frustration? I am doubtful. My personal feeling, at least, is that I would not have been able to understand why it is designed this way unless I had actually played around and experienced it myself.

I hope this clarifies my point on why sometimes it is hard to communicate design philosophies.