by jcranmer 1741 days ago
Some comments:

1. The LLVM API is designed as a C++ API, and if you're serious about using LLVM, you're likely to have to actually work with the C++ API directly. There's a C API which is theoretically more stable than the C++ API, but it is very heavily limited--it has basically no support for metadata, for example--and is feasible only for the most basic usage. Since the author brings up needing to use custom metadata, that suggests that they intend to create custom optimization passes, which is basically impossible except via the C++ API.

2. The complaint about metadata was very strange to me. I have had to work with custom metadata very recently in my work on LLVM, and I've had nothing like the pain the author suggests. (I've also had to deal with TBAA, which is definitely an area where LLVM's documentation is sorely lacking, particularly in examples.) The "defined before use" problem simply isn't an issue, because metadata is supposed to be global, so there is no define or use...

I took a look at the llir library the author was using. On a quick inspection, it appears to be a library for generating textual LLVM IR without having to link to LLVM at all. Oy. The problem isn't LLVM, nor even the LLVM IR itself. The problem is your library to generate LLVM IR.

3. About the SSA issue. LLVM actually does have facilities to generate SSA correctly without going through allocas (though they might be challenging to use for codegen rather than in the context of an optimization pass). But, as established above, the author is purposefully using LLVM in a way that precludes them from availing themselves of this feature. Note that LLVM specifically recommends that frontends generate variables as allocas in the entry block and let the optimizer generate the SSA for you (see https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/LangI...).

4. I'm not entirely sure what the author means when discussing variable scope, but my guess is they neglected the "in the entry block" part of the standard guidelines for generating variables. If true, I'm left scratching my head as to where they found an answer to the SSA issue that didn't mention that part--it's a very important part of generating allocas correctly, and getting it wrong means you have a very broken mental model of how it's supposed to work.
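
To make the "in the entry block" rule concrete, here is a minimal Python sketch. It is a toy string emitter, not LLVM's API and not the llir library: `FunctionEmitter`, `declare_local`, and the other names are hypothetical, invented purely for illustration. It models only the placement rule: wherever a source-level variable is declared, its alloca is hoisted to the top of the entry block, which is where mem2reg looks for allocas to promote.

```python
class FunctionEmitter:
    """Toy emitter for textual LLVM IR (hypothetical; not part of LLVM or llir)."""

    def __init__(self, name):
        self.name = name
        self.entry_allocas = []          # always emitted at the top of the entry block
        self.blocks = [("entry", [])]    # (label, instructions) in emission order

    def start_block(self, label):
        # Subsequent emit() calls append to this new block.
        self.blocks.append((label, []))

    def emit(self, instr):
        self.blocks[-1][1].append(instr)

    def declare_local(self, var, ty="i32"):
        # Whatever block is current, the alloca is recorded for the entry block,
        # so it executes once per call and stays visible to mem2reg.
        self.entry_allocas.append(f"%{var} = alloca {ty}")
        return f"%{var}"

    def render(self):
        lines = [f"define void @{self.name}() {{"]
        for label, instrs in self.blocks:
            lines.append(f"{label}:")
            if label == "entry":
                lines += [f"  {a}" for a in self.entry_allocas]
            lines += [f"  {i}" for i in instrs]
        lines.append("}")
        return "\n".join(lines)


fn = FunctionEmitter("f")
fn.emit("br label %loop")
fn.start_block("loop")
x = fn.declare_local("x")            # declared inside the loop in the source language
fn.emit(f"store i32 0, ptr {x}")     # assumes opaque pointers (`ptr`), as in recent LLVM
fn.emit("br label %loop")
ir = fn.render()
print(ir)
```

The store stays in the loop block, but the alloca lands in `entry`. With the real C++ API you would get the same effect by positioning an IRBuilder at the start of the function's entry block before creating the alloca, which is the approach the tutorial linked above takes.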

5. From the final paragraph, it seems the author's final step is to... write a parser for LLVM IR, and then convert their custom-parsed LLVM IR into SMTLIB2 code. As opposed to having LLVM parse the IR itself, visiting that IR, and then doing the same. Just... no.

This isn't to say that LLVM is perfect in terms of documentation--it is very far from it--but a lot of the issues seem to be related to trying to actively avoid working with LLVM itself.