by tester756 1469 days ago
>No one uses ten different tools to build one application.

I meant that you have a lot of choices to make.

Instead of one strong standard that everyone uses, you have X of them, which makes switching projects or companies harder. Is there a solid reason for that? I don't know.

>"and at the end of the day LLVM compiles 30min and uses tens of GBs of RAM on average hardware" sure, if you're compiling something enormous and bloated... I'm not sure why you think that's an argument against debloating?

I know that lines of code in a repo aren't a great way to compare these things, but:

.NET Compiler Infrastructure:

20 587 028 lines of code in 17 440 files

LLVM:

45 673 398 lines of code in 116 784 files

The first one I built (restore + build) in 6 minutes, and it used around 6-7 GB of RAM.

The second I'm not even trying to build, because the last time I tried on Windows it BSODed after using the _whole_ RAM (16 GB).


Compiling a large number of files on Windows is slow, no matter what language or compiler you use. The problem seems to be program invocation, which takes "forever" on Windows. It's still fast for a human, but it's slow for a computer. Quite apt that this comes up here ;-)

Source for claim: That's a problem we actually faced in the Windows CI at my old job. Our test suite invoked about 100k to 150k programs (our program plus a few 3rd party verification programs). In the Linux CI the whole thing ran reasonably fast, but the Windows CI took twice as long. I don't recall the exact numbers, but if Windows incurs a 50ms overhead per program call, you're looking at roughly 1:20 (5000 seconds, a bit over one hour twenty minutes) more runtime at 100k invocations.
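Spelling out that back-of-the-envelope arithmetic in Python. The 50ms per-launch overhead is the hypothetical figure from above, not a measurement, and the function names are made up for this sketch:

```python
def extra_runtime_seconds(invocations: int, overhead_ms: float) -> float:
    """Extra wall-clock time if every program launch costs overhead_ms more."""
    return invocations * overhead_ms / 1000.0


def as_h_mm(seconds: float) -> str:
    """Format a duration as h:mm, rounded to the nearest minute."""
    minutes = round(seconds / 60)
    return f"{minutes // 60}:{minutes % 60:02d}"


# The two ends of the 100k to 150k invocation range, at an assumed 50ms each.
for invocations in (100_000, 150_000):
    extra = extra_runtime_seconds(invocations, overhead_ms=50)
    print(f"{invocations} invocations: {extra:.0f}s extra, about {as_h_mm(extra)}")
```

At the upper end of the range (150k invocations) the same assumed overhead adds a bit over two hours.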

Also, I'm pretty sure I've built LLVM with 16GB of memory. It took less than 10 minutes on an i7-2600. The number of files is a trade-off: You can combine a bunch of small files into one large file to reduce the build time. You can even write a tool that does that automatically on every compile (and keeps sane debug info). But now incremental builds take longer, because even if you change only one small file, the combined file needs to be rebuilt. That's a problem for virtually all compiled languages.
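To put rough numbers on that trade-off, here is a toy cost model in Python. Everything in it is an assumption for illustration: the fixed cost per compiled file (process launch plus re-parsing shared headers), the cost per line, and the function names are all made up, and linking is ignored.

```python
# Made-up constants, purely illustrative; real values depend on compiler, OS and code.
FIXED_COST_PER_UNIT_S = 0.5  # assumed: process launch + parsing shared headers
COST_PER_LINE_S = 0.001      # assumed: time to compile one line of the unit itself


def unit_cost(lines: int) -> float:
    """Cost of compiling a single file (one compiler invocation)."""
    return FIXED_COST_PER_UNIT_S + lines * COST_PER_LINE_S


def full_build_cost(units: int, lines_per_unit: int) -> float:
    """Clean build: every unit is compiled once."""
    return units * unit_cost(lines_per_unit)


def incremental_cost(lines_per_unit: int) -> float:
    """Rebuild after touching one function: only the containing unit is recompiled."""
    return unit_cost(lines_per_unit)


# Same 7000 lines of code, split into 7 small files or combined into 1 big one.
for units, lines in ((7, 1000), (1, 7000)):
    print(f"{units} x {lines} LOC: "
          f"full build {full_build_cost(units, lines):.1f}s, "
          f"incremental {incremental_cost(lines):.1f}s")
```

Under these assumptions the combined file wins the clean build, because the fixed cost is paid once instead of seven times, and loses the incremental build, because all 7000 lines are recompiled for a one-function change.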

It's crazy that the file count is about 7 times higher while the amount of code is only about 2 times higher.

Is it some C++ header file overhead, or are they doing something specific?

I can only guess; I am neither an LLVM nor an MSVC dev.

1. Compile times: If you have one file with 7000 LOC and change one function in that file, the rebuild is slower than if you had 7 files with 1000 LOC each.

2. Maintainability: Instead of putting a lot of code into one file, you put the code in multiple files for better maintainability. IIRC LLVM was FOSS from the beginning, so making it easy for lots of people to make many small contributions is important. I guess .NET was conceived as being internal to MS, so fewer people overall, but newcomers were probably assigned to a team for onboarding and then contributed to the project as part of that team. In other words: At MS you can call up the person or team responsible for that 10000 LOC monstrosity; but if all you've got is a bunch of names with e-mail addresses pulled from the commit log, you might be in for a bad time.

3. Generated code: I don't know whether either project commits generated code to the repository. That can skew these numbers as well.

4. Header files can be a wild card, as it depends on how they're written. Some people/projects just put the signatures in there and not much detail; others put whole essays in there as docs for each {class, method, function, global}, making them huge.

For the record, by your stats .NET has 1180 LOC per file and LLVM 391 on average. That doesn't say a lot; the median would probably be better, or even a percentile graph, broken down by type (header/definition vs. implementation). You might find that the distributions are similar and a few large outliers skew the averages (especially generated code). Or, when looking at more big projects, you might find that these two are outliers. I can't say anything definite, and from an engineering perspective I think neither is "suspicious" or even bad.
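For anyone who wants to check the averages, here is the division in Python, plus a small illustration of why the mean alone doesn't say much. The repo totals are the ones quoted above; the per-file sizes in the second half are invented purely to show the effect of one outlier, and the function name is made up for this sketch.

```python
from statistics import mean, median


def loc_per_file(total_loc: int, files: int) -> int:
    """Mean lines of code per file, rounded to the nearest line."""
    return round(total_loc / files)


# (lines of code, number of files) as quoted in this thread.
repos = {
    ".NET Compiler Infrastructure": (20_587_028, 17_440),
    "LLVM": (45_673_398, 116_784),
}
for name, (loc, files) in repos.items():
    print(f"{name}: {loc_per_file(loc, files)} LOC per file on average")

# Hypothetical file sizes: nine ordinary files and one huge generated file.
sizes = [200] * 9 + [50_000]
print(f"mean = {mean(sizes)}, median = {median(sizes)}")
```

In the invented example a single 50000-line file drags the mean far above what a typical file looks like, while the median stays at 200.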

My gut feeling says 700 would be a number I'd expect for a large project.

> My gut feeling says 700 would be a number I'd expect for a large project.

aha, I remember when I was in class, the absolute rule our teachers gave us was no more than 200 lines per file