| > BTW, how many LOC does Crater run in a full test, and how long does it take/how expensive is a run? I failed to find that information. I have nothing more than a finger in the air estimate for LOC, maybe hundreds of millions? I have never watched a "full test" like for a release build, I believe those take several days - but when Crater is asked just to build everything that takes a little under 24 hours with its current footprint. > I strongly suspect VC++ changes/extensions are tested against in-house Microsoft code bases before making their way to the standard, because it makes no sense to undermine your own systems. Surely it stands to reason that if Microsoft are proposing standardisation of a feature they've shipped in MSVC, that's also a feature they've tried using? This model of ISO C++ features (which the developer of Circle also prefers) maps much better to what was initially envisioned than today's reality however. Most C++ proposals today are not submissions of existing compiler features from the big three compilers (MSVC, GCC and Clang) but instead fresh before the committee, often with no implementation experience at all. That's certainly one way to do it, after all Rust contributors don't have their own Rust compiler either, but it means you need very different tooling. 1) Breadth matters much more than depth for finding surprises which is the thing you won't get with an ad hoc approach. Going from 10% of some big corporate code base to 20% won't make anywhere near the difference you get from adding a hundred one-man-band projects that are smaller even in total, because different stylistic and idiomatic choices make so much more practical difference for this work. 2) As a result "a few dozen" won't cut it. Try all the C++ on github, that seems like a much better place to start. 
3) Sure, the primary goal of WG21 proposers is to get into the IS - it would be nice if what they've proposed actually works, but ultimately if it doesn't work that can be fixed later, whereas if it's not adopted then it doesn't matter whether it would work. Arguably there have never been any versions of the C++ IS which actually describe a complete working programming language, so it's not terribly important that if it were such a system it would be correct, still there's a preference for fewer rather than more horrible gotchas. I mentioned #embed so that's a useful example here, C++ 23 doesn't standardize #embed. So in theory C++ code can't use #embed, that's not C++. But of course in reality the vendors are going to ship a pre-processor which handles #embed, they don't care, so it'll work and it's widely expected you will be able to use it even in older C++ versions. 4) If there was a specification then a tool like Crater might be somewhat helpful for that, but I expect that most effort would remain focused on a single implementation, today that is of course the Rustc compiler with its LLVM backend. The hypothetical EESmith Rust sounds spurious to me, how could it deliver 2x run-time performance by removing "rarely used features" ? I don't think spurious hypotheticals are a good use of anybody's time. |
And if there were a number, it would reflect only one of several ways to quantify "LOC", right? Resulting in a spread of numbers that could meaningfully be described as "LOC"?
> Breadth matters much more than depth for finding surprises which is the thing you won't get with an ad hoc approach
I would be quite interested in someone doing a research publication on this topic!
Given the history of using Crater, which packages have proved most useful? Do the same core packages prove useful over time, or does the most significant subset change wildly? What does the cumulative distribution plot (#packages until time of appropriate feedback) look like?
How worthwhile is the additional breadth from crates.io + GitHub vs. just crates.io? Is it worthwhile to also include GitLab, and what are the tradeoffs (e.g., additional compute costs, additional false positives)?
For that matter, how useful would it be to add Linux+ARM to the current Crater tests? Or Microsoft Windows? If breadth is that important, then why skip out on the full set of Rust code you have available?
> As a result "a few dozen" won't cut it.
I did follow up with "If not, would ~100 packages be enough? What about ~1,000?" :)
If there's no equivalent of a dose-response curve / ROC curve / price-performance curve, and the answer is "must try everything" then how do I know the extra effort is useful, rather than FOMO-driven anxiety?
> Try all the C++ on github
Assuming there was a single way to build all C++ code - how much do you think it would cost to compile all the C++ code on GitHub? And why do you think the additional cost would be worthwhile to C++ standards development?
> but I expect that most effort would remain focused on a single implementation,
Oh, given my experience with Python implementations, I agree!
But my point is processes change when you have multiple competing commercial vendors, which C++ has. So looking at how Rust does things doesn't mean it's also appropriate for C++.
> I don't think spurious hypotheticals are a good use of anybody's time.
Okay, something more practical. C++11 broke backwards compatibility by changing how 'auto' works. "auto int i;" used to be valid, now it's an error. This is a huge boon for usability. It's a trivial syntactic change to fix old code, and long experience shows the old "auto" storage class was rarely used.
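For anyone who hasn't run into it, here's a minimal sketch of that change. The function names are made up for illustration, and the old spelling appears only in a comment because a C++11-or-later compiler rejects it:

```cpp
#include <cassert>
#include <type_traits>

// Pre-C++11, `auto` was a storage-class specifier, so this was valid:
//
//     auto int i = 42;   // OK in C++98/03, ill-formed since C++11
//
// The mechanical fix for old code is simply to delete the `auto`.
// (Hypothetical function name, for illustration only.)
inline int storage_class_fixed() {
    int i = 42;  // same meaning as the old `auto int i = 42;`
    return i;
}

// Since C++11, `auto` asks the compiler to deduce the type from the
// initializer instead. (Hypothetical function name as well.)
inline int deduced_type() {
    auto i = 42;
    static_assert(std::is_same<decltype(i), int>::value,
                  "auto deduces int from the literal 42");
    return i;
}

// Small self-check: both spellings end up with the same int value.
inline void check_both_spellings() {
    assert(storage_class_fixed() == deduced_type());
}
```

Both functions do the same thing; the point is that the fix for pre-C++11 code is a one-token deletion, which is why the break was cheap.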
How would the systematic compilation of all C++ code on GitHub (assuming that were possible) affect that decision more than the ad hoc methods they did use to make that decision?
Will there really never be something in Rust where a simple breaking change of a rarely used feature can result in an easier-to-use language?
If there can, then you may have a schism, either temporary (gcc vs egcs fork) or more permanent (Perl5/Perl6/Raku). Which will be "Rust"?
The answer is legally quite clear. The Rust Foundation has the trademark to "Rust" (serial number 87796977). My version couldn't break backwards compatibility and still be called Rust, even as a fork, so I would have to call it, perhaps, "Verdigris". (As I recall, someone started to develop a "Python 2.8" with more backports from Python 3; the PSF got after them for using the Python trademark that way.)
C++ doesn't have trademark protection, so the legal concept of what is or is not C++ is also different from Rust's.