| > I have nothing more than a finger in the air estimate for LOC, maybe hundreds of millions? And if there were a number, it would reflect only one of several ways to quantify "LOC", right? Resulting in a spread of numbers that could meaningfully be described as "LOC"? > Breadth matters much more than depth for finding surprises which is the thing you won't get with an ad hoc approach I would be quite interested in someone doing a research publication on this topic! Given the history of using Crater, which packages have proved most useful? Do the same core packages prove useful over time, or does the most significant subset change wildly? What does the cumulative distribution plot (#packages until time of appropriate feedback) look like? How worthwhile is the additional breadth from crates.io + GitHub vs. just crates.io? Is it worthwhile to also include GitLab, and what are the tradeoffs (e.g., additional compute costs, additional false positives)? For that matter, how useful would it be to add Linux+ARM to the current Crater tests? Or Microsoft Windows? If breadth is that important, then why skip out on the full set of Rust code you have available? > As a result "a few dozen" won't cut it. I did follow up with "If not, would ~100 packages be enough? What about ~1,000?" :) If there's no equivalent of a dose-response curve / ROC curve / price-performance curve, and the answer is "must try everything", then how do I know the extra effort is useful, rather than FOMO-driven anxiety? > Try all the C++ on github Assuming there was a single way to build all C++ code - how much do you think it would cost to compile all the C++ code on GitHub? And why do you think the additional cost would be worthwhile to C++ standards development? > but I expect that most effort would remain focused on a single implementation, Oh, given my experience with Python implementations, I agree! But my point is that processes change when you have multiple competing commercial vendors, which C++ has.
So looking at how Rust does things doesn't mean it's also appropriate for C++. > I don't think spurious hypotheticals are a good use of anybody's time. Okay, something more practical. C++11 broke backwards compatibility by changing how 'auto' works. "auto int i;" used to be valid, now it's an error. This is a huge boon for usability. It's a trivial syntactic change to fix old code, and long experience shows the old "auto" storage class was rarely used. How would the systematic compilation of all C++ code on GitHub (assuming that were possible) affect that decision more than the ad hoc methods they did use to make that decision? Will there really never be something in Rust where a simple breaking change of a rarely used feature can result in an easier-to-use language? If there can, then you may have a schism, either temporary (gcc vs egcs fork) or more permanent (Perl5/Perl6/Raku). Which will be "Rust"? The answer is legally quite clear. The Rust Foundation holds the trademark on "Rust" (serial number 87796977). My version can't break backwards compatibility, even as a fork, so I would have to call it, perhaps, "Verdigris". (As I recall, someone started to develop a "Python 2.8" with more backports from Python 3; the PSF got after them for using the Python trademark that way.) C++ doesn't have trademark protection, so the legal concept of what is or is not C++ is also different from Rust's. |
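For anyone who hasn't run into the old storage-class form, here is a minimal sketch of the C++11 `auto` change described above. The function name `deduced_sum` is made up for illustration; the pre-C++11 declaration appears only in a comment, since a C++11-or-later compiler rejects it.

```cpp
#include <type_traits>

// C++98/03: `auto` was a storage-class specifier, so this declared an int:
//     auto int i = 0;   // valid before C++11, an error in C++11 and later
// The fix for old code is to delete the keyword:
//     int i = 0;

// C++11 and later: `auto` asks the compiler to deduce the type
// from the initializer. (`deduced_sum` is a hypothetical name.)
inline int deduced_sum() {
    auto i = 40;   // deduced as int
    auto j = 2L;   // deduced as long
    static_assert(std::is_same<decltype(i), int>::value, "i should be int");
    static_assert(std::is_same<decltype(j), long>::value, "j should be long");
    return static_cast<int>(i + j);
}
```

The two `static_assert` lines just confirm the deduced types at compile time; nothing here says how common the old form was in real code, which is the open question in the comment above.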
I doubt it would affect the actual decision at all. WG21 has been very comfortable relying on gut instinct, even in the face of reality, so there's no reason it would be swayed by the results of more systematic testing.
> Will there really never be something in Rust where a simple breaking change of a rarely used feature can result in an easier-to-use language?
Now we're talking about something woollier than your performance hypothetical. Surely almost any change can be sold as "easier-to-use" if you're motivated. Herb Sutter seems motivated, for example: every CppCon he has a proposal for how to make C++ "easier to use" by further complicating it. An immediate caution, though: in what way is half of a fractured ecosystem "easier-to-use"? The other half is no longer available to you, and that's certainly not easier to use than before.
Rust programmers aren't used to taking such deals because Editions have been leveraged to give them better alternatives without the compromise.
This promise got stronger over time, rather than weaker as you seem to expect. There's complicated Rust 1.0 era code (e.g. early ripgrep) which doesn't even build today on a current compiler, because something it did was wrong and the Rust 1.0 compiler didn't spot that but modern ones do. Back then it was less likely they'd see the compatibility break as a big deal; it was "just" a bug fix.
C++ compilers fix those sorts of bugs all the time, even today. Rust wouldn't take those fixes so easily, modulo Crater measurements, but as you've shown, C++ doesn't have that.