There is an interesting approach to this in Rust: if a potentially breaking change (e.g. a soundness fix) is being proposed, they usually test it against all publicly available Rust code.
It's exactly the flex you might think it is. The point is not in the number, but rather in the fact that you can build and test almost all crates available on almost all supported platforms, regardless of how many there are, and with no human intervention.
For C++, there's no registry to start with, but even if there were, there's no standard way of building and testing projects. So a crater build like this is simply impossible at scale.
That's not right, there are multiple. There's no single registry.
You can use the vcpkg registry, for example - that holds all small and large libraries I've ever needed (even one of my own). They also come with a standard way to build them, of course (CMake targets).
Have you done C++ development recently? I fear a large part of the C++ crowd may not be aware that package managers and build systems are available, they're just not preinstalled with C++.
I know about vcpkg but there's only 2k packages there. The point of the grandparent comment was that there's SO much C++ code / so many packages you can build in a crater run that it would be physically unfeasible.
The problem with vcpkg, just like with any other "ports" package manager, is that it's not maintained by the original authors of the code but rather by a separate community. It's a bunch of "ports", trying to standardise the builds and installs to a common format. Out of curiosity, I looked at a few recipes. They seem to just install things for you, but you still have no automated way to do a crater run: since most of those libraries are header-only, you would need to write library-specific code in each case to actually use each package, at least a single include. The tests also don't seem to be run.
That's fair - but since this is about core language changes, compiling the non-header-only libraries already covers most of the commonly used libraries. Libraries like boost will also use most existing C++ features. I can't think of a single feature boost doesn't use, actually.
Even in the best-case scenario, where this partial coverage touched everything that mattered, you are still screwed in C++ because of IFNDR ("Ill-formed, no diagnostic required", a recurring phrase in the ISO document).
If what I wrote isn't a Rust program, it doesn't compile. But if what I wrote isn't a C++ program, because of IFNDR it might compile anyway, and in a whole bunch of cases it must compile anyway because the alternative would be that our fundamental understanding of mathematics is wrong (or the compiler is broken).
This makes Crater runs fundamentally more powerful, even ignoring the practical problems C++ hasn't solved such as a lack of tooling.
Note that this tool (crater) is mainly used to catch regressions, not to judge whether it's worth intentionally breaking things (which is what editions are for). If many crates depend on the bug that the devs want to fix, crater will probably detect that and the devs will consider other alternatives.
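To make "catching regressions" concrete, here is a minimal sketch of the comparison step: each crate is built and tested under a baseline toolchain and a candidate toolchain, and only crates whose result got worse are flagged. This is not crater's actual code. The names `Outcome`, `Verdict`, `compare` and `regressions` are made up for illustration, and the three-level outcome is a simplifying assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical outcome of building and testing one crate with one toolchain.
/// Variants are declared from worst to best so the derived ordering is meaningful.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Outcome {
    BuildFail,
    TestFail,
    TestPass,
}

/// Hypothetical verdict for one crate after comparing the two toolchains.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Regressed,
    Fixed,
    Unchanged,
}

/// A crate regressed if the candidate toolchain did worse than the baseline.
fn compare(baseline: Outcome, candidate: Outcome) -> Verdict {
    use std::cmp::Ordering::*;
    match candidate.cmp(&baseline) {
        Less => Verdict::Regressed,
        Greater => Verdict::Fixed,
        Equal => Verdict::Unchanged,
    }
}

/// Returns the names of all crates that regressed, in sorted order.
/// The map value is (baseline outcome, candidate outcome).
fn regressions(results: &BTreeMap<&str, (Outcome, Outcome)>) -> Vec<String> {
    results
        .iter()
        .filter(|(_, (b, c))| compare(*b, *c) == Verdict::Regressed)
        .map(|(name, _)| name.to_string())
        .collect()
}
```

A crate that already failed under the baseline is not counted against the change, which is why a clean list says something about the change itself rather than about the state of the ecosystem.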
I understand the intent of the flex, but if true, it suggests there's very little public Rust outside of packages that can be downloaded from crates.io and a smallish list of alternatives.
By comparison, there's so much publicly available Python code, from so many sources, that no one can honestly say they can even find it all. The same for C++.
I've seen papers where the source code was included in the paper itself (eg, the FORTRAN code in Sibson's 1973 "SLINK" paper), or only distributed as a zip file from the author's web site, or in the supplementary data (eg, https://scholar.google.com/scholar?q=%22source+code+in+the+s... ) .
Personally, I don't think it's true. I suspect Rust changes - just like new proposed C++ changes - are checked against only easily accessible and "well-known" packages.
> if true, it suggests there's very little public Rust outside of packages that can be downloaded from crates.io and a smallish list of alternatives.
You seem to be suggesting that it's a good thing that the public code is spread across so many different places that it cannot all be found. I don't see how that's an inherently good thing. It says less about the total amount of code than it does about the lack of any central resource that can be consulted.
Like, if I'm teaching a Rust course, and put a hello-world.rs program on my department's public GitLab instance, under an MIT license, do you think I should also put that on GitHub? And register it as a crate?
> the lack of any central resource that can be consulted.
And you say that like it's a good thing.
You want everything to be centralized on GitHub? If so, you want to force all research software developers to agree to GitHub's terms, including those who are ardent free software advocates.
You also prevent 12-year-olds from publishing their Rust source code. (GitHub's terms of service don't allow that.)
Or, do you also allow BitBucket [1], and GitLab [2]?
What bearing does any of this have on the previous thread of discussion?
Why do you think a 12 year old needs to publish their "hello world" programs because of Crater? The purpose of Crater is uncovering subtle compiler regressions. If "hello world" is ever broken then it would likely be discovered by the standard test suite or generally long before the Crater run.
This isn't a matter of "allowing" anything. It's just a statement that yes a Crater run does test all meaningful publicly available code, where "meaningful" at the very least means code which is consumed via crates.io. Sure, there is very likely public code that exists elsewhere which Crater cannot find, and that's OK. The point is that a Crater run coming back clean means something, because a very very wide swath of code was tested.
To be fair to the original argument, I think it's important to understand that there is next to no Rust code in comparison to the amount of C++ code out there. It has almost no projects in comparison, and those projects are much, much smaller. I don't think that's a very controversial statement, because it's very obviously true.
Now, it's also important to keep in mind that C++ has a terrible story when it comes to centralized (or otherwise, really?) repositories for packages, so the corresponding system for C++ is at the moment completely infeasible and not at all useful. That doesn't really make the Rust code that's tested against any more meaningful in comparison to the vast amounts of C++ code out there, though.
Edit:
At the kind of pointless and debilitating scale that C++ exists and then with the relationship C++ has with packages and dependency management this entire idea is basically impossible.
Rather than hypothesising about an imagined tool you could look at the actual tool which of course is in Rust's source code repo: https://github.com/rust-lang/crater
> new proposed C++ changes - are checked against only easily accessible and "well-known" packages.
Now that I have, so to say, shown you mine, let's see yours. Where is the tool to perform these checks in C++?
Thank you for showing that I was right in my belief: 'I suspect Rust changes - just like new proposed C++ changes - are checked against only easily accessible and "well-known" packages.'
My point is that dthul's comment "they usually test it against all publicly available Rust code" implies Rust has a very small user base. Since crater runs only against "parts of the Rust" - those available on GitHub and crates - it implies a rather larger ecosystem.
As for "mine" - what I know about C++ development comes from reading links posted to HN; hardly "mine" in any meaningful sense. I also don't accept your wording "these checks", because my point is that similarly useful checks are done, not exactly identical tests. I wrote 'FWIW, the C++ standards developers do use code search tools to help identify possible breakage.'
From previous readings, I know they do code surveys, and experiments using existing code bases and compilers.
I don't see "We sometimes do some ad hoc checks including looking for stuff with code search" as "similarly useful" to using proper test automation at all.
And I think the results continue to speak for themselves.
At least, I interpret it as saying there isn't much publicly available Rust code, and only a few places to find Rust code.
I have a hard time even estimating how long it would take to test a change against all publicly available C++ code.
FWIW, the C++ standards developers do use code search tools to help identify possible breakage.