| "Estimated 30 to 80 millions LOC compiled" sounds more than code search, yes? Don't confuse my ignorance of the process for lack of process. https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27... describes a proposal to "zero-initialize all objects of automatic storage duration", with a test-implementation as an "opt-in compiler flag", and tested on "The OS of every desktop, laptop, and smartphone that you own; The web browser you’re using to read this paper; Many kernel extensions and userspace program in your laptop and smartphone; and Likely to your favorite videogame console." Or from https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n43... "To assess how common these cases are likely to be in practice, we conducted a ClangMR analysis of a codebase of over 100 million lines of C++ code, identifying every location where a std::function is given a new target". "Proper" and "ad hoc" have very strong personal components. Is it proper or ad hoc that Crater only tests public code, while C++ developers have access to large private code bases ("the OS of every desktop") for carrying out their tests? Is it proper or ad hoc that Crater only checks crates.io and some GitHub repos? Is it proper or ad hoc that Crater doesn't test under Microsoft Windows? As for the results, what will Rust language development look like when there's 10 billion lines of Rust code, and only a tiny fraction of it is visible? |
Does it? Your belief is that the authors wrote two compilers (C and C++, because these codebases are in two different languages) with these features they're not proposing and don't think should be used, in order to actually compile this code and check it works - but alas, although they had to do all this complex compiler-internal work, they didn't find time to have the frontend parser count the lines of input?
"They just used code search and estimated" doesn't sound infinitely more likely to you?
> Don't confuse my ignorance of the process for lack of process.
Your ignorance certainly plays a role, but I don't see process.
P2723 is talking about widespread experience in real systems, but it's not a "test" implementation. It's just widespread real world tooling, because this is a real world safety hazard regardless of whether C++ ever fixes it. -ftrivial-auto-var-init is the name of the Clang and GCC flag, for example. That's how they can be confident it's used by "The OS of every desktop, laptop, and smartphone that you own" - it's one of the early checklist items that OS vendors have to slightly improve their C and sometimes C++ programs at very low cost.
Microsoft's team actually gave a talk about landing their equivalent. They had to fight harder because, it turns out, inside a proprietary codebase even more C++ programmers mistake their ignorance for competence, and thus are convinced the C++ standard is correct here and such mitigations are at best a waste of time and at worst actively destructive. Also their optimiser is apparently terrible, which if you've used MSVC checks out.
Thus this C++ proposal is, like in "days of yore", just citing existing real world use.
The C++ developers don't actually have direct access to other people's code. JF Bastien (the paper's author) used to work for Apple, so it's possible he's actually seen Apple's teams using this flag, but either way Apple have announced that they do so. Microsoft publicly talked about using their equivalent for Windows, and the Linux vendors advertise that they have such mitigations. These are anecdotes, cited to insulate this proposal (not very effectively, it turned out) against people who insist the price of this change is too high to be feasible.
It turns out that in C++ land "We actually did this and it works" does not trump "I don't think it would work".
N4348 is talking about, and indeed cites, Google's experience with its own code using a smarter "refactoring" tool that Chandler and Hyrum have talked about publicly on several occasions. This is slightly fancier than code search, but it's still very much ad hoc, which is why this gets mentioned once in that paper but isn't in the others you looked at.
When a tool systematically does the same thing, over and over, that's anything but ad hoc.
In some ways you should expect Rust code to grow more slowly. If you ask that code search guy from your previous comment, he'll tell you that a lot of C and C++ software has big machine-generated data files as "source code". Until C23 there was no #embed, whereas Rust has from the outset offered std::include_bytes!, which is what you'd want instead of #embed if you weren't fighting neanderthals (JeanHeyd sounds exhausted by the experience).
However, over time software of course grows, and the more powerful, safer abstractions in Rust are expected to encourage that. So sure, 10 billion lines of Rust, but I'm not sure why that's such a milestone. No, I don't expect big changes as a result.
Did the documents you reviewed make you think the hidden C++ is so much different than the piles of it that are available in a public code search? Was that the message you received?