by eesmith 1092 days ago
Thank you for showing that I was right in my belief: 'I suspect Rust changes - just like new proposed C++ changes - are checked against only easily and "well-known" accessible package.'

My point is that dthul's comment "they usually test it against all publicly available Rust code" implies Rust has a very small user base. Since Crater runs only against "parts of the Rust" - those available on GitHub and crates.io - it implies a rather larger ecosystem.

As for "mine" - what I know about C++ development comes from reading links posted to HN; hardly "mine" in any meaningful sense. I also don't accept your wording "these checks", because my point is that similarly useful checks are done, not exactly identical tests. I wrote 'FWIW, the C++ standards developers use do use code search tools to help identify possible breakage.'

From previous readings, I know they do code surveys, and experiments using existing code bases and compilers.

For example, there's https://codesearch.isocpp.org/ ("developed for ISO Standard C++ proposal authors in order to explore existing C++ practice and to provide empirical evidence to support claims about existing practice made in proposals."), which is used in surveys to understand how code is used. See, for instance, https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p14... .

At https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p11... they used a custom tool to analyze Boost, Chromium, Firefox, the Linux Kernel, LibreOffice, LLVM, and Qt: "Estimated 30 to 80 millions LOC compiled".

I don't see "We sometimes do some ad hoc checks including looking for stuff with code search" as "similarly useful" to using proper test automation at all.

And I think the results continue to speak for themselves.

"Estimated 30 to 80 millions LOC compiled" sounds more than code search, yes?

Don't confuse my ignorance of the process for lack of process.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27... describes a proposal to "zero-initialize all objects of automatic storage duration", with a test-implementation as an "opt-in compiler flag", and tested on "The OS of every desktop, laptop, and smartphone that you own; The web browser you’re using to read this paper; Many kernel extensions and userspace program in your laptop and smartphone; and Likely to your favorite videogame console."

Or from https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n43... "To assess how common these cases are likely to be in practice, we conducted a ClangMR analysis of a codebase of over 100 million lines of C++ code, identifying every location where a std::function is given a new target".

"Proper" and "ad hoc" have very strong personal components. Is it proper or ad hoc that Crater only tests public code, while C++ developers have access to large private code bases ("the OS of every desktop") for carrying out their tests?

Is it proper or ad hoc that Crater only checks crates.io and some GitHub repos?

Is it proper or ad hoc that Crater doesn't test under Microsoft Windows?

As for the results, what will Rust language development look like when there's 10 billion lines of Rust code, and only a tiny fraction of it is visible?

> "Estimated 30 to 80 millions LOC compiled" sounds more than code search, yes?

Does it? Your belief is that the authors wrote two compilers (C and C++, because these codebases are in two different languages) with these features they're not proposing and don't think should be used, in order to actually compile this code and check it works - but alas, although they had to do all this complex compiler internal work, they didn't find time to have the frontend parser count the lines of input?

"They just used code search and estimated" doesn't sound infinitely more likely to you?

> Don't confuse my ignorance of the process for lack of process.

Your ignorance certainly plays a role, but I don't see process.

P2723 is talking about widespread experience in real systems, but it's not a "test" implementation, it's just widespread real world tooling because this is a real world safety hazard regardless of whether C++ ever fixes it. -ftrivial-auto-var-init is the name of the Clang and GCC flag for example. That's how they can be confident it's used by "The OS of every desktop, laptop and smartphone you own" - it's one of the early checklist items that OS vendors have to slightly improve their C and sometimes C++ programs at very low cost.
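To make the flag concrete, here is a minimal C++ sketch. It assumes the `=zero` mode of -ftrivial-auto-var-init; the `Packet` type and the function names are made up for illustration and don't come from the paper or from any vendor's code. Reading a genuinely uninitialized local is undefined behaviour, so the sketch shows the explicit source-level spelling of what the flag arranges for locals that lack an initializer:

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Packet {
    int id;
    char tag[8];
};

// Without the flag, `Packet p;` leaves id and tag indeterminate, and reading
// them is undefined behaviour. Building with -ftrivial-auto-var-init=zero asks
// the compiler to zero-fill such locals; `Packet p{};` requests the same
// values explicitly and works with any compiler settings.
Packet make_zeroed_packet() {
    Packet p{};  // value-initialization: id == 0, every tag byte == 0
    return p;
}

// Small usage check; call it from main().
void check_zeroed_packet() {
    const Packet p = make_zeroed_packet();
    assert(p.id == 0);
    for (char c : p.tag) {
        assert(c == 0);
    }
}
```

The appeal of the flag is that it needs no source change like the `{}` above, which is why it can be rolled out across a large existing codebase.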

Microsoft's team actually gave a talk about landing their equivalent. They had to fight harder because inside a proprietary codebase, it turns out, even more C++ programmers mistake their ignorance for competence, and thus are convinced the C++ standard is correct here and such mitigations are at best a waste of time and at worst actively destructive. Also their optimiser is apparently terrible, which if you've used MSVC checks out.

Thus this C++ proposal is, like in "days of yore" just citing existing real world use.

The C++ developers don't actually have direct access to other people's code. JF Bastien (the paper's author) used to work for Apple, so it's possible he's actually seen Apple's teams using this flag, but either way Apple have announced that they do so. Microsoft publicly talked about using their equivalent for Windows, and the Linux vendors advertise that they have such mitigations. These are anecdotes, cited to insulate this proposal (not very effectively, it turned out) against people who insist the price of this change is too high to be feasible.

It turns out that in C++ land "We actually did this and it works" does not trump "I don't think it would work"

N4348 is talking about, and indeed cites, Google's experience with its own code using a smarter "refactoring" tool that Chandler and Hyrum have talked about publicly on several occasions. This is slightly fancier than code search, but it's still very much ad hoc which is why this gets mentioned once in that paper but isn't in the others you looked at.

When a tool systematically does the same thing, over, and over, that's anything but ad hoc.

In some ways you should expect Rust code to grow more slowly. If you ask that Code search guy from your previous comment, he'll tell you that a lot of C and C++ software has big machine generated data files as "source code". Until C23 there is no #embed whereas Rust has from the outset offered std::include_bytes! which is what you'd want instead of #embed if you weren't fighting neanderthals (JeanHeyd sounds exhausted by the experience).

However over time of course software grows, and the more powerful, safer abstractions in Rust are expected to encourage that, so sure, 10 billion lines of Rust, I'm not sure why that's such a milestone. No I don't expect big changes as a result.

Did the documents you reviewed make you think the hidden C++ is so much different than the piles of it that are available in a public code search? Was that the message you received?

> 30 to 80 millions LOC compiled

I figured it was because "line of code" is not all that meaningful, and not worth specifying more precisely than that.

Does it include comments? Is it after macro expansion? What about \ continuations? Does a bare "}" on its own count as a line of code?
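To show how much the answer moves with the definition, here is a rough C++ sketch that counts the same text three ways. The three definitions, the `LocCounts` struct, and the sample snippet are my own assumptions for illustration, not anything taken from the paper, and the "logical" rule is deliberately crude (it only recognises `//` comments and a few lone-brace spellings):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Three hypothetical definitions of "lines of code".
struct LocCounts {
    int physical = 0;   // every line, blank or not
    int non_blank = 0;  // lines containing any non-whitespace character
    int logical = 0;    // non-blank, minus // comments, lone braces,
                        // and the tail of a backslash continuation
};

static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

LocCounts count_loc(const std::string& source) {
    LocCounts c;
    std::istringstream in(source);
    std::string line;
    bool continuing = false;  // true if the previous line ended with a backslash
    while (std::getline(in, line)) {
        ++c.physical;
        const std::string t = trim(line);
        if (t.empty()) {
            continuing = false;
            continue;
        }
        ++c.non_blank;
        const bool comment_only = t.compare(0, 2, "//") == 0;
        const bool lone_brace = (t == "{" || t == "}" || t == "};");
        if (!continuing && !comment_only && !lone_brace) ++c.logical;
        continuing = (t.back() == '\\');
    }
    return c;
}

// Made-up sample input: one comment, a two-line macro, a blank line, a tiny main.
const char kSample[] =
    "// add two ints\n"
    "#define ADD(a, b) \\\n"
    "    ((a) + (b))\n"
    "\n"
    "int main() {\n"
    "    return ADD(1, 2);\n"
    "}\n";

// Small usage check; call it from main().
void check_loc_demo() {
    const LocCounts c = count_loc(kSample);
    assert(c.physical == 7);
    assert(c.non_blank == 6);
    assert(c.logical == 3);
}
```

Seven, six, or three "lines" for the same snippet is more than a factor of two, which is about the spread in "30 to 80 millions".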

BTW, how many LOC does Crater run in a full test, and how long does it take/how expensive is a run? I failed to find that information.

> The C++ developers don't actually have direct access to other people's code

I don't know what you mean by that. They certainly have access to public source code, just like Rust developers do. (Chromium, LLVM, Boost are mentioned in https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p11... ).

It would seem very odd if Microsoft's representatives had no idea how changes to C++ would affect internal Microsoft code. I strongly suspect VC++ changes/extensions are tested against in-house Microsoft code bases before making their way to the standard, because it makes no sense to undermine your own systems. For the same reason, I suspect proposed changes are tested internally at Microsoft.

And from papers like https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p21... I know there is in-house experience of proprietary code bases guiding how the C++ standard changes.

> Did the documents you reviewed make you think the hidden C++ is so much different than the piles of it that are available in a public code search? Was that the message you received?

That's not really my point. (Indeed, as that last paper link from Bloomberg points out, "It is our understanding that Bloomberg’s experience is not dissimilar to most Free/Libre Open Source Software communities".) Instead:

1) How much of the "very very wide swath of code" is meaningful, in terms of language feedback? That is, how much of the automation is being employed because it's there, rather than because it's useful?

If an automated method checks 500M LOC but the interesting cases only ever come from the same set of 1M LOC, wouldn't reducing the working set help with turnaround?

(Indeed, https://ethz.ch/content/dam/ethz/special-interest/infk/chair... uses Crater to look at only the 500 most used crates, implying they think using an ad hoc subset is sufficient for their purposes.)

(Incidentally, it's hard to find any published scholarly papers on Crater. There's a lot of rust, both the iron and plant kind, in terrestrial craters!)

2) Would a C++ equivalent for Chromium, Qt, LibreOffice, KDE, Firefox, and a few dozen well-known large packages give the same feedback for C++? Why or why not?

If not, would ~100 packages be enough? What about ~1,000?

3) How do you know that Rust compilation of the packages on crates.io, only for x86-64 Linux, gives better feedback for the types of issues that C++ faces than the "ad hoc" methods they use for C++?

That is, just because a tool fits Rust's needs and goals doesn't mean it fits the C++ spec developers' needs and goals.

4) How would a tool like Crater help in a possible future where there are a dozen different and competing Rust implementations?

That is, https://blog.m-ou.se/rust-standard/ argues there doesn't need to be a standards committee for Rust because there is only one Rust implementation, with tools like Crater to help maintain compatibility. I'm familiar with this viewpoint as I come from the Python world; while there are alternative Python implementations, they all look to CPython as the reference language.

But in C++ there are many C++ vendors, some with economic incentive to have new features which might break old code, but which their customers will pay for. On the other hand, their customers have the economic incentive to prevent vendor lock-in. Hence, a standard.

If a hypothetical EESmith Rust drops a few rarely used features to give a 2x run-time performance gain and 5x compilation performance gain, then you can bet that people will switch to it. But is that Rust? And will mainline Rust still preserve backwards compatibility even in the face of competition?

> I'm not sure why that's such a milestone

Do you expect Crater to scale to compile 10 billion lines of Rust in a reasonable time and cost? Or will Crater drop testing most packages by then?

> JeanHeyd sounds exhausted by the experience

Developing a C++ standard with multiple entrenched and sometimes competing vendors is no easy task. Rust doesn't have to deal with it ... yet.

> BTW, how many LOC does Crater run in a full test, and how long does it take/how expensive is a run? I failed to find that information.

I have nothing more than a finger in the air estimate for LOC, maybe hundreds of millions?

I have never watched a "full test" like for a release build, I believe those take several days - but when Crater is asked just to build everything that takes a little under 24 hours with its current footprint.

> I strongly suspect VC++ changes/extensions are tested against in-house Microsoft code bases before making their way to the standard, because it makes no sense to undermine your own systems.

Surely it stands to reason that if Microsoft are proposing standardisation of a feature they've shipped in MSVC, that's also a feature they've tried using? This model of ISO C++ features (which the developer of Circle also prefers) maps much better to what was initially envisioned than today's reality however. Most C++ proposals today are not submissions of existing compiler features from the big three compilers (MSVC, GCC and Clang) but instead fresh before the committee, often with no implementation experience at all.

That's certainly one way to do it, after all Rust contributors don't have their own Rust compiler either, but it means you need very different tooling.

1) Breadth matters much more than depth for finding surprises which is the thing you won't get with an ad hoc approach. Going from 10% of some big corporate code base to 20% won't make anywhere near the difference you get from adding a hundred one-man-band projects that are smaller even in total, because different stylistic and idiomatic choices make so much more practical difference for this work.

2) As a result "a few dozen" won't cut it. Try all the C++ on github, that seems like a much better place to start.

3) Sure, the primary goal of WG21 proposers is to get into the IS - it would be nice if what they've proposed actually works, but ultimately if it doesn't work that can be fixed later, whereas if it's not adopted then it doesn't matter whether it would work.

Arguably there have never been any versions of the C++ IS which actually describe a complete working programming language, so it's not terribly important that if it were such a system it would be correct, still there's a preference for fewer rather than more horrible gotchas.

I mentioned #embed, so that's a useful example here: C++23 doesn't standardize #embed. So in theory C++ code can't use #embed, that's not C++. But of course in reality the vendors are going to ship a pre-processor which handles #embed, they don't care, so it'll work and it's widely expected you will be able to use it even in older C++ versions.

4) If there was a specification then a tool like Crater might be somewhat helpful for that, but I expect that most effort would remain focused on a single implementation, today that is of course the rustc compiler with its LLVM backend.

The hypothetical EESmith Rust sounds spurious to me, how could it deliver 2x run-time performance by removing "rarely used features"? I don't think spurious hypotheticals are a good use of anybody's time.

> I have nothing more than a finger in the air estimate for LOC, maybe hundreds of millions?

And if there were a number, it would reflect only one of several ways to quantify "LOC", right? Resulting in a spread of numbers that could meaningfully be described as "LOC"?

> Breadth matters much more than depth for finding surprises which is the thing you won't get with an ad hoc approach

I would be quite interested in someone doing a research publication on this topic!

Given the history of using Crater, which packages have proved most useful? Do the same core packages prove useful over time, or does the most significant subset change wildly? What does the cumulative distribution plot (#packages until time of appropriate feedback) look like?

How worthwhile is the additional breadth from crates.io + GitHub vs. just crates.io? Is it worthwhile to also include GitLab, and what are the tradeoffs (e.g., additional compute costs, additional false positives)?

For that matter, how useful would it be to add Linux+ARM to the current Crater tests? Or Microsoft Windows? If breadth is that important, then why skip out on the full set of Rust code you have available?

> As a result "a few dozen" won't cut it.

I did follow up with "If not, would ~100 packages be enough? What about ~1,000?" :)

If there's no equivalent of a dose-response curve / ROC curve / price-performance curve, and the answer is "must try everything" then how do I know the extra effort is useful, rather than FOMO-driven anxiety?

> Try all the C++ on github

Assuming there was a single way to build all C++ code - how much do you think it would cost to compile all the C++ code on GitHub? And why do you think the additional cost would be worthwhile to C++ standards development?

> but I expect that most effort would remain focused on a single implementation,

Oh, given my experience with Python implementations, I agree!

But my point is processes change when you have multiple competing commercial vendors, which C++ has. So looking at how Rust does things doesn't mean it's also appropriate for C++.

> I don't think spurious hypotheticals are a good use of anybody's time.

Okay, something more practical. C++11 broke backwards compatibility by changing how 'auto' works. "auto int i;" used to be valid, now it's an error. This is a huge boon for usability. It's a trivial syntactic change to fix old code, and long experience shows the old "auto" storage class was rarely used.
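For anyone who hasn't seen both meanings, here is a short C++11 sketch of what the new 'auto' buys. The function names are invented for the example; the old form appears only in a comment, since a C++11 compiler rejects it:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// C++98/03: `auto int i = 0;` was legal, with `auto` as a storage-class
// specifier meaning "automatic storage", which locals already had by default.
// C++11 and later: that line is an error, and `auto` instead asks the
// compiler to deduce the variable's type from its initializer.

int sum_values(const std::vector<int>& values) {
    auto total = 0;  // deduced as int
    static_assert(std::is_same<decltype(total), int>::value,
                  "auto deduces int from the literal 0");
    // Deduced as std::vector<int>::const_iterator, which nobody enjoys typing.
    for (auto it = values.begin(); it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// Small usage check; call it from main().
void check_sum_values() {
    const std::vector<int> values{1, 2, 3};
    assert(sum_values(values) == 6);
}
```

Fixing old code means deleting the word `auto` from `auto int i;`, which is the trivial syntactic change I mean.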

How would the systematic compilation of all C++ code on GitHub (assuming that were possible) affect that decision more than the ad hoc methods they did use to make that decision?

Will there really never be something in Rust where a simple breaking change of a rarely used feature can result in an easier-to-use language?

If there can, then you may have a schism, either temporary (gcc vs egcs fork) or more permanent (Perl5/Perl6/Raku). Which will be "Rust"?

The answer is legally quite clear. The Rust Foundation has the trademark to "Rust" (serial number 87796977). My version can't break backwards compatibility, even as a fork, so I would have to call it, perhaps, "Verdigris". (As I recall, someone started to develop a "Python 2.8" with more backports from Python 3; the PSF got after them for using the Python trademark that way.)

C++ doesn't have trademark protection, so the legal concept of what is/is not C++ is also different than for Rust.