Hacker News
by ncmncm 1423 days ago
Disingenuous.

The bad C++ code is in the very first line of the "make_appender" definition: capturing the closure's environment by reference is nonsense: It is equivalent to returning a reference to an argument. It is not, then, a closure at all.

Using a correctly-defined make_appender would not, then, produce undefined behavior when you use it, with or without "move".
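For reference, a minimal sketch of what a correctly-defined `make_appender` could look like. The article's `append` helper is not reproduced in this thread, so the one below is an assumed stand-in, and `check_appender_with_temporary` is a made-up name for the usage check. The only point is the capture: the closure owns a copy of the suffix instead of referring to the caller's argument.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumed stand-in for the article's `append`: returns `items` followed by `suffix`.
std::vector<int> append(std::vector<int> items, const std::vector<int>& suffix) {
    items.insert(items.end(), suffix.begin(), suffix.end());
    return items;
}

// Capture by value: the closure stores its own vector, so it stays valid
// even when the caller passed a temporary.
auto make_appender(std::vector<int> suffix) {
    return [suffix = std::move(suffix)](std::vector<int>&& items) {
        return append(std::move(items), suffix);
    };
}

// Hypothetical usage check: the call pattern that is undefined behavior
// with a by-reference capture is well-defined here.
void check_appender_with_temporary() {
    auto append34 = make_appender({3, 4});
    assert((append34({1, 2}) == std::vector<int>{1, 2, 3, 4}));
}
```

The cost is one copy (or move) of the suffix per closure, which is the trade the rest of the thread argues about.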

What the author has done here was to take a too-obviously wrong operation, returning a reference to an argument, but dress it up with syntax that will look less familiar to some readers, and pass it off as insightful.

But using a wrong function and getting wrong results is not surprising.

Piling on more uses after, that give more wrong results, does not reveal anything more.

When you need disingenuous arguments to make your point, it tells us more about your point than about the thing you are trying to make a point about. And, publishing anyway tells us more about you.

Returning that fake closure should evoke a compiler warning, if you turn on warnings.

7 comments

>The bad C++ code is in the very first line of the "make_appender" definition: capturing the closure's environment by reference is nonsense: It is equivalent to returning a reference to an argument.

If it is so bad, it should (in the sense of how things would be in an ideal world) not compile.

>It is not, then, a closure at all.

It is a closure because all variables are closed-over and there are no free variables in the lambda body anymore. That is the definition of closure.

>But using a wrong function and getting wrong results is not surprising.

In an ideal world, there should be a compilation error. (There is in Rust)

The majority of what's wrong in C++ is that it lets you do nonsensical (even dangerous) things, most of the time without even a warning (and not because it's technically impossible to warn--it just didn't occur to them). It's okay to acknowledge that--it's a product of its time.

>Returning that fake closure should evoke a compiler warning, if you turn on warnings.

That "should" tells me all I need to know. In the end either safety is important, or it isn't. Choose accordingly.

> If it is so bad, it should (in the sense of how things would be in an ideal world) not compile.

Yes, C++ can be a bad solution to a lot of problems, and that's okay. Use rust if you need a machine guarantee for memory safety (or you just like the language), you can use Go if you just don't care about that at all and want the language to take care of it. But you can use C++ for non-critical software that needs to be fast (games come to mind). Rust can be too much of a mental overhead than it's worth for some.

That closure is just not a good example. Nobody would write this, because when writing C++ code you _do_ think about whether you want a reference or not. Sure, a lot of bugs can happen but this is not, in my opinion, one of them.

If you want to prove a point, prove it fairly.

Note: I don't use C++ anymore, and I don't like it very much for other reasons.

> Rust can be too much of a mental overhead than it's worth for some.

Rust has the least mental overhead of any language I've ever used. The compiler literally takes all the mental overhead away.

> The compiler literally takes all the mental overhead away.

Increasing the strictness of the language has one effect: decreasing the number of potential solutions for a given problem.

If the restrictions are carefully chosen (like they are in rust) this leads to generally safer code. But don't fool yourself, the restrictions don't merely generate new solutions - they reject the ones that don't pass the tests.

A more extreme example is formal theorem provers. Carefully constructed proofs will take a lot of effort, but it will also make you confident that the code does what it needs to do.

The rust borrow checker is just a more restricted theorem prover that doesn't touch the logic, it just deals with the memory side of things. It's indeed very helpful in trying to explain what's wrong (and even suggest fixes), but it doesn't take the programmer overhead away.

In a more complex system rust inevitably makes it harder to come up with solutions. It will reject valid code just because it can't prove it's right, not because it's actually wrong. As a programmer, you're going to come up with such solutions, and while in time you get more used to writing code that rust likes, and rust too gets better at accepting correct solutions, you're going to have to fight the borrow checker sometimes.

I don't have a ton of experience with rust, but I encountered cases where equivalent C++ code would've worked just fine, but I had to change it because rust didn't like it.

Rust is an amazing language, but it definitely doesn't 'take all the mental overhead away'.

"Doctor, it hurts when I do that!"

Pain is a strong warning that prevents [further] trauma. This analogy defeats its own purpose.

The doctor says, "then don't do that". So, the analogy is correct, and you have missed the point.

I don’t think so. In this analogy you don’t go to the doctor at all, because you feel nothing until a bone cracks. The whole point is based on “but nobody would do that”, so in reality we should see no results of UB and segfaults outside of educational experiments. But the only way to do that is to keep our eyes closed.

>> In the end either safety is important, or it isn't. Choose accordingly.

Not to disagree with your post but this comes across as very black-and-white.

To expand:

There's a prioritized list of requirements in engineering. Use the right tool for the job.

Rust has: (see https://hackmd.io/@rust-ctcft/r1plN4You#/5 )

1. Safety

If you don't need that at the first slot, then the biggest strength of Rust doesn't apply to your problem (and you would waste time having to track lifetime parameters more than necessary to solve your problem).

There's also another item in that list, and that is performance. As soon as you slightly relax that one, Rust does become massively easier. A well placed .clone() or Arc makes the rest of the code performant enough and easier to understand, which makes the equation of choosing Rust over other alternatives in spaces that aren't necessarily systems programming less problematic.
Arc is equivalent to std::shared_ptr, and so ought to be code smell in Rust.

The .clone() would have been [=] in C++, which would have safely quieted the warning.
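To make the two options concrete on the C++ side: besides copying the suffix into the closure, the `std::shared_ptr` route looks roughly like the sketch below. `SharedSuffix`, `make_shared_appender` and `check_shared_appender` are made-up names for illustration, not anything from the article.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using SharedSuffix = std::shared_ptr<const std::vector<int>>;

// Shared-ownership variant: each closure holds a reference-counted handle
// to the same immutable suffix, so nothing is copied per closure.
auto make_shared_appender(SharedSuffix suffix) {
    return [suffix = std::move(suffix)](std::vector<int> items) {
        items.insert(items.end(), suffix->begin(), suffix->end());
        return items;
    };
}

void check_shared_appender() {
    auto suffix = std::make_shared<const std::vector<int>>(std::vector<int>{3, 4});
    auto a = make_shared_appender(suffix);
    auto b = make_shared_appender(suffix);
    assert(suffix.use_count() == 3);  // the local handle plus two closures
    suffix.reset();                   // the closures keep the data alive
    assert((a({1, 2}) == std::vector<int>{1, 2, 3, 4}));
    assert((b({}) == std::vector<int>{3, 4}));
}
```

Whether the reference counting is a smell or a reasonable price depends on how many closures share the data and how large it is.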

Big problem in these discussions is what should be prioritized.

To me, with the 20 years of experience I have in 8 programming languages, I'll always include safety. Especially having in mind that it's nearly zero-cost in terms of runtime performance.

So to me not choosing safety is a strong sign that I don't want to work with the people who practice that.

The point missed everywhere is that Rust lacks key language features needed to capture essential semantics in libraries, that C++ provides. To code libraries I want to code, I cannot use Rust. Rust cannot express them.

So, safety, good, fast enough, good, but insufficiently expressive? No, thank you.

It's probably missed everywhere because I at least have not seen anyone give concrete info about what these missing key language features are.

Want to elaborate on that? Quite curious.

Can you elaborate on those key missing language features? You have commented on that multiple times, but I haven't seen you give any concrete example. I'm genuinely curious.

    std::vector<int> suffix{3, 4};
    auto append34 = make_appender(suffix);  // Version 1: test will work

    auto append34 = make_appender({3, 4});  // Version 2: test will fail
I was in a mood to type the examples but I accidentally made Version 1 because I had started with the author's first example. I wondered where the problem was and couldn't find any (neither running nor reading the code) until I noticed the author had changed to Version 2.

The problem is "obvious" in hindsight but one of the problems with C++ is that it makes certain things too implicit/convenient. I get that references are a staple feature in C++ but it's not at all obvious from the calling location that Version 2 is a bug. To see that, you have to navigate to the callee (finding which might be hard enough alone without a very solid IDE), and make sure that it's not capturing the argument by reference or not doing anything unsafe there.

It's often a problem when a seemingly "value" argument is turned into actually a pointer-to argument at the callee. There are other languages that have this too, and I've never liked it.

As someone who doesn't regularly code in C++ but has a solid understanding of the basics, I wonder why C++ ever allowed a reference parameter to be bound to a temporary. To me it feels like "References have value syntax but pointer semantics BUT you should program as if they had value semantics", which to me is exactly the kind of premature optimization that is looking for trouble.

Again, this isn't a right/wrong thing. Rust moves by default (and a lot of people find "=" a weird pun for that), C++ copies by default and has rvalue-references and an explicit `std::move`.

If you want copy in your C++ lambda, you start it `[=](...) {...}`, if you want take a pointer in your C++ lambda, you start it `[&](...) {...}`, if you want something trickier you do trickier stuff.
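A small sketch of the difference between the two capture defaults, using a plain `int` so that nothing dangles. `CaptureResult` and `compare_captures` are names invented for this example.

```cpp
#include <cassert>

// Holds what each lambda observed, so the difference is easy to check.
struct CaptureResult {
    int by_value;
    int by_reference;
};

CaptureResult compare_captures() {
    int counter = 1;
    auto copy_fn = [=] { return counter; };  // copies counter (1) right now
    auto ref_fn = [&] { return counter; };   // refers to the live variable
    counter = 42;
    // Both lambdas are called while `counter` is still alive, so this is
    // well-defined. Returning `ref_fn` from this function would not be.
    return {copy_fn(), ref_fn()};
}
```

Here `by_value` ends up as 1 and `by_reference` as 42: the `[=]` lambda froze its own copy, while the `[&]` lambda is only as valid as the variable it points at.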

Rust opts you in to the nitpicky static analyzer and you have to opt out with `unsafe`, in C++ you have to opt in with e.g. `clang-tidy` or some annotations.

They are remarkably similar, just with different defaults.

> They are remarkably similar, just with different defaults.

I'm going to lead with this, because I think it's most important: Culturally there's a world of difference. Safety is a part of Rust's culture. "Culture eats strategy for breakfast".

Take sorting. In C++ the default sort is unstable, while in Rust the default sort is stable, that's just those defaults you mentioned (each has both kinds), although the choice speaks to culture. But look closer, in C++ the sort has undefined behaviour if your type isn't totally ordered. In Rust you can't sort the partially ordered types without saying how to order them fully. Still, in both languages we can write a custom order, so what happens then? In C++ if your custom order is nonsense you get... undefined behaviour. In Rust sorting won't necessarily work with a nonsense custom order but the behaviour remains well defined.
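On the C++ side, the stable/unstable choice and the comparator requirement look roughly like this. `Entry`, `by_priority` and `sorted_stably` are illustrative names; the sketch only shows the well-defined case.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;  // (priority, label)

// A valid comparator: a strict weak ordering on the priority.
// Something like `a.first <= b.first` breaks that requirement, and the
// standard sort algorithms then have undefined behavior.
bool by_priority(const Entry& a, const Entry& b) {
    return a.first < b.first;
}

std::vector<Entry> sorted_stably(std::vector<Entry> entries) {
    // std::stable_sort keeps equal-priority entries in their original
    // order; std::sort makes no such promise.
    std::stable_sort(entries.begin(), entries.end(), by_priority);
    return entries;
}
```

With input `{2,"a"}, {1,"b"}, {2,"c"}, {1,"d"}`, the stable version yields `b, d, a, c`. Swapping in `std::sort` may or may not preserve that order for the ties.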

> Rust opts you in to the nitpicky static analyzer and you have to opt out with `unsafe`

Unsafe gives you a small number of dangerous "super powers" needed to write efficient low-level code, it does not opt out of the borrow checker's analysis, or indeed most other checks. This misconception makes me wonder how much of what you've written is conjecture rather than practical experience.

The sibling says that C++ has the wrong defaults, full stop.

Well in Rust the default `HashMap` uses a cryptographic hash, and you see it everywhere, it's the de facto community "default". In C++ the community "default" is `absl::flat_hash_map`/`folly::F14`, which use SIMD to compare a whole stripe of key-prefixes simultaneously.

I want different defaults for different programs, but the idea that it's esoteric to ever want an associative container within arms reach that fucking demolishes the other one is, ugh, God I want to like Rust even more than I do but this "we're right and everyone else hasn't seen the light" routine is infuriating and pushes me at least away.

My parent comment is trying to emphasize that this isn't a right/wrong thing, different tools for different jobs. And people are just like: "nope, everything but Rust is wrong".

I like and use both C++ and Rust. I also have plenty of bones to pick with both languages.

However, I’ve never gotten the sense that Rust itself promotes the idea that “everything except for Rust is wrong.” I also don’t read much on the Internet these days, and in not doing so I probably avoid much of the hype that people are pushing about Rust.

Since it has been established as Hot New Thing, there are huge social incentives tied up in promoting it.

Why do you think that a language with better details would prevent you from opting in to whatever non-default behavior you need?

Having the "right" defaults is better for everyone. Folks who don't know or care get a good, safe default with no undefined behavior or unexpected danger, and folks who know better can opt into something that fits their needs explicitly.

Seems ideal to me.

> a language with better details

Whoops - that should have said "a language with better defaults"...

> In C++ the community "default" is `absl::flat_hash_map`/`folly::F14`, which use SIMD to compare a whole stripe of key-prefixes simultaneously.

Just to be clear, the Rust HashMap does the same thing.

TIL. Thanks for letting me know that: https://doc.rust-lang.org/src/std/collections/hash/map.rs.ht...

I don't know how I missed this, is the Swiss port a fairly recent development?

In fact the crypto hash does not actually improve security. It is just slow.

So, the trade is just: slow for nothing.

The (default) hash for Rust's HashMap and HashSet is SipHash. People shouldn't call this a "cryptographic hash" or a "crypto hash": that's misleading, as it would lead you to think of algorithms like the SHA-2 family. It is still a cryptographic algorithm, just one with very specific properties suitable for this task.

Such algorithms are crucial to avoid being subject to a Denial of Service attack which is, in fact, a security problem. Of course under the C++ "blame the programmer" philosophy you don't deserve protection from Denial of Service unless you knew you needed that and figured out how to ask for it properly.

Just as with the sort functions this is about safe defaults, not about constraining people who know what they're doing. Dropping in FNV instead of SipHash, or even using the identity function as a "hash", is not difficult if you are sure that's what you need.
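For comparison, plugging a different hash into the standard C++ container is a similar amount of work. The sketch below is a hand-written 64-bit FNV-1a functor for `std::string` keys; `Fnv1aHash` and `FastMap` are made-up names, and this is an illustration rather than a tuned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// 64-bit FNV-1a over the bytes of a string. Fast, but it offers no
// protection against deliberately colliding keys, so it only suits
// input you trust.
struct Fnv1aHash {
    static constexpr std::uint64_t kOffsetBasis = 14695981039346656037ull;
    static constexpr std::uint64_t kPrime = 1099511628211ull;

    std::size_t operator()(const std::string& key) const noexcept {
        std::uint64_t hash = kOffsetBasis;
        for (unsigned char byte : key) {
            hash ^= byte;    // mix in one byte
            hash *= kPrime;  // then multiply by the FNV prime
        }
        return static_cast<std::size_t>(hash);
    }
};

// The hasher is the third template argument of std::unordered_map.
using FastMap = std::unordered_map<std::string, int, Fnv1aHash>;
```

`FastMap` is then used like any other `std::unordered_map`; only the type alias mentions the hasher.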

There lies the problem with C++ state of affairs, most defaults are wrong.

It doesn't help that those of us that care about enabling the right defaults are a tiny minority as per C++ surveys.

https://blog.jetbrains.com/clion/2021/07/cpp-ecosystem-in-20...

You can also 'const' your std::vector when you declare it and it won't work. Any linter or having warnings turned on will catch these issues.
const doesn't make a difference in this case. It's about the passed-by-reference object being destroyed after the function returns. That is because the object was passed as a temporary.
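A sketch that makes the temporary's lifetime visible without touching a dangling reference. `Tracer`, `alive_during_call` and `observe_temporary` are names invented for this example; the counter simply tracks how many `Tracer` objects currently exist.

```cpp
#include <cassert>

// Counts how many Tracer objects are currently alive.
struct Tracer {
    static int alive;
    Tracer() { ++alive; }
    Tracer(const Tracer&) { ++alive; }
    ~Tracer() { --alive; }
};
int Tracer::alive = 0;

// Binding a temporary to `const Tracer&` keeps it alive only until the end
// of the full-expression that contains the call.
int alive_during_call(const Tracer&) {
    return Tracer::alive;
}

struct LifetimeObservation {
    int during_call;
    int after_statement;
};

LifetimeObservation observe_temporary() {
    int during = alive_during_call(Tracer{});  // the temporary exists here
    int after = Tracer::alive;                 // and is already destroyed here
    return {during, after};
}
```

The count is 1 inside the call and 0 on the next line. A closure that stored the reference would be holding on to an object that no longer exists by the time it is invoked.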
The fact that you know the make_appender definition is bad does not mean every C++ user knows it as well. It's also impossible for you to know ALL the possible bad, UB-inducing C++ code out there.

I think the point the author tries to make is that, while C++ and Rust are probably "the same" for the most skillful and disciplined programmers (such as you), for the average human Rust simply catches far more of their errors early. An extreme analogy would be trying to climb Everest all by yourself vs with a professional guide.

Not to disagree necessarily, but if you (give me a little rope) bucket languages that are hard to get past the compiler but more often correct when you do (Rust, Haskell), and languages where it's pretty easy to get something past the compiler and tweak it until it works well enough for your purposes (JS, C/C++), the tweak-it-until-it-kinda-works languages are fucking killing it on adoption.
I dunno man. I've done C, C++, JavaScript and TypeScript professionally for significant chunks of my career, and the trend that I've observed has overwhelmingly been towards stricter compilers. For example in the front-end world, TypeScript has absolutely exploded in adoption. Everyone could be still using JavaScript, but companies from startups to huge corporates have explicitly decided they want compile-time type safety -- often in "only" front-end code.

I suspect that the adoption of Rust as a system language is going slower because that's just the natural pace of embedded/systems development, not because of anything intrinsic to Rust. There are now 30k+ lines of Rust code [1] in the Linux kernel. C++ can't claim that.

[1] https://www.phoronix.com/news/Rust-v6-For-Linux-Kernel

Oh I think we probably see largely eye-to-eye. My weapon of choice when there are no other constraints is Haskell, and one reason I really like Rust is that I can get a lot of the Haskell features I like in a highly-performant setting. Most of the C++ I maintain these days is in whole or in part generated by Haskell. And if I have to write something fast by hand and it doesn't need to link to stuff I need, I reach for Rust generally.

I was also unaware that Rust has such significant penetration in the Linux kernel, and that's a place where I can see it really shining.

My first comment in this thread was something to the effect that Rust has tons of great stuff to offer, and that the memory safety argument is actually weaker than people think and probably not the only thing people should talk about.

The resulting gang-tackle is just one more data point that the community is still too small and evangelical for me to want to get involved past my proprietary Rust stuff.

Ha, Rust community is very energetic, but IMHO they largely put that energy to good use!

I’m using it for a new project and honestly I’m using it more for the modern tooling and easy C interop than the safety features, but I’m a fan overall. Think it’s a really good language.

> Most of the C++ I maintain these days is in whole or in part generated by Haskell

Can you expand on that? I'm currently researching something similar but lower level & lisp instead of Haskell. It would help to see some existing examples to figure out if it's worth it or not.

Ideally we'll just be able to open source it soon!

Basically we have a nice Haskell DSL for generating arbitrary C++, and we deal with lots of code you wouldn't want to write by hand (big nested switch statements and other kinds of state-machine logic, choices about loop unrolling, lots of template overloads, SIMD intrinsics that require immediate values, etc. etc.) so we write Haskell that generates C++ and feeds it to e.g. `clang`.

Some of this is directly in Haskell, and some of it is little compilers mostly done using Megaparsec. It's a really nice approach where it fits!

> it's pretty easy to get something past the compiler and tweak it until it works

That's just as true of Rust if you use clone(), Rc<> and RefCell<>. You just have to familiarize yourself with a few boilerplate patterns, and the best part is you're only trading off a modicum of performance while preserving safety. But Rust can work quite well as a language for exploratory programming.

> the tweak-it-until-it-kinda-works languages are fucking killing it on adoption.

Industry clearly prioritises speed of delivery above all else. Security, reliability and maintenance are future problems (that would be nice to have).

However, in order to gain the power to reason about our code and be able to prove the correctness of various properties (what a typechecker does), we have to program with more restrictive models. This has been argued many times before (e.g. Dijkstra's structured programming). Haskell and Rust are just two of many examples. My favourite is regular expressions, choose actual proper regexes and you have guaranteed O(n) execution, choose Perl/Python "Regex" and you have a potential security hole.

JS and C++ are killing it on adoption because they had near monopolies for an extended period of time in their respective areas (browser, native higher level language). They are popular despite their obvious (some in hindsight) shortcomings due to lack of alternatives in the same categories.
JavaScript’s adoption has much less to do with its language features than with its unique positioning.

There wasn’t (and still isn’t) a competitor that can compete on the same level with JavaScript in the browser.

Literally anybody can turn on warnings and heed the results.
You can add [[clang::lifetimebound]] on the suffix parameter and you get

  <source>:21:35: warning: temporary whose address is used as value of local variable 'append34' will be destroyed at the end of the full-expression [-Wdangling]
    auto append34 = make_appender({3, 4});
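For anyone who wants to try this, here is a sketch of where the annotation goes. The `LIFETIME_BOUND` macro is a made-up wrapper so the same code still compiles on compilers that do not know the Clang attribute, and the body is a simplified stand-in for the article's function. The check below uses a named vector, which is the well-defined case.

```cpp
#include <cassert>
#include <vector>

// Expands to the Clang attribute where available and to nothing elsewhere.
// LIFETIME_BOUND is a hypothetical macro name for this sketch.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::lifetimebound)
#    define LIFETIME_BOUND [[clang::lifetimebound]]
#  endif
#endif
#ifndef LIFETIME_BOUND
#  define LIFETIME_BOUND
#endif

// The closure still captures by reference; the annotation tells Clang that
// the return value must not outlive `suffix`.
auto make_appender(std::vector<int> const& suffix LIFETIME_BOUND) {
    return [&suffix](std::vector<int> items) {
        items.insert(items.end(), suffix.begin(), suffix.end());
        return items;
    };
}

void check_named_suffix() {
    std::vector<int> suffix{3, 4};  // outlives the closure, so this is fine
    auto append34 = make_appender(suffix);
    assert((append34({1, 2}) == std::vector<int>{1, 2, 3, 4}));
    // auto bad = make_appender({3, 4});  // the call the warning is about
}
```

The annotation does not change behavior. It only gives the compiler enough information to diagnose the commented-out call.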
You don't need annotations, cppcheck already warns with:

    test.cpp:16:45: error: Using object that is a temporary. [danglingTemporaryLifetime]
        assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB
                                                ^
    test.cpp:3:12: note: Return lambda.
        return [&](std::vector<int>&& items) {
               ^
    test.cpp:2:50: note: Passed to reference.
    auto make_appender(std::vector<int> const& suffix) {
                                                     ^
    test.cpp:4:36: note: Lambda captures variable by reference here.
            return append(move(items), suffix);
                                       ^
    test.cpp:15:35: note: Passed to 'make_appender'.
        auto append34 = make_appender({3, 4});
                                      ^
    test.cpp:15:35: note: Temporary created here.
        auto append34 = make_appender({3, 4});
                                      ^
    test.cpp:16:45: note: Using object that is a temporary.
        assert((std::vector<int>{1, 2, 3, 4} == append34({1, 2}))); // FAIL: UB
As awesome as that is, cppcheck doesn't seem ready for real-world use. Literally the first invocation I ran resulted in this error:

  error: Syntax Error: AST broken, binary operator '!=' doesn't have two operands. [internalAstError]
   explicit operator bool() const { return this->get() != pointer(); }
This is for a simple wrapper class that looks like this:

  template<class T>
  class Foo : private Bar<T> {
  public:
   typedef value_type *pointer;
   pointer get() const { return ...; }
   explicit operator bool() const { return this->get() != pointer(); }
  };
Another example:

  void foo(uintptr_t const (&input)[2]) {
   if constexpr (sizeof(uintptr_t) == sizeof(int) && sizeof(long long) == 2 * sizeof(int)) {
    long long value;
    memcpy(&reinterpret_cast<uintptr_t *>(&value)[0], &input[0], sizeof(input[0]));
    memcpy(&reinterpret_cast<uintptr_t *>(&value)[1], &input[1], sizeof(input[1]));
   }
  }

  error: The address of local variable 'value' is accessed at non-zero index. [objectIndex]
    memcpy(&reinterpret_cast<uintptr_t *>(&value)[1], &input[1], sizeof(input[1]));
                                                 ^
A function returning a value that depends on the lifetime of the function's parameter is not crazy at all. Every class getter method that returns a reference to a member of the class does this.
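A sketch of the getter case. The reference-returning overload is the pattern described above. The `&&` overload is an extra technique, not something mentioned in this thread, that some codebases use so a getter called on a temporary hands back a value instead of a reference into a dying object. `Person` is an invented class.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Called on an lvalue: hand out a reference tied to the lifetime of *this.
    const std::string& name() const& { return name_; }

    // Called on an rvalue (a temporary Person): move the member out, so the
    // caller never holds a reference into a destroyed object.
    std::string name() && { return std::move(name_); }

private:
    std::string name_;
};
```

With only the first overload, `const std::string& n = Person("Grace").name();` compiles and dangles. With both, the same line binds to a returned value whose lifetime is extended by the reference.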
Sure. But returning the address of a stack-allocated object is (usually) broken.

There isn't a right and wrong here: sometimes you want to opt in to the check for that, sometimes you want to opt out of it.

Sometimes I actually do want to fuck with addresses on the stack in weird, potentially architecture-dependent ways, it's rare but it happens.

I happen to think that Rust's linear/affine typing is by far the most usable low/zero-cost memory management model that anyone has demonstrated at scale and a real achievement in practical computer science, but it comes at a pretty serious cost in `Box`-this and `Arc`-that and `Rc`-other-thing and generally the borrow-checker being a PITA about some stuff we're used to doing.

Rust is very cool and I use it, but the "using C/C++ is fucking strangers without protection"-vibe got old years ago.

> returning the address of a stack-allocated object is (usually) broken.

What matters is the lifetime, where the object lives is a rule of thumb for guessing lifetime that results from C++ trauma.

Hint: getters are code smell.

Getters that return references are code stink.

I was trying to be a little more diplomatic in my sibling reply because I've locked horns with the Rust community before and not enjoyed it, but you're not wrong.
It's possible the author actually made this mistake, and because C++ is C++ they didn't realize why/how. Not a lot of programmers have a good handle on C++, maybe even most don't... Hence these other languages.
If the author deliberately chose to compile with warnings turned off, in order to present an example that would crash, then that tells us more about the author than about the point.