by jcelerier 1513 days ago
C++ objects are values.

E.g. in

    std::string str;
str is never null, and always a valid object, because it's a value, not a pointer like in most GC'ed or dynamic languages.

Wait, but what if you have

    std::string str1{"Apple"};
    std::string str2 = std::move(str1);

Isn't str1 in an invalid (err, unspecified?) state and you shouldn't use it? It's not `null` sure, but it's not good to use.

This issue is caught and flagged by clang-tidy. Technically, though, the standard says the moved-from string is in a valid but unspecified state. Because of the small-string optimisation, the move may be implemented as a copy, so one should not rely on the contents being empty.
clang-tidy covers only the most trivial use-after-move scenarios. It's useful coverage, but it isn't and never will be complete like the Rust borrow checker.
No, you can absolutely use it (but you most likely want to clear() it or assign an empty string to it before doing e.g. push_back, as a move does not necessarily clear the moved-from object: std::move(some_int) won't clear the int).
I mean, it won't have a useful value, so I need to reassign to it, at which point I could just be using a new variable, right?

It might not be null, but I can't do much more with it than I can with null. I need to assign something new first.

> I mean, it won't have a useful value, so I need to reassign to it, at which point I could just be using a new variable, right?

I have a hard time understanding why "I could just be using a new variable" follows from "it won't have a useful value".

It depends on what you mean by "object". For example, POD types share syntax with classes but are not always initialized.

What's better is that sanitizers don't detect this form of undefined behavior! (Clean on memory, address, undefined.)

https://godbolt.org/z/7jsxnMs8E

also relevant: https://i.imgur.com/3wlxtI0.gifv

By object, I mean what https://eel.is/c++draft/intro.object defines. POD types, even uninitialised, are still objects.

Also, being uninitialized does not by itself mean invalid, does it? As for your example, I wonder whether it would make sense to flag it: printf could be implemented in assembly or Fortran for all we know (a few popular libc implementations are written in C++, for instance), so I don't know how much sense it would make to check its internal use of the pointed-to value against the C++ rules. I'd assume the outcome would be different with std::format or std::cout, for instance.

My complaint is that it is too easy to create an object with uninitialized members by accident, because the syntax is identical. I guess it's too late to add affordances there.

I suppose a definition of valid that prohibited uninitialized members would preclude lots of useful stuff like container and buffer types.

From the msan documentation [1], the flaw with my earlier example is that `printf` isn't instrumented.

And to address your other question, I don't know if msan can instrument a function implemented in assembly. It definitely can't deal with something it didn't compile as the instrumentation is added during compilation.

It seems that on godbolt the platform library also isn't instrumented because the equivalent iostream code is msan clean [2]. I suppose that makes sense as it's allowing you to pass arbitrary options to the compiler.

In summary, msan can detect these uninitialized reads but it requires quite a lot of fiddling.

[1]: https://clang.llvm.org/docs/MemorySanitizer.html#handling-ex...
[2]: https://godbolt.org/z/Gsxsfn9GT

I think there could also be issues with what godbolt prints; this example: https://godbolt.org/z/x9W3xGeeb

shows nothing on godbolt, but when run on my local machine it yields the expected report:

    $ clang++ bar.cpp -fsanitize=memory -DFMT_HEADER_ONLY=1 -std=c++20
    $ ./a.out
    42
    ==269465==WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x55ba20d00c8e in fmt::v8::appender fmt::v8::detail::write<char, fmt::v8::appender, int, 0>(fmt::v8::appender, int) (/tmp/a.out+0xb9c8e)
        #1 0x55ba20cff0ac in fmt::v8::appender fmt::v8::detail::default_arg_formatter<char>::operator()<int>(int) (/tmp/a.out+0xb80ac)
        #2 0x55ba20d6ef67 in char const* fmt::v8::detail::parse_replacement_field<char, void fmt::v8::detail::vformat_to<char>(fmt::v8::detail::buffer<char>&, fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<std::conditional<std::is_same<fmt::v8::type_identity<char>::type, char>::value, fmt::v8::appender, std::back_insert_iterator<fmt::v8::detail::buffer<fmt::v8::type_identity<char>::type> > >::type, fmt::v8::type_identity<char>::type> >, fmt::v8::detail::locale_ref)::format_handler&>(char const*, char const*, void fmt::v8::detail::vformat_to<char>(fmt::v8::detail::buffer<char>&, fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<std::conditional<std::is_same<fmt::v8::type_identity<char>::type, char>::value, fmt::v8::appender, std::back_insert_iterator<fmt::v8::detail::buffer<fmt::v8::type_identity<char>::type> > >::type, fmt::v8::type_identity<char>::type> >, fmt::v8::detail::locale_ref)::format_handler&) (/tmp/a.out+0x127f67)
        #3 0x55ba20cfaca9 in void fmt::v8::detail::vformat_to<char>(fmt::v8::detail::buffer<char>&, fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<std::conditional<std::is_same<fmt::v8::type_identity<char>::type, char>::value, fmt::v8::appender, std::back_insert_iterator<fmt::v8::detail::buffer<fmt::v8::type_identity<char>::type> > >::type, fmt::v8::type_identity<char>::type> >, fmt::v8::detail::locale_ref) (/tmp/a.out+0xb3ca9)
        #4 0x55ba20cf8a57 in fmt::v8::vprint(_IO_FILE*, fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >) (/tmp/a.out+0xb1a57)
        #5 0x55ba20cf86c2 in fmt::v8::vprint(fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >) (/tmp/a.out+0xb16c2)
        #6 0x55ba20cf7cce in main (/tmp/a.out+0xb0cce)
        #7 0x7f45c65b330f in __libc_start_call_main libc-start.c
        #8 0x7f45c65b33c0 in __libc_start_main@GLIBC_2.2.5 (/usr/lib/libc.so.6+0x2d3c0)
        #9 0x55ba20c6a444 in _start (/tmp/a.out+0x23444)
    
    SUMMARY: MemorySanitizer: use-of-uninitialized-value (/tmp/a.out+0xb9c8e) in fmt::v8::appender fmt::v8::detail::write<char, fmt::v8::appender, int, 0>(fmt::v8::appender, int)
    Exiting
`fmt::print` isn't using the uninitialized variable in your example. After correcting that, it produces a report. https://godbolt.org/z/97bo3YTbc
Whoops, good catch.
But that would compile and run, and you'd still get garbage if you tried to access str later, correct?

In Swift you are not allowed to declare a variable without assigning some value or nil to it before the current scope ends.

I don't think that's true. The default constructor for std::string creates an empty string. See: https://www.cplusplus.com/reference/string/string/string/

The semantics of C++ object initialization are such that, for a class type like std::string, a constructor will always be called.

No, either you get the previous content of the string (if it was under the small-string optimization limit) or a new empty string, but never garbage.
In C++, you can prevent that with compiler options.