| HN Mirror



	by varajelle 1264 days ago
	Regarding the SLOC count, the default automated Rust formatting tool is very eager to add lots of lines by basically keeping only one word per line. Something I'm not a fan of, I must say.

2 comments

sph 1264 days ago

It usually does that on iterator chains, which AFAIK do not exist as such in C++, so multiple operations would be expressed as multiple imperative statements.

My C++ is rusty (no pun intended), but I struggle to imagine the C++ variant of `vector.iter().map().collect()` being as concise or fitting in fewer than 4 lines.
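For readers who have not seen the pattern, here is a minimal sketch of the kind of iterator chain being discussed. The function names and data are made up for illustration. As far as I know, rustfmt's default settings keep a short chain on one line and break a longer one into one method call per line, which is the "one word per line" look mentioned above.

```rust
/// Hypothetical helper: a short chain that fits on one line.
fn squares(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v * v).collect()
}

/// Hypothetical helper: a longer chain, laid out one call per line
/// in the style rustfmt typically produces for long chains.
fn even_squares_over_ten(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        .filter(|v| *v % 2 == 0) // keep even inputs only
        .map(|v| v * v) // square each remaining value
        .filter(|sq| *sq > 10) // drop small squares
        .collect()
}
```

Each `.method()` line is a lazy adapter; nothing runs until `collect()` consumes the chain, so the line count reflects formatting rather than extra work.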

I wonder if OP's C++ port doesn't use iterators that much, and how idiomatic it is.

EDIT: the code is not idiomatic at all.


strager 1264 days ago

> I wonder if OP's C++ port doesn't use iterators that much, and how idiomatic it is.

I think I only used iterators in places where there's no built-in function on slices like C++'s strchr and strspn. (I think Rust's str has these, but not [u8].) For example:

C++: https://github.com/quick-lint/cpp-vs-rust/blob/f8d31341f5cac...

    std::size_t length = std::strcspn(c, separators);
    if (c[length] == '\0') {
      return found_separator{.length = length,
                             .which_separator = static_cast<std::size_t>(-1)};
    }
    const char* separator = std::strchr(separators, c[length]);

Rust: https://github.com/quick-lint/cpp-vs-rust/blob/f8d31341f5cac...

    match s
        .as_bytes()
        .iter()
        .position(|c: &u8| separators.contains(c))
    {
        None => FoundSeparator {
            length: s.len(),
            which_separator: INVALID_WHICH_SEPARATOR,
        },
        Some(length) => {
            let found_separator: u8 = unsafe { *s.as_bytes().get_unchecked(length) };
            match separators.iter().position(|c: &u8| *c == found_separator) {
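The Rust excerpt above is cut off, so here is a self-contained sketch of the same idea using only safe code. This is not the repository's actual function: the name `find_first_separator`, the struct layout, and the choice of `usize::MAX` for `INVALID_WHICH_SEPARATOR` (mirroring the C++ `static_cast<std::size_t>(-1)`) are all assumptions for illustration.

```rust
/// Assumed sentinel, mirroring `static_cast<std::size_t>(-1)` in the C++ excerpt.
const INVALID_WHICH_SEPARATOR: usize = usize::MAX;

/// Assumed shape of the result; the real struct may differ.
#[derive(Debug, PartialEq)]
struct FoundSeparator {
    /// Number of bytes before the first separator (or the whole length).
    length: usize,
    /// Index into `separators` of the byte that was found.
    which_separator: usize,
}

/// Hypothetical stand-in for the strcspn + strchr pair.
fn find_first_separator(s: &str, separators: &[u8]) -> FoundSeparator {
    let bytes = s.as_bytes();
    // Like strcspn: index of the first byte that is one of the separators.
    match bytes.iter().position(|c| separators.contains(c)) {
        None => FoundSeparator {
            length: s.len(),
            which_separator: INVALID_WHICH_SEPARATOR,
        },
        Some(length) => {
            // Safe indexing instead of get_unchecked; `length` is in bounds
            // because `position` returned it for this same slice.
            let found = bytes[length];
            // Like strchr: which separator was it?
            let which_separator = separators
                .iter()
                .position(|c| *c == found)
                .unwrap_or(INVALID_WHICH_SEPARATOR);
            FoundSeparator {
                length,
                which_separator,
            }
        }
    }
}
```

For example, `find_first_separator("key=value", b"=:")` reports length 3 and separator index 0. The bounds check this adds over `get_unchecked` may or may not be optimized away; I have not measured it.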


mgaunard 1264 days ago

Of course it exists in C++, and has done since before Rust even existed.

Syntax is usually `vector | map | collect`.


strager 1264 days ago

> Of course it exists in C++, and has done since before Rust even existed.

Not in C++'s standard library until C++20.


mgaunard 1264 days ago

Things don't need to be standardized in an ISO document to exist and be readily available.

I remember using it as early as 2008.


sph 1264 days ago

Wow, my C++ knowledge is even worse than I thought. I didn't know it had "pipelines".

https://en.cppreference.com/w/cpp/ranges


maleldil 1264 days ago

It's not "pipelines". It's just an overloaded bitwise-or operator.
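To make the "just an overloaded operator" point concrete, here is a rough analogy in Rust rather than C++. It is a toy, not how C++ ranges are implemented: the `Pipe` wrapper and `pipeline_demo` are hypothetical names, and the only real API used is the standard `std::ops::BitOr` trait.

```rust
use std::ops::BitOr;

/// Hypothetical wrapper type; it exists only so `|` can be overloaded on it.
struct Pipe<T>(T);

/// `Pipe(value) | stage` applies `stage` to the wrapped value
/// and wraps the result so further stages can be chained.
impl<T, R, F> BitOr<F> for Pipe<T>
where
    F: FnOnce(T) -> R,
{
    type Output = Pipe<R>;

    fn bitor(self, stage: F) -> Pipe<R> {
        Pipe(stage(self.0))
    }
}

/// Hypothetical demo: keep even numbers, then multiply by ten.
fn pipeline_demo(input: Vec<i32>) -> Vec<i32> {
    let Pipe(result) = Pipe(input)
        | (|v: Vec<i32>| v.into_iter().filter(|x| x % 2 == 0).collect::<Vec<i32>>())
        | (|v: Vec<i32>| v.into_iter().map(|x| x * 10).collect::<Vec<i32>>());
    result
}
```

The `|` here is ordinary bitwise-or syntax dispatched to a user-defined `bitor`, so whether you call the result a "pipeline" is mostly a naming question. Unlike this eager toy, real range adaptors are usually lazy.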


gpderetta 1264 days ago

> It usually does that on iterator chains, which AFAIK do not exist as such in C++, so multiple operations would be expressed as multiple imperative statements.

https://en.cppreference.com/w/cpp/ranges

Before C++20, similar functionality was available in Boost.


strager 1264 days ago

> the default automated Rust formatting tool is very eager to add lots of lines by basically keeping only one word per line.

This is not my experience.

Lifetime and '&mut self' noise (and four-space indentation) did cause rustfmt to sometimes split function signatures across multiple lines, but overall, I think rustfmt did a good job.

C++: https://github.com/quick-lint/cpp-vs-rust/blob/f8d31341f5cac...

    lexer::parsed_identifier lexer::parse_identifier(const char8* input,
                                                     identifier_kind kind) {
      const char8* begin = input;
      const char8* end = this->parse_identifier_fast_only(input);
      if (*end == u8'\\' || (kind == identifier_kind::jsx && *end == u8'-') ||
          !this->is_ascii_character(*end)) {
        return this->parse_identifier_slow(end,
                                           /*identifier_begin=*/begin, kind);
      } else {
        return parsed_identifier{
            .after = end,
            .normalized = make_string_view(begin, end),
            .escape_sequences = {},
        };
      }
    }

Rust: https://github.com/quick-lint/cpp-vs-rust/blob/f8d31341f5cac...

    fn parse_identifier(
        &mut self,
        input: *const u8,
        kind: IdentifierKind,
    ) -> ParsedIdentifier<'alloc, 'code> {
        let begin: *const u8 = input;
        let end: *const u8 = self.parse_identifier_fast_only(input);
        let end_c: u8 = unsafe { *end };
        if end_c == b'\\'
            || (kind == IdentifierKind::JSX && end_c == b'-')
            || !is_ascii_code_unit(end_c)
        {
            self.parse_identifier_slow(end, /*identifier_begin=*/ begin, kind)
        } else {
            ParsedIdentifier {
                after: end,
                normalized: unsafe { slice_from_begin_end(begin, end) },
                escape_sequences: None,
            }
        }
    }
