Hacker News
by Varriount 718 days ago
I can understand braces (to an extent...). What really confuses me are new languages that still require semicolons at the end of expressions/statements.

Speaking from a Rust-perspective, having semicolons at the end of statements makes perfect sense and is a brilliant design decision.

Note that I said 'statements', not 'expressions'.

A lot of the confusion here (and maybe yours, too) stems from this difference. In Rust, (almost) everything is an expression by default, and you turn it into a statement by adding a semicolon. This allows you (and the type checker) to very neatly distinguish between expressions and statements, which is great. It's a very nice and elegant approach imo.

Yes, but:

- Is having everything be an expression by default a good design choice? A not uncommon idea in programming language design is that certain language constructs (for example, variable assignments) are better off as only statements, as their potential for confusion outweighs their usefulness.
- Why should the burden of additional syntax be placed on the most common scenario? Statements (or the desire for statement-like behavior) tend to be far more common than lone expressions. Why not require lone expressions to be specially marked, rather than statements?
- Do the benefits of this approach outweigh its costs? Is elegance a desirable trait, and what is its importance relative to other values, such as clarity (how accurately and easily is the writer's intention conveyed to readers) and user experience (what is the potential for this syntax to be forgotten or misused).

Personally, I much prefer the design Go uses (where a semicolon is implicitly added at the end of a line whose final token is an identifier, a literal, one of the keywords `break`, `continue`, `fallthrough`, or `return`, or one of `++`, `--`, `)`, `]`, or `}`).

Does the difference ever come up in a meaningful way? Is there a syntax that without the trailing semicolon is ambiguous between the two, and the difference changes the behavior?

    let x = {
        3
    };
    let y = {
        3;
    };
    assert_eq!(x, 3);
    assert_eq!(y, ());
i think it'd also mean having to parse whitespace or newlines without something like that?
Thanks, I hate it. Is there any time where you want to use this? Like if Rust worked like other languages and just figured out if it was an expression or statement are there times where you would need an override?

I'm playing around with this in the compiler explorer and I'm now even more confused why you want this.

   let x = {
     println!("Don't mind me in the struct definition");
     3
   }
   assert_eq!(x, 3);
One of the things going on here is consistency.

It's not so much that this ability is specifically put in for some reason. It's just something that falls out of several other things.

Rust chose a "curly braces and semicolons" syntax because that's the sort of syntax that is normal in the sorts of PL spaces Rust wants to be used in. I am not sure exactly why being expression-oriented was chosen, but if I had to guess, it would be due to that just generally being considered a better option among many people at the time it was chosen.

So okay, you want expressions, and you want semicolons. Therefore, "you separate expressions with semicolons" is a pretty natural outcome. And since we're expression oriented, "a block is an expression that evaluates to the final value" is near tautological. And since it's an expression, it can go anywhere an expression can go.

Not being able to do this would mean creating specific restrictions against it, and then having to memorize when things don't follow the usual rules. That's more complicated than just letting expressions be expressions.

(also your let is missing a semicolon)

> I'm playing around with this in the compiler explorer and I'm now even more confused why you want this.

It works with RAII to create a temporally isolated resource scope, think context managers / using / try-with-resource; it provides a scratch space where you limit collisions with the rest of the function and avoid the risk of mis-reuse of those values; and before non-lexical lifetimes it was critical to limit borrow durations. It's also routinely used in combination with closures, to prepare their capture. Similar to C++ capture clauses, but without special syntax.

It's essentially a micro-namespace inside the function.

This kind of thing is useful for memory management. Anything you allocate within the scope that isn't returned gets dropped at the end.

You can use this to e.g. acquire a Mutex guard, move/clone something out of the mutex, and ensure it gets dropped as quickly as possible.

    let x = {
        let items = vec![1, 2, 3];
        items[1] // copied out of the vec since i32 implements the Copy trait
        // compiler inserts drop(items) here
    };
    // items is no longer valid
    assert_eq!(2, x);
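
For the Mutex guard case mentioned above, a minimal sketch could look like the following. The helper name `first_item` and the `Mutex<Vec<String>>` payload are made up for illustration, and the code assumes the vec is non-empty:

```rust
use std::sync::Mutex;

/// Hypothetical helper: clones the first element out of the mutex while
/// holding the lock only for the duration of the inner block.
/// Assumes the vec is non-empty (indexing panics otherwise).
fn first_item(shared: &Mutex<Vec<String>>) -> String {
    let first = {
        let guard = shared.lock().unwrap(); // lock acquired
        guard[0].clone() // tail expression: the clone becomes the block's value
        // `guard` is dropped at the closing brace, releasing the lock
    };

    // The lock is free again, so locking a second time does not deadlock.
    shared.lock().unwrap().push(first.clone());
    first
}

fn main() {
    let shared = Mutex::new(vec![String::from("a"), String::from("b")]);
    let first = first_item(&shared);
    assert_eq!(first, "a");
    assert_eq!(shared.lock().unwrap().len(), 3);
}
```

The block keeps the guard's lifetime visibly short without an explicit `drop(guard)` call.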
It also comes up in if/else blocks, which have exactly the same syntax and semantics (i.e. they are expressions, not statements):

    let condition = true;
    let x = if condition {
        println!("condition is true");
        5
    } else {
        println!("condition is false");
        10
    };
    assert_eq!(5, x);
edit: and of course, function blocks work exactly the same way! It's neat.
idk there are a lot of uses for it

    let dx = {
        let prev_x = x;
        x = get_x();
        x - prev_x
    };
often, it's slightly better cpp style scoping blocks if nothing else? there are tons of other little QoL things it enables though, but they're all going to be little ergonomics things that only seem worth it if you've used the language for awhile
Also common to set up "capture clauses" for lambdas as Rust does not have that in the language, IME most common with threads:

    spawn({
        let a = a.clone();
        let b = &b;
        move || {
            // do something with the a you cloned and the b you borrowed
        }
    })
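
A runnable variant of that pattern, as a rough sketch: `sum_on_thread` and the `Arc<Vec<i32>>` payload are hypothetical, chosen only so the example compiles on its own with `std::thread::spawn`.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: sums `data` on a worker thread while the caller
/// keeps its own handle to the same vec.
fn sum_on_thread(data: &Arc<Vec<i32>>) -> i32 {
    let handle = thread::spawn({
        // Setup block acting as a "capture clause": clone the Arc here so
        // the closure below moves the clone, not the caller's handle.
        let data = Arc::clone(data);
        move || data.iter().sum::<i32>()
    });
    handle.join().unwrap()
}

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    assert_eq!(sum_on_thread(&data), 6);
    // The original handle is still usable after the thread finished.
    assert_eq!(data.len(), 3);
}
```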
All the time. A block (in the general sense, so that includes function bodies) which ends with an expression will have that expression’s value as its own value (/ return). A `;` converts the trailing expression into a statement, which does not have a value, and thus the block returns `()` (the something representing nothing).

Method chaining is also common in Rust, because builders are common, and chains of iterator adapters are common, and chains on monadic structures (option/result) are common, … having every line break implicitly insert a `;` would be horrid.
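
To make the chaining point concrete, here is a small sketch of the two styles mentioned (iterator adapters and an `Option` chain). The function names `sum_even_squares` and `parse_port` and the `port=` input format are invented for the example:

```rust
/// Hypothetical helper: sums the squares of the even numbers in `values`.
/// Every line after the first starts with `.`, so a rule that ended the
/// statement at each newline would cut the chain after `values`.
fn sum_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

/// Hypothetical helper: parses strings of the made-up form "port=<number>".
fn parse_port(s: &str) -> Option<u16> {
    s.strip_prefix("port=")
        .map(str::trim)
        .and_then(|p| p.parse::<u16>().ok())
}

fn main() {
    assert_eq!(sum_even_squares(&[1, 2, 3, 4]), 20);
    assert_eq!(parse_port("port= 8080 "), Some(8080));
    assert_eq!(parse_port("host=example"), None);
}
```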

Wait, so this is legal:

    fn five() -> i32 {
        5
    }
and this isn't?

    fn five() -> i32 {
        5;
    }
I can't tell if I'm amazed or terrified.
this is what those end up being, it's pretty straightforward?

    fn five() -> i32 {
        return 5;
    }

    fn five() -> i32 {
        5;
        return ();
    }
semicolon changes the expression from returning the result of the expression to running the expression and returning the unit type. if you accidentally do that and specified a non-unit-return-type in the function signature, the type checker tells you about it:

    error[E0308]: mismatched types
     --> src/main.rs:1:14
      |
    1 | fn test() -> i32 {
      |    ----      ^^^ expected `i32`, found `()`
      |    |
      |    implicitly returns `()` as its body has no tail or `return` expression
    2 |     5;
      |      - help: remove this semicolon to return this value
which is also pretty clear about the solution
Yes.

> I can't tell if I'm amazed or terrified.

The rule is very simple and obvious, and the compiler will yell at you if you get it wrong.

It's also very useful, even critical, given how expression-oriented the language is: an `if/else` or a `match` must typecheck, so all branches have to have the same type. That's obvious if you're using it as an expression, but the requirement doesn't go away if you're using it for the side effect (as an imperative conditional/switch), and then things can get dicey because the expressions in each branch can have different types. `;` solves that by making every branch `()`-valued.
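
A small sketch of that situation, with a made-up `record` helper: `HashSet::insert` returns `bool` and `Vec::push` returns `()`, so the arms only agree on a type because each call ends in `;`.

```rust
use std::collections::HashSet;

/// Hypothetical helper: files `word` into one of two collections,
/// using `match` purely for its side effects.
fn record(word: &str, seen: &mut HashSet<String>, log: &mut Vec<String>) {
    match word.len() {
        0 => {}
        1..=3 => {
            log.push(word.to_string()); // `push` returns ()
        }
        _ => {
            // `insert` returns bool; the `;` discards it so this arm is
            // ()-valued like the others. Dropping the `;` is a type error.
            seen.insert(word.to_string());
        }
    }
}

fn main() {
    let mut seen = HashSet::new();
    let mut log = Vec::new();
    record("hello", &mut seen, &mut log);
    record("hi", &mut seen, &mut log);
    record("", &mut seen, &mut log);
    assert!(seen.contains("hello"));
    assert_eq!(log, vec!["hi".to_string()]);
}
```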

Okay, this is really neat design. I went to check whether if..else works in an expression context and it does:

    fn five() -> i32 {
        let a = if (true) { 3 } else { 6 };
        return a + 2;
    }
Pure fun!
What confuses me is languages that can split statements over multiple lines as long as certain conditions are met (such as when the break occurs inside braces). I'd rather have a semicolon at the end of the statement to make it more explicit.
This is where I stand. Semicolons convey intent, much like parens in math equations. Sure, the code makes it clear (if you understand the rules) what "the code does", but having semicolons (and parens) make it clear that "what the code does" and "what the writer intended the code to do" are the same thing.

Plus I don't think I've ever seen a case where semicolons made the code harder to read (or write).

But why a semicolon? Sentences have been terminated by periods for hundreds of years until programmers decided to be special snowflakes.
I like semicolons, because they let me use the brace style I prefer (Allman) instead of whatever the language devs think I should use. This is a really big issue for me trying to use Go, as they automatically insert a semicolon at the end of every line that doesn't end with a {, so the language forces you to use K&R, which I really dislike reading.
That's not really true though, I've used golang for years and never had semicolons added automatically.
The Go compiler inserts semi-colons. https://go.dev/ref/spec#Semicolons

If you follow those rules, you will see that this:

  if true
  {
      fmt.Println()
  }
  else
  {
      fmt.Println()
  }
Will get rewritten to:

  if true;
  {
      fmt.Println();
  };
  else
  {
      fmt.Println();
  };
And that won't work. That's why the braces need to be written as "} else {" and "if .. {".

It used to be that the compiler gave some pretty confusing errors about semi-colons on this, but it seems that's been improved now.

JavaScript has similar semi-colon insertion by the way, but with some different (more confusing) rules.
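
The first insertion rule in the spec linked above can be approximated in a few lines. This is only a toy sketch in Rust, not how the Go lexer is implemented: `go_inserts_semicolon` is a made-up name, it looks at the trailing characters of a line instead of real tokens, and it assumes the line contains no comments and no unusual numeric literals.

```rust
/// The 25 Go keywords. Note that `true` is a predeclared identifier,
/// not a keyword.
const KEYWORDS: &[&str] = &[
    "break", "case", "chan", "const", "continue", "default", "defer", "else",
    "fallthrough", "for", "func", "go", "goto", "if", "import", "interface",
    "map", "package", "range", "return", "select", "struct", "switch", "type",
    "var",
];

/// The only keywords after which a semicolon is inserted.
const TERMINATING: &[&str] = &["break", "continue", "fallthrough", "return"];

/// Toy approximation: would Go insert a semicolon at the end of `line`?
fn go_inserts_semicolon(line: &str) -> bool {
    let line = line.trim_end();
    if line.ends_with("++") || line.ends_with("--") {
        return true;
    }
    match line.chars().last() {
        Some(')' | ']' | '}') => true,
        // Closing quote of a string or rune literal.
        Some('"' | '`' | '\'') => true,
        Some(c) if c.is_alphanumeric() || c == '_' => {
            // Trailing word: an identifier, a number, or a keyword.
            let word = line
                .rsplit(|ch: char| !(ch.is_alphanumeric() || ch == '_'))
                .next()
                .unwrap_or("");
            !KEYWORDS.contains(&word) || TERMINATING.contains(&word)
        }
        // Empty line, `{`, `.`, `,`, binary operators, and so on.
        _ => false,
    }
}

fn main() {
    assert!(go_inserts_semicolon("if true")); // ends in an identifier
    assert!(!go_inserts_semicolon("{"));
    assert!(go_inserts_semicolon("    fmt.Println()"));
    assert!(go_inserts_semicolon("}"));
    assert!(!go_inserts_semicolon("foo.")); // trailing dot keeps the statement open
    assert!(go_inserts_semicolon("foo")); // so a leading `.bar()` on the next line fails
}
```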

That's really neat, it had escaped me till now. Thanks for the explanation (today I'm one of the lucky 10,000!)
I don't mean it actually adds them to the file itself, it's like it adds virtual ones during compilation. Try placing the { from the func main() { on a new line and it will fail to compile.
Oh, I see, that's what you meant. Thanks for clarifying :)
In general, what helps parsers for disambiguation also tends to help human readers for disambiguation. In addition, some amount of redundancy helps to prevent errors when (for example) two statements were intended but they are parsed as one, or vice versa. Furthermore, consistency helps avoid errors, such as always ending a statement with a semicolon and not only when it would otherwise be ambiguous.
Why is that so much more different from requiring '\n' instead of ';'?
Because as a human you're typing `\n` anyway, which fits in with the whole "Can't the computer just figure out what I'm seeing?" thing. While it's technically correct (the best kind of correct) that it's '\n' vs ';', in practice it's really '\n' vs ';\n'. I can only speak to my personal experience, but seeing two expressions on one line is so rare that I can't even think of the last time I saw it.
The problem is not two expressions on one line, it’s one expression on two lines. e.g.

    foo
        .bar()
now you need to either add syntax to specify that you’re “continuing” (python), play tricks for the parser (Go), or have unreliable magic biting you in the ass half the time (javascript).
It's not that difficult; just place the "." at the end of the line:

  foo.
    bar()
And that will work fine. At least in Go.

You can endlessly argue what location is better for the ".", but it really doesn't matter. Regardless, you don't really need to do parser tricks.

That's exactly what I mean by parser trick. You're relying on the idea that if the parser sees an incomplete expression it won't end the statement then and there.

It also looks like shit, because now you have to check the end of the previous line to know whether it's a method call or a function call which was indented incorrectly.

Sure but that doesn't quite answer my question: is the parser trick really that bad?

I do see what you mean, of course, and am aware of the arguments, but I've never found the lack of a non-whitespace expression terminator even remotely confusing. I can clearly see indentation in my periphery signalling that it's multi-line. I can also say that in ~13 years of exclusively using semicolon-less languages, I've never had or even heard of a problem caused by someone misunderstanding where an expression ends.

Anyway, this is getting pretty bikesheddy, lol but ya, to each their own, obviously.

With gofmt it's always indented correctly.
I understand it's likely stupid in Python since it's whitespace significant, but are those Go parser tricks really such a big deal or even very tricky?
Maybe I like the ‘\n’ for me and the ‘;’ for the compiler.
There should be only one source of truth.