by Animats 20 days ago
There's a discussion of "delayed bounds checking", but not "hoisted bounds checking", where bounds checking is done early. Consider

    let mut tab: [usize;100] = [0;100];
    ...
    for i in 0..101 {
        tab[i] = i;
    }
This must panic at i=100. Panic becomes inevitable at entry to the loop. Is the compiler entitled to generate a check that will panic at loop entry? The slides suggest that Rust does not hoist such checks, so with nested loops it has trouble getting checks out of the loop, which prevents vectorization.
3 comments

Currently LLVM cannot do that because the panic message includes the erroneous index. You can do it manually though if you add `_ = tab[100]`.
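
For a length that is only known at run time, the manual hoist could look roughly like this. It is a sketch, not something from the slides: `fill_hoisted` and its parameter `n` are made-up names. Note that it changes behavior in two ways: the panic fires before any element is written, and the message reports index `n - 1` rather than the first out-of-range index. Whether LLVM then drops the per-iteration checks is not guaranteed, so check the assembly.

```rust
/// Hypothetical helper: writes `i` into `tab[i]` for every `i` in `0..n`.
pub fn fill_hoisted(tab: &mut [usize], n: usize) {
    if n > 0 {
        // Manual hoist: touch the highest index once, up front.
        // If n > tab.len(), this panics before the loop writes anything.
        let _ = tab[n - 1];
    }
    for i in 0..n {
        // Still a checked access in the source; the early access above
        // gives the optimizer the fact it needs to prove i < tab.len().
        tab[i] = i;
    }
}
```
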

Even if the panic message did not include the index, LLVM would still be unable to do that if the previous iterations had side effects (for example, if `tab` is not a local variable).

On https://godbolt.org/ select Ada and compiler option "-O2"

    function Square(num : Integer) return Integer is
        tab : array (0..100) of integer;
    begin
        for i in 0..101 loop 
            tab(i):=i; 
        end loop;
        return tab(100);
    end Square;
The assembly code generated is :

    sub     rsp, 8    #,
    mov     esi, 11   #,
    mov     edi, OFFSET FLAT:.LC0     #,
    call    "__gnat_rcheck_CE_Index_Check"  #
The loop is not run and the exception handler is called directly.

Link : https://godbolt.org/z/qT4TsKPxz

Right, that's the extreme case, where the problem is detected at compile time. Unfortunately, it's not a user-visible error message at compile time.

Need to try an example where the size isn't known until run time.

Panics in Rust do not currently time-travel like that (including panics from failed bounds checks), and that's a good thing. The reason is that panicking does not imply terminating the process - they can be caught and handled, just like exceptions in C++. In fact, they use the same stack unwinding mechanism by default.
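
To make the "caught and handled" point concrete, here is a small sketch (the function name `run_and_catch` is invented for illustration). It assumes the default `panic=unwind`; with `panic=abort` the process terminates and nothing is caught. The table lives outside the closure, so the writes made before the panic are still visible afterwards, which is why the panic cannot simply be moved to loop entry here.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical demo: runs the loop from the top of the thread, catches
/// the panic, and returns the table as it looks afterwards.
pub fn run_and_catch() -> [usize; 100] {
    let mut tab = [0usize; 100];
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let tab = &mut tab[..];
        for i in 0..101 {
            tab[i] = i; // panics when i == 100
        }
    }));
    // The bounds check failed, so the closure did not finish normally.
    assert!(result.is_err());
    // The first 100 writes happened before the panic and are observable.
    tab
}
```
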

What the compiler is allowed to do is to shorten the loop by one and unconditionally panic after the loop, but this falls under the purview of the LLVM optimizer.
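
Written out by hand, that transformation would look something like the following. This is only a source-level illustration of what the optimizer may do internally; `fill_original` and `fill_shortened` are hypothetical names, and the panic text is copied from the standard bounds-check message.

```rust
/// The loop as written: the bounds check fails on the last iteration.
pub fn fill_original(tab: &mut [usize; 100]) {
    for i in 0..101 {
        tab[i] = i;
    }
}

/// Same observable behavior: every in-range write happens first,
/// then the panic fires unconditionally.
pub fn fill_shortened(tab: &mut [usize; 100]) {
    for i in 0..100 {
        tab[i] = i;
    }
    panic!("index out of bounds: the len is 100 but the index is 100");
}
```

Both versions leave the same contents in `tab` before panicking, so a caller that catches the panic cannot tell them apart.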

It's true that panics (unlike UB) cannot automatically time-travel, but your justification is weak. Recovering from panics can only prevent this optimization if the loop has side effects, and LLVM knows when panic=abort is set.
The post-panic situation is a problem in Rust. After a panic, you're in a somewhat abnormal state. Rust panics are not supposed to be a catchable exception system. If something other than program termination is in the near future, that's a problem.

That does create a problem for early panics, that is, panicking when a panic has become inevitable but has not happened yet. This deserves more thought.

I mean, sure, dead code elimination applies to all optimized code. The important thing to understand is that panicking in Rust does not get magic treatment by the Rust compiler. It’s just a function that is declared in the type system to never return.
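
As a rough illustration of "a function that never returns" (the names `out_of_bounds` and `get` are made up; this is not how the standard library spells its bounds check), the return type `!` is all the type system records about it:

```rust
/// Hypothetical failure path: the return type `!` says control never comes back.
fn out_of_bounds(index: usize, len: usize) -> ! {
    panic!("index out of bounds: the len is {} but the index is {}", len, index);
}

/// Hand-written checked read; the `else` arm type-checks because `!`
/// coerces to any type, here `usize`.
pub fn get(tab: &[usize], i: usize) -> usize {
    if i < tab.len() {
        tab[i]
    } else {
        out_of_bounds(i, tab.len())
    }
}
```
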
Once it shortens the loop, the compiler can also observe that `tab` is a local variable and therefore move the writes up "to the initializer." It can then see that the variable is unused and delete it, and also delete the loop.