Hacker News new | ask | show | jobs
by aw1621107 75 days ago
> For example you cannot design something that comes evwn close to expression templates libraries.

You keep saying this and it's still wrong. Rust is quite capable of expression templates, as its iterator adapters prove. What it isn't capable of (yet) is specialization, which is an orthogonal feature.

2 comments

Rust cannot take a const function and evaluate that into the argument of a const generic or a proc macro. As far as I can tell, the reasons are deeply fundamental to the architecture of rustc. It's difficult to express HOW FUNDAMENTAL this is to strongly typed zero overhead abstractions, and we see where Rust is lacking here in cases like `Option` and bitset implementations.
> Rust cannot take a const function and evaluate that into the argument of a const generic

Assuming I'm interpreting what you're saying here correctly, this seems wrong? For example, this compiles [0]:

    const fn foo(n: usize) -> usize {
        n + 1
    }

    fn bar<const N: usize>() -> usize {
        N + 1
    }

    pub fn baz() -> usize {
        bar::<{foo(0)}>()
    }
In any case, I'm a little confused how this is relevant to what I said?

[0]: https://rust.godbolt.org/z/rrE1Wrx36

> Rust is quite capable of expression templates, as its iterator adapters prove.

AFAIU iterator adapters are not quite what expression templates are because they rely on the compiler optimizations rather than the built-in feature of the language, which enable you to do this without relying on the compiler pipeline.

I had always thought expression templates at the very least needed the optimizer to inline/flatten the tree of function calls that are built up. For instance, for something like x + y * z I'd expect an expression template type like sum<vector, product<vector, vector>> where sum would effectively have:

    vector l;
    product& r;
    auto operator[](size_t i) {
        return l[i] + r[i];
    }
And then product<vector, vector> would effectively have:

    vector l;
    vector r;
    auto operator[](size_t i) {
        return l[i] * r[i];
    }
That would require the optimizer to inline the latter into the former to end up with a single expression, though. Is there a different way to express this that doesn't rely on the optimizer for inlining?
Expression templates do not rely on optimizer since you're not dealing with the computations directly but rather expressions (nodes) through which you are deferring the computation part until the very last moment (when you have a fully built an expression of expressions, basically almost an AST). This guarantees that you get zero cost when you really need it. What you're describing is something keen of copy elision and function folding though inlining which is pretty much basics in any c++ compiler and happens automatically without special care.
> since you're not dealing with the computations directly but rather expressions (nodes) through which you are deferring the computation part until the very last moment (when you have a fully built an expression of expressions, basically almost an AST).

Right, I understand that. What is not exactly clear to me is how you get from the tree of deferred expressions to the "flat" optimized expression without involving the optimizer.

Take something like the above example for instance - w = x + y * z for vectors w/x/y/z. How do you get from that to effectively

    for (size_t i = 0; i < w.size(); ++i) {
        w[i] = x[i] + y[i] * z[i];
    }
without involving the optimizer at all?
The example is false because that's not how you would write an expression template for given computation so the question being how is it that the optimizer is not involved is also not quite set in the correct context so I can't give you an answer for that. Of course that the optimizer is generally going to be involved, as it is for all the code and not the expression templates, but expression templates do not require the optimizer in the way you're trying to suggest. Expression templates do not rely on O1, O2 or O3 levels being set - they work the same way in O0 too and that may be the hint you were looking for.
> The example is false because that's not how you would write an expression template for given computation

OK, so how would you write an expression template for the given computation, then?

> Expression templates do not rely on O1, O2 or O3 levels being set - they work the same way in O0 too and that may be the hint you were looking for.

This claim confuses me given how expression templates seem to work in practice?

For example, consider Todd Veldhuizen's 1994 paper introducing expression templates [0]. If you take the examples linked at the top of the page and plug them into Godbolt (with slight modifications to isolate the actual work of interest) you can see that with -O0 you get calls to overloaded operators instead of the nice flattened/unrolled/optimized operations you get with -O1.

You see something similar with Eigen [2] - you get function calls to "raw" expression template internals with -O0, and you need to enable the optimizer to get unrolled/flattened/etc. operations.

Similar thing yet again with Blaze [3].

At least to me, it looks like expression templates produce quite different outputs when the optimizer is enabled vs. disabled, and the -O0 outputs very much don't resemble the manually-unrolled/flattened-like output one might expect (and arguably gets with optimizations enabled). Did all of these get expression templates wrong as well?

[0]: https://web.archive.org/web/20050210090012/http://osl.iu.edu...

[1]: https://cpp.godbolt.org/z/Pdcqdrobo

[2]: https://cpp.godbolt.org/z/3x69scorG

[3]: https://cpp.godbolt.org/z/7vh7KMsnv