Hacker News new | ask | show | jobs
by uecker 16 days ago
It can always specialize where you could do monomorphization.
1 comments

Not necessarily. In pure theory, yes. But in pure theory, monomorphized code can be deduplicated too.

Monomorphization often forces types information to flow, whereas type-erasure considers that a bonus. As such, a compiler may accidentally introduce opaque boundaries that hide type information. Even in cases where this is the programmer's fault and the type info is truly lost, in monomorphisation, that becomes an explicit error, while, for devirtualization, since it's just an optimization, the call is silently revirtualized (until someone notices it while profiling/decompiling). One has explicit intent, the other doesn't.

The point is much easier to propagate information (and not accidentally lose it) than to deduplicate after specialiation, so I do not agree that this is only of theoretical relevance.

I agree that monomorphization has explicit rules that force specialization, I do not think this is an advantage at all because it is too rigid and one can introduce explicitness at the language level also for devirtualization where this is needed. In this sense, monomorphization is premature optimization which limits what can be expressed at the language level and makes the job of the optimizer much harder (and practically impossible to undo the damage completely).

You seem to be underestimating how easy it is to lose type information. Creating a array of type-erased is already enough to cause confusion, even if the surrounding code is enough to prove the array only has one type. If we are to assume perfect deductive ability for type-erasure, why not for monomorphism as well?

Monomorphization isn't premature optimizations either. It and type-erasure are semantically different. For instance, only monorphization can enable inlining of data structures (as devirtualizating an array of fat pointers involves dealing with ownership concerns, among other things). You either have to type-erase the array itself or box all the items, none of which are very ergonomic for a system-level programming. Also, in terms of performance, monomorphization generally makes the compiler's job easier overall in comparison to type-erasure (specially in Rust where types also encode lifetime), so I don't see the damage unless you are dealing with an extremely size-constrained environment or dealing with instruction cache misses (which is an indication of poor abstractions).

Finally, as stated elsewhere, it's trivial to implement any type-erasure polymorphism in a monomorphic system (save for very specific exceptions, related to hardware), but the reverse is not true. To do so reliably will require assistance of an external codegen tool, like the C macros, generally recreating monomorphism, but worse, as the compiler doesn't actually know the polymorphic code is related, making deduplication even harder.

I don't see why it would easy to lose type information in circumstances where you can do monomorphization. Note that I do not assume "perfect deductive ability". The practical observation is simply that it is far more effort to deduplicate than to clone.
I am not sure why either (I believe it is aliasing), here's an example: https://godbolt.org/z/9szhxoxrf

From the code, we can already tell all elements have the same type, but the compiler is already adding checks. One of the options would be to convert the array of fat pointers into a fat slice. But now you have to enlarge the slice's vtable with all possible item operations, even if we don't use them often. Alternatively, use monomorphism. Or use both.

I see, this is what you meant. GCC can not currently do this. But this also not something that would correspond to a monomorphic version. A polymorphic function over a generic array would be specialized to one member type via monomorphization. This would correspond to this: https://godbolt.org/z/nd333xeTP