Hacker News new | ask | show | jobs
by estebank 1284 days ago
Reflection is usually not available in AoT compiled languages. The prevalent Rust coding styles rely heavily on monomorphic data types and functions, meaning there's nothing left to reflect at runtime. But if you want to deal with trait objects and need to access the underlying type, you need to use Any::downcast or rely on annotations on every type you want to reflect on. Or now, leverage DWARF info on Linux with deflect.
2 comments

That's runtime reflection.

Compile time reflection AFAIK is available in D and Zig, and is planned for C++.

"Compile time reflection" is an inconsistent and nonstandard concept; originally it seemed to just mean typeclasses for people who hadn't heard of typeclasses.
for weirdos who only have ad-hoc constraints instead of knowing what typeclasses are, it means that you can first say "I only have ad-hoc constraints" then say "wait wait I need to make decisions based on the specifics of the type" which may be useful for e.g. generating serializers and deserializers at compile time instead of using code generators like protobuf
Yup. I consider runtime reflection an antifeature, which has negative performance effects, is unsafe (see e.g. log4j) and leads to fragile code.

I would however welcome static reflection with open arms. In Rust in particular, I’d prefer it if derive was implemented using static reflection, rather than proc macros.

Derive or equivalent ought to be something you can implement on top of frunk (so you're ultimately still depending on a proc macro, but the whole ecosystem only needs to depend on that one macro, and tools etc. can build in support for it) - that's how it's usually done in Scala.
That's right. Nim does as well. It's amazing. Once you get used to having CTTI and being able to use it, it's hard to program without it. Bonus points if you can do basic dependent types too.

In C++ with SFINAE you can effectively do CTTI-style programming in C++. C++ has long had runtime type reflection as well (RTTI), though it needs to be compiled in. Looks like there's a boost library for CTTI.

The C++ reflection improves a lot in C++20, but it's still very limited compared to that aspect of Nim, or even Zig. The std::meta::info and "splices" based on Haskell for C++26 are incredibly exciting to me. I have many use cases in mind. Splices in combination with std::embed will make C++ basically just a bad Racket (but one with inline assembly!).
What are monomorphic data types? What should be my first read on the subject?
It's a fancy way of saying "every time this type is used, replace all the generic type params with what was used and generate code for it". It's how generics are implemented in Rust. If you have

    struct Foo<T>(T);
And you create Foo(42i32) and Foo(0.0f64), the compiler will create the equivalent to

    struct Fooi32(i32);
    struct Foof64(f64);
In other languages like Java, generics are implemented the way that Rust does "trait objects" (&dyn Trait).

Rust is not the only language that does this, to be clear.

If you're interested in a quick intro on the compiler side of this, you can read https://rustc-dev-guide.rust-lang.org/backend/monomorph.html

Expanding on trait objects: these are implemented as "V-Tables", structs holding pointers to the trait's methods and to the underlying type. This means that if you need to know what the underlying type, you have to do something fancy, usually referred to as "reflection". Also, invocation of generic functions that use V-Tables require "chasing pointers", which makes cache locality worse (because data might not be in the same cache read as the v-table itself), but makes the generated binary smaller (because if you have something like Foo<T> used with 1000 types, with monomorphization you end up with 2000 generated types in the binary, instead of 1001 with trait objects).
To add to this, even the Foo-wrapper is gone, just the i32 remains. Rust values are amorphous data blobs at runtime.
Yes, that's true but that is an implementation detail that only comes into play when dealing with ABI, and then you should be using #[repr(transparent)] to ensure that the compiler won't do something else :)
Sure, it’s good to point out the difference between “the behavior of a typical optimizing compiler” and “things actually guaranteed by the language”. The context of the discussion was the former, I think. I’m not even that certain that monomorphization is actually required in theory.
Yes, monomorphization isn't needed in theory, as long as the user-visible behavior remains the same, and in practice the team is exploring options[1] to identify cases where the currently manual practice of writing

    pub fn foo<T: AsRef<X>>(x: T) {
        inner_foo(x.as_ref());
    }
    fn inner_foo(_: &X) { todo!() }
can be instead done by the compiler automatically (turning monomorphized code back into polymorphic code, hence the polimorphization hame).

[1]: https://rustc-dev-guide.rust-lang.org/backend/monomorph.html...

ABI wise that is not true though. structs have struct ABI, even just a newtype struct around an integer will not use integer ABI unless annotated with #[repr(transparent)].
Pretty sure that some usage patterns of polymorphic types can not be completely monomorphized. Here's example in Golang:

    package main

    import (
        "fmt"
    )

    type wrapper[T any] struct {
        Value T
    }

    func (w wrapper[T]) String() string {
        return fmt.Sprintf("{%v}", w.Value)
    }

    func stringWrapped[T any](n int, v T) string {
        if n == 0 {
            return fmt.Sprintf("%v", v)
        }
        return stringWrapped(n-1, wrapper[T]{Value: v})
    }

    func main() {
        n := 0
        fmt.Scanf("%d", &n)
        result := stringWrapped(n, "test")
        fmt.Println(result)
    }
Go refuses to compile because it can't possibly generate all instances of wrapper[T] that this program may use: wrapper[string], wrapper[wrapper[string]], wrapper[wrapper[wrapper[string]]], etc.
Rust will complain about a recursion limit being reached during instantiation[1]. The solution in Rust is to use &dyn Trait or Box<dyn Trait> instead.[2]

[1]: https://play.rust-lang.org/?version=stable&mode=debug&editio...

[2]: https://play.rust-lang.org/?version=stable&mode=debug&editio...

^ This blows the stack because it keeps calling itself with no break condition, but shows how the type system accepted the code.

I think this is called polymorphic recursion in Haskell circles.

In C++ you can monomorphize as long as you can somehow prove the recursion terminates at compile time (for example by threading a static recursion counter).

Nice examples - you can also have languages (like SML) where monomorphization is simply an implementation detail. Some compilers (e.g., MLton) perform monomorphization and others don't.
I only recently realized that certain type system features, like polymorphic recursion, make monomorphization impossible in the general case. In Haskell for example, it’s by necessity only an optimization that’s used where applicable, and not the general strategy.
That depends on what you mean. SML has "polymorphism" boiling down to being able to plug an arbitrary type in some places, which is denoted like 'a. But when people talk about generics, they more often talk about C++ templates, Java generics, Rust traits, etc. whose SML equivalent are signatures, structs and functors. Signatures are a bit like Rust traits, structs are a bit like Rust implementations of traits, whereas functors are like Rust's "templates", i.e. wherever you swap angle brackets to parametrise something with types constrained by traits, or values constrained by types. Except in Rust this parametrisation can be slapped on a bunch of things. It can be on structs, on functions, on traits, on implementations of traits etc. In SML you need to group all the "parametrised" things into a struct (and a corresponding signature), which is going to be returned by a functor.

And now the thing is: with transparent signature ascriptions, functors are monomorphised in SML, instead of everything being hidden behind signatures (as is in the case of Rust with traits when you use dyn), which has semantic consequences. E.g. a struct returned by a functor may contain a type. You can't perform proper type-checking without monomorphising, because you don't know what the exact type is. E.g. in the following program, the final line couldn't be type-checked without monomorphisation:

   signature ITERABLE = sig
       type ElemT
       type SrcT
   
       val new_iter: SrcT -> unit -> ElemT option
   end
   
   signature LIST_ELEM_TYPE = sig
       type T
   end
   
   functor ListIterFun (ListElemType: LIST_ELEM_TYPE): ITERABLE = struct
       type ElemT = ListElemType.T
       type SrcT = ElemT list
   
       fun new_iter l = let val lr = ref l
                        in
                          fn () => case !lr of
                                     nil => NONE
                                   | (x::xs) => (lr := xs; SOME x)
                        end
   
   end
   
   structure IntElemType: LIST_ELEM_TYPE = struct
       type T = int
   end
   
   structure IntListIter = ListIterFun(IntElemType)
   
   val next = IntListIter.new_iter [1, 2, 3, 4, 5]
If I change the signature ascription on ListIterFun to an opaque ascription (:> ITERABLE), the final line won't type-check, because it's not obvious from the signature, that ElemT is int. So transparent signature ascriptions require monomorphisation (Rust traits without dyn), and opaque signature ascriptions free the compiler from having to do monomorphisation (Rust traits with dyn*).

There was a lot of discussion of this issue when Go was settling on a design for its generics, under the phrase "reified generics".

Not exactly the same thing but JITs can turn dynamic objects into structs if the structure is consistent. JS runtimes and Julia do this as far as I know.
Firefox's js runtime also do tricks like generate multi copy of optimized function when the function has multi call site instead make one with lots of if else. So it no longer suffer from the problem that function that frequently get multi different type of parameters from different call site has poor performance.

It's probably exactly how templates work, except the details are invisible to users.

https://hacks.mozilla.org/2020/11/warp-improved-js-performan...

Yes! Java as well. And this is how those languages can show impressive benchmarks for consistent workloads. In theory they can even surpass AoT languages. In practice it depends on the specifics.
Julia doesn't do this. It just has structs in the first place.
I think cpp does this too
It indeed does. The only difference is that Rust has traits (similar to C++'s concepts) which require explicit mention of what interface the type parameters have inside the function, whereas C++'s templates will have a compile error after instantiation if you passed something that didn't meet the expected contract. This is closer to Rust's macros in operation.

Given

    fn foo<T>(a: T, b: T) -> T { a + b }
The compiler will complain that you should have been explicit on how T is going to be used:

    error[E0369]: cannot add `T` to `T`
     --> src/lib.rs:1:32
      |
    1 | fn foo<T>(a: T, b: T) -> T { a + b }
      |                              - ^ - T
      |                              |
      |                              T
      |
    help: consider restricting type parameter `T`
      |
    1 | fn foo<T: Add<Output = T>>(a: T, b: T) -> T { a + b }
      |         +++++++++++++++++
whereas in C++ this would have been accepted until you called foo with two things that couldn't be added together, like a Rust macro[1].

[1]: https://play.rust-lang.org/?version=nightly&mode=debug&editi...