Hacker News new | ask | show | jobs
by nshepperd 2172 days ago
This strikes me as having some interesting similarity to the practice in Haskell of placing type class constraints on functions, not data types. A deprecated feature of Haskell used to allow you to write data types like:

    data Ord k => Map k v = ...
This defined a data type representing an ordered map from k to v, with the additional restriction that before you can talk about such a map you had to know that the type k is orderable (has an instance of the Ord type class which defines comparison operators).

This may be considered to be similar to the definition of (/) requiring that its second argument is nonzero.

This style was supplanted by leaving the data type definition 'bare' and placing the constraints on the API's functions instead:

    data Map k v = ...
    insert :: Ord k => k -> v -> Map k v -> Map k v
    lookup :: Ord k => k -> Map k v -> Maybe v
This style is similar to instead placing the nonzero-ness constraint on the theorems which define the API of (/).

In both cases there are basic engineering reasons for the switch:

- Having the constraints in the definition of terminology (Map, /) doesn't save any work, as you still need to prove those constraints to do anything or refer to it (in Haskell this meant sprinkling `Ord k =>` in the type of anything that referred to Map k v).

- In fact it results in doing more work than necessary, as you are sometimes forced to prove the constraints just to mention Map, even when what you are doing does not require them. For instance defining the empty map 'empty :: Map k v' shouldn't require you to know anything about k, because an empty tree doesn't contain any keys, let alone care about their ordering. In this case requiring Ord k in order to mention Map k v would be superfluous.

2 comments

Rust tends to take the second approach as well; it tends to place trait bounds on `impl` blocks, and not on the struct itself.

(This is by no means a standard across the community though; I'd say it's a 50/50 split)

A case where it is unavoidable is when you want to refer to an associated type in your struct declaration.

In most other cases, putting bounds on impls leads to less repetition of those bounds and better error messages!

Another related topic is trait methods with method-level type parameters that have bounds. Those have to be repeated on every usage site so it is usually preferable to go for a design that has the type parameters in the actual trait.

Instead of:

    trait Foo {
        fn foo<T: Bar>(&self, bar: T)
    }
do this instead:

    trait Foo<T> {
        fn foo(&self, t: T)
    }

    impl<T: Bar> Foo for XYZ {
        fn foo(&self, bar: T) {

        }
    }
One bound might seem fine in this example but once stuff needs to be Send, Sync and 'static is when it gets annoying.
You can still do

    data TC x where
      DC :: Ord x => x -> TC x
Yes, but that doesn't mean quite the same thing. With the constraint on the type (data Ord x => TC x = DC x) you can be sure that `x` implements `Ord` for any type `TC x`. In your GADT, however, the constraint only applies to the DC constructor, so `x` is only guaranteed to implement `Ord` when you actually have a value inside a `DC` constructor. If you don't match on the constructor then there might not be an `Ord` instance:

    -- `Ord x` exists when matching on DC constructor
    ex1 :: TC x -> x -> x -> Ordering
    ex1 (DC _) a b = compare a b

    -- Without constructor match, no Ord instance
    ex2 :: TC x -> x -> x -> Ordering
    ex2 _ a b = compare a b  -- ERROR!

    -- This is why the above ex2 definition is an error
    badUser :: x -> x -> Ordering
    badUser a b = ex2 undefined a b

    -- No `TC x` value also means no Ord instance even
    -- though no arguments are ignored or undefined
    data Proxy x = Proxy
    ex2 :: Proxy (TC x) -> x -> x -> Ordering
    ex2 Proxy a b = compare a b  -- ERROR!