Hacker News new | ask | show | jobs
by mruts 2496 days ago
I can't have much experience with dependent types, but I don't see how a compiler could ever tell if zip took two lists of the same length. Don't dependent types happen at run-time? It seems like you would have to solve the halting problem for them to work at compile time.
2 comments

It takes a while to grok but for example constant expressions would have type “List Int 7” but a list value from I/O might have type “List Int n” where n is a natural number that also exists at runtime. Now zip would be “forall a, b, n. List a n -> List b n -> List (a, b) n”. Right away you could do a self-zip with your I/O list because the compiler knows that n == n but if you want to zip two different such lists it gets more interesting.

Now you have n, m, and xs : List Int n, ys : List Char m, and you want to do zip xs ys but m doesn’t “unify” with n so the type is wrong. So you have to do an equality check at runtime which gives you a proof of equality or inequality. In the equal case you have a value of type “n == m” and then you can use a language feature to rewrite the type of xs (or ys) according to this equality.

This was written on a phone kind of hastily but maybe it makes some sense.

I've been learning the dependently typed functional language Idris over the past few weeks. The type-safe zip over Vects (linked lists with a length in the type) is a canonical example of compile-time proof that they must be the same length:

    zip : Vect n a -> Vect n b -> Vect n (a, b)
This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

Don't dependent types happen at run-time?

Dependent types in Idris occur at compile time and are erased as part of compilation. At runtime, those Vects do not carry around any length information.

It seems like you would have to solve the halting problem for them to work at compile time.

Idris addresses this by having a totality checker. Every function in Idris is marked as either total (known to halt, barring any bugs in the totality checker) or partial (possibly not halting). Only total functions are allowed in types. For example, the function append:

    append : Vect m a -> Vect n a -> Vect (m + n) a
This type uses addition as a type level function. Since (+) for natural numbers (which m and n are) is proven to be total when Idris compiles the library where it's defined, Idris allows it to be used as a type level function.

So how does the totality checker work? Wouldn't that purport to solve the halting problem? No, the totality checker in Idris isn't magic, it's conservative about what it considers to be total. In order for it to declare a function total, it must identify at least one argument that progresses towards a base case at every step. It can do this even with mutually recursive functions.

In practice, if you write an interactive program in Idris, you can make every function total except for main, which presumably contains an infinite loop that checks for input forever.

> This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

It's more accurate to say that Idris can guess an implementation of the function, but it's not the only possible implementation with that type. There's nothing in the type to stop the function from reversing one or both of the lists before the zipping stage, for example. More generally, it could randomize the order of elements in one or both of the supplied lists.

That's true, though it has no reason to do those other things. In general it tries to do the least amount of work possible to accomplish the goal.

It's interactive, though, and it's extremely fun to use. It gives you the feeling that you're solving a puzzle, and like a good puzzle game, it can do the obvious bits for you.

Yes, but the "least amount of work", as far as Idris is concerned, has no necessary relationship to what the programmer intended when they defined the type. It guesses the simplest thing that fits, but that's far from a guarantee of correctness.

It is a serious shortcoming that even when you pay the complexity cost of dependent types, you still can't specify the intent in this case. I mean, the semantics of "zip" can't be specified with a type. You can define a "sorted list" or "ordered list" type, but you cannot enforce maintainance of the natural, arbitrary order of list elements. In this sense dependent types are both too complex to reason about and simultaneously not powerful enough to fully specify the function.

One solution to ensuring that the order is preserved (that I've actually used in production Haskell code where this was a real risk) is to use a heterogeneous list instead of a Vec.

  rzipWith :: (forall a. f a -> g a -> h a) -> Rec f as -> Rec g as -> Rec h as
With this type I don't think you'd be able to rearrange the elements, but you do have to pay the cost of using Const everywhere, as in you can specialize approximately my function as follows (handwavy, won't compile):

  zipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
  zipWith f as bs = recToVec (rzipWith f' as' bs') where
    as' = vecToRec (fmap Const as)
    bs' = vecToRec (fmap Const bs)
    f' (Const a) (Const b) = Const (f a b)

  vecToRec :: Vec n (f a) -> Rec f (Replicate n a)
  recToVec :: Rec (Const a) as -> Vec (Length as) a
Rec is defined in Vinyl: https://hackage.haskell.org/package/vinyl-0.11.0/docs/Data-V...
> has no necessary relationship to what the programmer intended when they defined the type

Right, for instance I wouldn't be surprised if "the least amount of work" here amounted to reversing both lists as you zip.