Hacker News new | ask | show | jobs
by mrgriffin 2496 days ago
> there are equally disastrous bugs caused by intent - i.e. the programmer had no idea the code could fail.

At least some of these sorts of bugs (depending on how you define "fail") are captured even in Haskell today by things like Maybe. In fact, if Haskell was total (like Idris) you wouldn't even be able to write a function that searches a list for an element with this type:

    find :: (a -> Bool) -> [a] -> a
Because there's no way for you to magic an a out of nowhere in the empty list case (whereas laziness lets you use bottom). You could however write a buggy find over known non-empty lists because you could always return one of the known elements, types aren't a panacea.

Dependent types go further and let you guarantee things like "zip only compiles for two lists of exactly the same length" which can be helpful in some circumstances (specifically in Haskell we use things like zip [0..] too often to want to change the function literally called zip, but you could imagine another name, perhaps zipExact as in the Safe library).

I suspect you probably already know this and meant a different kind of failure, such as mathematically invalid (which you could probably encode via a sufficiently powerful type system), or a simple misunderstanding of the requirements (which you—as the misunderstander—obviously can't).

2 comments

> whereas laziness lets you use bottom

in principle, only if you can prove you don't use the result. there's nothing wrong with not calculating a value that doesn't exist.

I can't have much experience with dependent types, but I don't see how a compiler could ever tell if zip took two lists of the same length. Don't dependent types happen at run-time? It seems like you would have to solve the halting problem for them to work at compile time.
It takes a while to grok but for example constant expressions would have type “List Int 7” but a list value from I/O might have type “List Int n” where n is a natural number that also exists at runtime. Now zip would be “forall a, b, n. List a n -> List b n -> List (a, b) n”. Right away you could do a self-zip with your I/O list because the compiler knows that n == n but if you want to zip two different such lists it gets more interesting.

Now you have n, m, and xs : List Int n, ys : List Char m, and you want to do zip xs ys but m doesn’t “unify” with n so the type is wrong. So you have to do an equality check at runtime which gives you a proof of equality or inequality. In the equal case you have a value of type “n == m” and then you can use a language feature to rewrite the type of xs (or ys) according to this equality.

This was written on a phone kind of hastily but maybe it makes some sense.

I've been learning the dependently typed functional language Idris over the past few weeks. The type-safe zip over Vects (linked lists with a length in the type) is a canonical example of compile-time proof that they must be the same length:

    zip : Vect n a -> Vect n b -> Vect n (a, b)
This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

Don't dependent types happen at run-time?

Dependent types in Idris occur at compile time and are erased as part of compilation. At runtime, those Vects do not carry around any length information.

It seems like you would have to solve the halting problem for them to work at compile time.

Idris addresses this by having a totality checker. Every function in Idris is marked as either total (known to halt, barring any bugs in the totality checker) or partial (possibly not halting). Only total functions are allowed in types. For example, the function append:

    append : Vect m a -> Vect n a -> Vect (m + n) a
This type uses addition as a type level function. Since (+) for natural numbers (which m and n are) is proven to be total when Idris compiles the library where it's defined, Idris allows it to be used as a type level function.

So how does the totality checker work? Wouldn't that purport to solve the halting problem? No, the totality checker in Idris isn't magic, it's conservative about what it considers to be total. In order for it to declare a function total, it must identify at least one argument that progresses towards a base case at every step. It can do this even with mutually recursive functions.

In practice, if you write an interactive program in Idris, you can make every function total except for main, which presumably contains an infinite loop that checks for input forever.

> This type specifies that both Vects must be length n. If they are not equal in length, the code will not compile. In fact, this type is so narrowly specified that Idris can infer the entire implementation of the function from it.

It's more accurate to say that Idris can guess an implementation of the function, but it's not the only possible implementation with that type. There's nothing in the type to stop the function from reversing one or both of the lists before the zipping stage, for example. More generally, it could randomize the order of elements in one or both of the supplied lists.

That's true, though it has no reason to do those other things. In general it tries to do the least amount of work possible to accomplish the goal.

It's interactive, though, and it's extremely fun to use. It gives you the feeling that you're solving a puzzle, and like a good puzzle game, it can do the obvious bits for you.

Yes, but the "least amount of work", as far as Idris is concerned, has no necessary relationship to what the programmer intended when they defined the type. It guesses the simplest thing that fits, but that's far from a guarantee of correctness.

It is a serious shortcoming that even when you pay the complexity cost of dependent types, you still can't specify the intent in this case. I mean, the semantics of "zip" can't be specified with a type. You can define a "sorted list" or "ordered list" type, but you cannot enforce maintainance of the natural, arbitrary order of list elements. In this sense dependent types are both too complex to reason about and simultaneously not powerful enough to fully specify the function.

One solution to ensuring that the order is preserved (that I've actually used in production Haskell code where this was a real risk) is to use a heterogeneous list instead of a Vec.

  rzipWith :: (forall a. f a -> g a -> h a) -> Rec f as -> Rec g as -> Rec h as
With this type I don't think you'd be able to rearrange the elements, but you do have to pay the cost of using Const everywhere, as in you can specialize approximately my function as follows (handwavy, won't compile):

  zipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
  zipWith f as bs = recToVec (rzipWith f' as' bs') where
    as' = vecToRec (fmap Const as)
    bs' = vecToRec (fmap Const bs)
    f' (Const a) (Const b) = Const (f a b)

  vecToRec :: Vec n (f a) -> Rec f (Replicate n a)
  recToVec :: Rec (Const a) as -> Vec (Length as) a
Rec is defined in Vinyl: https://hackage.haskell.org/package/vinyl-0.11.0/docs/Data-V...
> has no necessary relationship to what the programmer intended when they defined the type

Right, for instance I wouldn't be surprised if "the least amount of work" here amounted to reversing both lists as you zip.