Hacker News new | ask | show | jobs
by dwohnitmok 1722 days ago
I find the often ML-inspired syntax too high of a bar for most programmers to clear when being introduced to a language.

I think a syntax more similar to something like what is introduced here: https://shuangrimu.com/posts/language-agnostic-intro-to-depe... more accessible for a lot of programmers.

I separately think that proofs are actually a bit overblown when it comes to dependent types and that dependent types are most useful for specification, but often times could benefit from "fake" implementations.

5 comments

I agree that ML-style function signature is probably the best and most readable. `name: type` is a lot better than `type name`. I would argue that `match x with` would be better than `case x of`, to highlight that pattern matching is different than your usual switch. Calling out explicitely the contructors with `contructors` is a good idea, I think it can make things easier for new people.

Edit: I wonder if it would help to also have a record type. Enum allows you to have full ADTs, but named tuples are often more useful than regular tuples.

> `name: type` is a lot better than `type name`

This is highly subjective. I find the latter to be far more readable. Reading a variable name knowing it's type makes more sense.

As long as types can only be a single identifier both are fine. But that is not the case very often. Look at C and it's derivates. It's just unreadable in those cases compared to separating type and value with a colon.
The order is one thing, but for me, it's the addition of a colon to separate the two properly that's important. That's probably also highly subjective as some people find syntax highlighting useless and I think it's great, but in general I like having a bit more tokens to know where I am.
Obviously highly subjective, yet heavy of conventions. Usually the subject comes before the qualification. "counter is an integer", "let counter be an integer" instead of "from the integer take counter".
Programming languages aren't natural languages though, and I don't think they should be forced to look like them.

One thing for the "type before name" camp is that you don't have to use a keyword like "let" or "function" when declaring a function, but on the other hand you're going to introduce a keyword anyways if you add type inference.

"We have an integer called x, ..."
You say "Nixon is a President" but also "President Nixon".
I personally feel like I've just been smacked in the face with a dump truck full of syntax that I don't even barely understand, and I've used FP before. It probably actually is simple but for me, I need a bit more explanation
"smacked in the face", or "presented with something that takes 5 minutes to get a handle on"?

Think about how long it takes to learn a new API or data model. A few minutes to learn a syntax isn't a big deal.

"smacked in the face" might be a bit extreme, but I can sympathise with GP's point. I read the first line of the first example:

    A :: ~ * ~> * => {(list A) :: |new |empty }.
And got stuck at the first hurdle. I started parsing it in my mind and read "A such that tilde star arrow tilde double arrow". I stopped there, and thought "I'm lost". I started trying to interpret it: * might mean wildcard; single arrow could be curried functions (per Haskell). But I don't know what tilde means, nor the double arrow.

Now, of course, I can go and look it up: it'll all be defined somewhere. But that first experience - that very first "I've never seen this before, it sounds interesting" hit a road block right there.

Maybe not smacked in the face, but enough friction for me to click the back button.

You can take issue with the order of the presentation but the definition is in the following section. There’s nothing to look up, it’s just an arbitrary choice by the author.
Yep, these codifications are just examples of what you can do, as Vmladenov said, try reading the examples above.
The syntax you presented is really very accessible. Of course, it makes things a little more verbose, but I think you are right, the path for bringing more attention to dependent types is probably making more accessible also. Btw, great article I will have a more detailed look after.
That syntax might be unacceptably verbose with too many usually implicit arguments. However, I also think that implicit arguments primarily are needed when working with heavily-dependently-typed data structures, which I think is a bad way of working with dependent types (e.g. I don't like length indexed vectors).

Instead if you primarily work with non-dependent data structures and then pass in dependently-typed invariants as a separate argument, you can avoid a lot of the complexity associated with dependent types. This allows your dependently typed arguments to not have a runtime representation, e.g. they could be given zero multiplicity in Idris. This in turn allows you to easily substitute a property test or an external solver (or just a human "trust me" override) instead of being forced to always prove everything since you no longer require a true implementation of a dependent type.

This in turn simultaneously severely reduces the need for elaborate implicit arguments and makes verbosity much easier to deal with.

This sounds interesting. Can you give a concrete example?

This seems to really fall into place with my feeling that heavily dependently typed data types Ans programs using them are hard to refactor as the invariants that you break in the refactor cascade through your entire program.

Separating them probably makes life easier

If you're familiar with refinement types, what he's mentioning is using dependent types to achieve refinement types.

Let's take the typical example of "lists of size n". It's either a special inductive type vec A n, or a list l with a proof that length l > 0. The signatures of a head function (using Coq syntax, where this approach is most common) would be:

  head_safe1 {A n} : vec A (S n) -> A
  head_safe2 {A} : forall (l : list A), length l > 0 -> A.
In both cases you match on the list and prove the nil case to be impossible.

One benefit is that one can easily stack predicates. Need a list of length n that is also sorted? Just add an argument of type Sorted l.

Programs that are 'truly' dependently typed have their benefits, but they definitely can feel more rigid.

> Instead if you primarily work with non-dependent data structures and then pass in dependently-typed invariants as a separate argument, you can avoid a lot of the complexity associated with dependent types. This allows your dependently typed arguments to not have a runtime representation, e.g. they could be given zero multiplicity in Idris. This in turn allows you to easily substitute a property test or an external solver (or just a human "trust me" override) instead of being forced to always prove everything since you no longer require a true implementation of a dependent type.

I'll provide a counterpoint since i'm personally more of an "integrated" dependentist! Just to make things more clear, what you propose is to segregate computational data from propositional data (proofs) and i prefer "integrating" the two, writing "proofs with computational content" or "correct-by-construction code" depending on the point of view. Agreed, when separating you get (easier) solvers and escape hatches. But afaik a big reason for this separated style in Coq is that their support of dependent pattern-matching is bad (luckily now there is coq-equations).

If you want to write that "map" is size-preserving, in separated style you'll have the function `map : (X -> Y) -> list X -> list Y` and `map-len : (f : X -> Y) (xs : list X) -> len xs = len (map f xs)`. First the verbosity increases with the complexity of your integrated datastructure. Second in integrated-style, you would have gotten this lemma for free since `map : vec X n -> vec X n` on vectors admits the same implementation as on lists [1]. In my experience, in separated style, you end up duplicating a lot of work. In language theory, subject-reduction (reduction preserves typing) is for free if you're working with intrinsically-typed terms. Eg stuff in my last coq work [2], where manipulating purely computational DeBruijn indices would have been error-prone whereas intrinsically typing terms enables to just auto-complete the single correct implementation.

Going a bit further, both approaches are equivalent: one can prove that `vec X n ≅ (xs : list X) × (len xs ≡ n)`. There is ongoing work on first-class datastructure declaration (ability to construct and manipulate data declarations from code) to actually declare such refinements and derive all the tooling (like [1]) automatically (search for "ornaments", a couple papers there by McBride and Dagand). This would enable to mix-and-match both presentations. There is a fun chain where vectors are a refinement of lists and lists are a refinement of (unary encoded) naturals, and vectors are actually lists indexed by the natural to with they erase (the so-called "reornament" of ornament list/nat, getting additional results generically). Very rich theory, very related to stuff like mathcomp's hierarchy-builder.

[1] With use of `erase : vec X n -> list X` (should get compiled to noop) and `erase-coherent : (xs : vec X n) -> len (erase xs) = n`.

[2] https://sbi.re/~peio/LCD.html and https://sbi.re/~peio/OGSD.html (code at https://git.sr.ht/~lapinot/ogs-coq)

Thanks for this fantastic link. I will share it with my students at the end of my intro to compilers class.
How is ML syntax any different from Typescript?
In nearly every aspect other than the "name: type" part. I think they were talking about the whole thing. And indeed, the language in the article they posted is not so far off Typescript syntax-wise.