Hacker News new | ask | show | jobs
by armchairhacker 1623 days ago
Most of the less abstract parts of this research (generic and polymorphic types w/ variance, ADT structs/unions, even type functions) are in most modern languages (Scala, TypeScript, C++20, Kotlin, Swift, Go). Rust even has affine types

The more abstract parts like dependent types are really complicated and even unintuitive to use. See: issues with Rust’s borrow checker, or Haskell being so confusing.

Earlier languages like C and Java are mostly legacy code, they use libraries in legacy code, or they’re for developers familiar with them.

Untyped scripting languages are fine for scripts. Honestly idk why developers write big libraries in untyped languages like JavaScript and Python.

The main case where advanced formal methods are particularly useful is proving program correctness. And AFAIK this stuff is used by industry, although you don’t hear about it as much. The thing is, most programs either don’t “need” to be proved, or they’re way too big.

3 comments

> The more abstract parts like dependent types are really complicated and even unintuitive to use.

I disagree with this: dependent types are way easier than lots of the convoluted schemes that non-dependent languages have come up with.

As a simple example, dependently-typed languages don't need parametric polymorphism or generics: we can achieve the same thing by passing types as arguments; e.g.

    map: (inType: Type) -> (outType: Type) -> (f: inType -> outType) -> (l: List inType) -> List outType
    map inType outType f l = match l with
      Nil  inType      -> Nil  outType
      Cons inType x xs -> Cons outType (f x) (map inType outType f xs)
When I program without dependent types, I regularly find myself getting "stuck" inadvertently; knowing that (a) there's no way to make my current approach work in this language, (b) that it would be trivial to make it work if I could pass around types as values, (c) that I need to throw away what I've done and choose a different solution, and (d) the alternative solution I'll end up with will be less correct and less direct than my original approach (e.g. allowing more invalid states)
Is there work on cleanly subsetting dependently typed languages to ones where well-typed-ness can be automatically proven? That's not generally the case, right? I think those concerns are basically why people come up with less-general typing mechanisms.
The two main hurdles with dependent types are:

- Type-checking is undecidable. I've not found that to be much of a problem, since it can be given a timeout (or hit Ctrl-C).

- Soundness requires functions to be total (either terminating or co-terminating). That's generally a good idea anyway, but the approaches for doing this aren't particularly great; e.g. structural recursion is simple to implement and understand, but is very restrictive.

Idris makes the pragmatic choice of allowing the totality-checker to be turned off on a per-function basis; which is certainly no worse than using a language with no totality checking.

Another approach I've enjoyed using is Mtac in Coq: general recursion is an effect, which gets wrapped in a type (similar to Haskell's IO); there is an unwrapping function (like unsafePerformIO) but it gets executed at compile-time, so non-total functions create an infinite loop at compile time (rather than at runtime).

> Type-checking is undecidable.

This is only true if you allow non-total functions in your types, which most dependently-typed languages don't by default (i.e. the condition implied by your soundness condition).

By default in e.g. both Idris and Agda type-checking is decidable.

Dependent types aren't terribly complicated intrinsically, see e.g. https://shuangrimu.com/posts/language-agnostic-intro-to-depe... for one explanation of dependent pairs.

Nor do I think the value of dependent types comes from proofs, which I agree are still too cumbersome to really use at a large scale for most codebases.

My hypothesis for the sweet spot for dependent types is to express invariants, and then use other mechanisms such as property tests to check those invariants, rather than actual proofs (except for trivial proofs). This does mean that I favor a very proof irrelevant style of dependently typed programming (i.e. in the language of Agda, way more `prop` than `set` or in Idris lots of zero multiplicities for dependently typed arguments) which is quite different from the way a lot of current dependently-typed code is written.

If I understood the article correctly the four type of functions are present in Julia:

  julia> factorial(3) # term -> term
  6
  julia> typeof(3)  # term -> type
  Int64
  julia> one(Float32) # type -> term
  1.0f0
  julia> Vector{Float32} # type -> type
  Vector{Float32} (alias for Array{Float32, 1})
And actually more mixed:

  julia> using StaticArrays
  julia> SVector{3, Float32}(1, 2, 3)
  3-element SVector{3, Float32} with indices SOneTo(3):
   1.0
   2.0
   3.0
  julia> convert(Float32, 1)
  1.0f0
Julia is of course dynamic, so maybe it's cheating? I'm writing this comment because all the greek in that cube, and actually using a cube, makes this look like very abstract stuff but I consider Julia a very practical language.

> Untyped scripting languages are fine for scripts. Honestly idk why developers write big libraries in untyped languages like JavaScript and Python.

And yet it works

I am no expert on the topic but pretty much every language has an instanceof-like pattern — that doesn’t really fulfill the generic term-to-type capability in my opinion. That would be being able to create a new type at runtime from a term based on a user-defined function — which yet again may have some constraints like only compile-time known constants can be used, etc.

I don’t know the peculiarities of julia to comment on this specific language though.

Also not an expert, or even educated on it.

In the case of Julia for example static arrays can be created, at runtime or not, with a fixed length that is part of the type definition, to give a less trivial example than "is an instance of". It's actually a very practical example as static arrays are of great importance for performance on certain applications.

I think most people would dismiss dynamic language examples, and that's the reason I comment Julia is dynamic in my comment, because most people interested in types are interested in using them at compile time to prove, or strengthen at least, correctness. In the case of dynamic languages types feel like just another piece of metadata and if you have "eval" available you basically can do anything. Julia actually does not use types to catch bugs but to structure programs and to improve performance when possible (to great extent).

Again, only superficially knowing julia, in what way is this static array a type over being a value as it is in java, with having a field with the length (the type being the x[] only, without length)?
It's a good question. Intrinsically, there is no difference: you need to consider how both pieces of data are used. Consider these two "equivalent" 1D arrays:

  x = Vector{Float64}(1, 2, 3)
  y = SVector{3, Float64}(1, 2, 3)
Both store the same data, and as a matter of fact you probably can pass them as arguments to most functions that accept arrays (if the functions are well designed). The difference however is that `y` has a more restricted interface: you cannot resize `y`. This restriction allows the compiler to optimize better the code both in time and space: dynamic arrays for example may allocate more memory than strictly necessary to speed up resizing and may do more bounds checks. In an ideal world should also catch bugs like detecting that `y[4]` is an error. So, to summarize, when you use `y` instead of `x` you are telling the computer that `length(y) = 3` is an invariant of your code and that it's safe to make that assumption. Having the number "3" stored as a type has an special meaning.

Off tangent, but these reminds me the first time I read the phrase "everything in Python is an object". That got me puzzled because in my mind if "everything is an object" then the phrase has zero information content. You cannot understand what is an object in isolation, you need to consider how the interpreter uses them. Or the first time you try understand what is a vector in mathematics. You cannot understand a vector in isolation, you need to consider the vector space it lives in, what operations are allowed.

Without any authority, I really don’t see how is it any different from any language’s arrays vs lists. I feel like it is syntax only for the age-old “let’s store the size of the array next to the data” vs a more complicated implementation like std::vector or Java’s ArrayList which may copy the content into a new, larger array behind the scenes.

So it is probably meaningless to talk about it in case of dynamically typed languages, even though type enforcement may not be needed (as it is impossible in the generic case for dependent types)

To turn the question around, what _would_ make it part of the type in your opinion, what criterion do you use to distinguish that from "having a field with the length (the type being the x[] only, without length)"?