Hacker News
by kaba0 1159 days ago
I always wanted to ask someone whose “native tongue” is an untyped language: when you reason about code, how do you think of an object? Does it have a specific type in your head? Is that type nominal, or more like structural typing, where you just know that it has to have this and that method?

I started programming in untyped languages, but simply can’t remember back at all, and now I can’t really imagine dealing with objects in my mental model as not having some type.

Note: this is not a rebuttal for/against dynamic typing, I do think that types are really important at boundaries, but they may not be the silver bullet - contracts may be better at some things, for example. This may be an open question.

2 comments

I generally think about "types" in terms of capabilities more than shapes. When I write Ruby, I don’t generally think "this parameter must be an array". Instead, I think "this parameter must be an Enumerable". Or I think "incoming objects must behave like strings" (that is, they implement #to_str or are Strings)…although most of the time, I would really think "incoming objects must have useful string representations".

In Elixir, I do think about shapes more than capabilities (because Elixir is not OO), but with pattern matching, I can either specify "this must be a MyApp.Account struct" (which is just a fancy map) or I can specify "we will handle any map that has the keys X and Y, and Y must be a map itself".

I replicate this more formally when writing TypeScript, usually by building up type definitions and specifying those.
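A rough TypeScript sketch of the capability-versus-shape distinction described above. The names `HasLabel`, `label`, `HasXAndY`, and `yKeys` are hypothetical and exist only for illustration.

```typescript
// Capability: anything that can produce a string representation.
interface HasLabel {
  toString(): string;
}

// Accepts any value with that capability; the concrete type is irrelevant.
function label(value: HasLabel): string {
  return "value: " + value.toString();
}

// Shape: any object with keys x and y, where y is itself a map-like object.
interface HasXAndY {
  x: unknown;
  y: { [key: string]: unknown };
}

// Relies only on the declared keys; extra keys on the input are ignored.
function yKeys(input: HasXAndY): string[] {
  return Object.keys(input.y);
}
```

Numbers, arrays, and ad hoc objects all satisfy `HasLabel`, which is roughly the "must have a useful string representation" idea; `HasXAndY` is closer to the Elixir-style map pattern.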

I’ll first say that my favorite languages are StandardML (very strongly typed), Common Lisp (not very typed), and JS (even less typed).

Look at Erlang. It has bigints, floats, Booleans, bit-strings (sequences of bits, added because they are so common in telecom), strings (not technically a primitive data type), functions, atoms, lists, tuples, and maps.

None of these look the same or act the same. People dream about seeing `123 == myMap`, but it simply isn’t a common thing because it doesn’t make sense.

The common rebuttal becomes: but how do I know if property X is a string or number when I’m using it?

If that’s your question, you are already messing up. What you really want to know is what X actually represents. Otherwise, you’re just shooting in the dark, which is at least as dangerous as getting the type wrong and probably more so: a wrong type will become obvious quickly, while a mangled number or string may not be caught until much later, after serious damage has propagated throughout the data.

Let’s say you have something called `login`. Is it a number or string? If it’s a string, is it an ISO date, UTC date, or something else? Is it when they logged in or when their login expires? If it’s a number, it could be a Unix timestamp. It could also be a calculated value for how long the user has been logged in and could be days, hours, minutes, seconds, milliseconds, or something less common.

How do you know which thing is correct?

In a good codebase, you read the docstring comment on the data constructor that describes what it does. If it says “milliseconds since last login” vs “token expiration using ISO datetime format” do you have any question at all about whether it’s a number or string?
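As a small sketch of what such a documented data constructor might look like, here is one in TypeScript. The `Session` record, its field names, and `makeSession` are all hypothetical; the point is only that the doc comments say what each value represents, which settles the number-or-string question on their own.

```typescript
// Hypothetical record: each doc comment states what the field represents.
interface Session {
  /** Milliseconds elapsed since the user's last login. */
  msSinceLastLogin: number;
  /** Token expiration as an ISO 8601 datetime string. */
  tokenExpiresAt: string;
}

/**
 * Builds a Session.
 * @param lastLoginMs Unix timestamp of the last login, in milliseconds
 * @param nowMs       current Unix timestamp, in milliseconds
 * @param tokenTtlMs  token lifetime in milliseconds, counted from nowMs
 */
function makeSession(lastLoginMs: number, nowMs: number, tokenTtlMs: number): Session {
  return {
    msSinceLastLogin: nowMs - lastLoginMs,
    tokenExpiresAt: new Date(nowMs + tokenTtlMs).toISOString(),
  };
}
```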

If there isn’t a docstring, you’ll be digging through that code or playing around with the responses and will see the data type anyway.

The result is that you’re forced to better understand what you’re doing which isn’t a bad thing in my opinion. There may still be mistakes, but that leads to the next point.

Dynamic languages generally make it easy to dynamically check the types of incoming data and tend to be more flexible with mistyping (especially JS). I can’t count the number of major errors I’ve seen in common typed languages because they make introspection hard, so programmers don’t do it and their code crashes on malformed data or when an API suddenly changes.
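A minimal sketch of that kind of runtime check on incoming data, written in TypeScript. The `LoginPayload` shape and the function names are assumptions made up for the example, not any particular API.

```typescript
// Hypothetical payload shape expected from some external source.
interface LoginPayload {
  user: string;
  msSinceLastLogin: number;
}

// Type guard: inspects the value at runtime instead of trusting the sender.
function isLoginPayload(data: unknown): data is LoginPayload {
  if (typeof data !== "object" || data === null) {
    return false;
  }
  const record = data as { [key: string]: unknown };
  const user = record["user"];
  const ms = record["msSinceLastLogin"];
  return typeof user === "string" && typeof ms === "number" && isFinite(ms);
}

// Parses JSON text; returns null for malformed or unexpected data
// rather than throwing.
function parseLogin(text: string): LoginPayload | null {
  let data: unknown;
  try {
    data = JSON.parse(text);
  } catch {
    return null;
  }
  return isLoginPayload(data) ? data : null;
}
```

Whether returning `null` or raising an error is the better response to bad input depends on the caller; the sketch only shows that the check itself is cheap to write.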

It’s also worth talking about null exceptions. Many dynamic languages expect type weirdness and handle it well. This usually includes null. Most statically typed code out there has LOADS of null exceptions lurking about which only trigger in obscure cases during runtime. In this regard, you could argue that the worst type issues also happen in typed languages, but are more dangerous in those languages too.

Untyped languages also tend to use safe numbers everywhere. Infinity is mostly useless, but not usually dangerous. Bigints everywhere are slightly slower in some cases, but completely eliminate overflow errors. Most typed languages use risky numeric types, so they must also force users to think about those things.
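To illustrate the Infinity point with JS-style numbers, here is a small TypeScript sketch. It only shows how doubles behave next to a forced 32-bit integer; it does not cover the bigint behavior of languages such as Erlang or Common Lisp. The helper `requireFinite` is a hypothetical name.

```typescript
// Doubles saturate to Infinity on overflow rather than wrapping around.
const saturated: number = Number.MAX_VALUE * 2;

// Forcing a value into a signed 32-bit integer (as `| 0` does) wraps instead.
const wrapped: number = (2147483647 + 1) | 0;

// A guard that rejects Infinity and NaN before the value is used further.
function requireFinite(n: number): number {
  if (!isFinite(n)) {
    throw new RangeError("expected a finite number, got " + String(n));
  }
  return n;
}
```

Here `saturated` is `Infinity`, which is easy to detect, while `wrapped` is `-2147483648`, a plausible-looking value that is silently wrong.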

Finally, no common typed language offers good runtime introspection like Smalltalk, Common Lisp, Erlang, or even JS. I feel static types are just a crutch to make up for this deficiency.

That brings us to good static languages. TypeScript offers all the benefits of normal, unsound static typing combined with all the robustness of JS’s dynamic environment. The same can be said for Coalton and Common Lisp.

StandardML offers a language that feels like a dynamic language, but still offers static checks that are actually sound and completely eliminates null exceptions. Rust (an ML in spirit if not in syntax) does the same things in environments where garbage collection and other such amenities aren’t possible.
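The null-elimination idea can be approximated in TypeScript with a hand-rolled option type, loosely modeled on ML-style options. This is only a sketch: `Maybe`, `findUser`, and `greet` are hypothetical names, and TypeScript does not give the soundness guarantees that Standard ML or Rust do.

```typescript
// Absence is an explicit case of the type instead of a hidden null.
type Maybe<T> = { kind: "some"; value: T } | { kind: "none" };

// Looks up a user by name; "not found" is a normal return value.
function findUser(users: string[], name: string): Maybe<string> {
  for (const u of users) {
    if (u === name) {
      return { kind: "some", value: u };
    }
  }
  return { kind: "none" };
}

// The caller has to check `kind` before it can touch `value`.
function greet(found: Maybe<string>): string {
  if (found.kind === "some") {
    return "hello, " + found.value;
  }
  return "no such user";
}
```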

I like both kinds of languages, and good examples from each work around the problems of each approach to make them (in my experience) about equal in productivity for equally skilled and experienced developers.