by toast0 1161 days ago
I work on a lot of 'glue' issues, often with languages like Perl, PHP, and Erlang (and a bit of Javascript here and there). Specifying types all over the place in languages like C, C++, Java, and Rust feels like it gets in the way and limits more than it helps. (feelings more than data here, of course)

Sure, at boundaries between teams, you need to specify the data in some way. That could be a type, but for me, often the other team is using a different language than me, so it needs to be a language agnostic type, and it can't include unsigned numbers because Java can't cope, and it can't include large integers because Javascript can't cope, etc. Protobufs are popular, json is too.
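As a rough sketch of that cross-language constraint: a boundary check can reject integers that one side cannot represent exactly. The function names here are hypothetical, not from any library. The limits themselves are well known: JavaScript numbers are IEEE 754 doubles, exact for integers only up to 2^53 - 1, and Java's `long` is a signed 64-bit value.

```rust
/// Largest integer a JavaScript `Number` represents exactly (2^53 - 1).
pub const JS_MAX_SAFE_INTEGER: u64 = (1 << 53) - 1;

/// True if `n` survives a round trip through a JavaScript `Number`.
pub fn fits_js_number(n: u64) -> bool {
    n <= JS_MAX_SAFE_INTEGER
}

/// True if `n` fits in Java's signed 64-bit `long`.
pub fn fits_java_long(n: u64) -> bool {
    n <= i64::MAX as u64
}

/// True if `n` is safe to send to both kinds of consumer.
pub fn is_portable(n: u64) -> bool {
    fits_js_number(n) && fits_java_long(n)
}
```

In practice, anything that fails `is_portable` would have to cross the boundary as a string or be split into smaller fields.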

I have a lot of unpopular opinions though, and that's fine. It's just tiresome that everyone wants to come in and add types to things that don't need them. Also, I agree with dllthomas, most developers and teams are capable of creating dumpster fires in all sorts of environments, with all sorts of tooling. :)


You can absolutely create a dumpster fire in any language.

Putting the fire out in an untyped language is a Herculean effort.

In my experience it is even harder in a typed one because now you have to deal with the type system nightmare they built. So the compiler fights your refactoring.

You never actually get away from types; they are a core requirement of using any data beyond raw bytes. The guarantees that a strong type system provides mean you can be certain about certain things before your program even runs. If that's a problem for you, you're likely just leaving bugs on the table to be discovered at runtime.

I’d modify your statement to say "strong, static type system". Strong and weak typing are orthogonal to dynamic and static typing. JavaScript has weak dynamic typing; TypeScript has strong-ish static typing sitting on weak dynamic typing. Ruby and Erlang/Elixir have strong dynamic typing. Rust and Go have strong static typing (Go’s is weakened by interface{}, IMO, but it’s a valid choice).

With the way that Erlang and Elixir pattern matching can be used in function heads, I can have much the same feeling of certainty that people express from Haskell and Rust. (Erlang typespecs help here, but are not checked by the compiler itself, only by additional tools like dialyzer or gradualizer.)

I'll admit, I do not know Elixir very well. As long as it's checked at compile-time or checked across an entire application at startup/import/init, I think it serves the purpose well. The problem I want to avoid is discovering at (production) runtime that the shape of data doesn't match what my functions expected.

Nobody code without a type system. The distinction is build time vs runtime type checking. At build time you catch bugs that would appear later at runtime. Which is more costly to fix later.
> Which is more costly to fix later.

This assumption is changed, IMHO, by Erlang. Hot loading makes the cost of small changes very low. So the question becomes: do you pay the definite cost of build time type checking (which usually includes coding time type annotation), or do you accept the possible future cost of making small fixes?

Of course, if you work in an organization where even a small fix requires months to release, then do all the things you can to prevent making small mistakes.

The cost at runtime also includes loss of data, inferior user experience, direct financial loss or even loss of human life for some systems.

It really depends on the domain but it definitely is more than pushing an update.

The cost is also in debugging. It’s much harder to figure out a problem after the fact because you’ve forgotten how the code in question works. If you catch a mistake while you’re coding something up, you can fix it immediately and not give it a second thought. But if you need to track a bug down weeks or months after writing the code, it can take a lot of work to figure out what the code does (and why), and why it is behaving incorrectly.

I’ve lost weeks to a memory leak once in javascript that was a 2 line change to fix. If I had realised the problem when I wrote the code, I would have saved myself a lot of trouble.

Essentially every statically typed language contains runtime errors that can be costly too.

Erlang was designed to achieve 9 nines of uptime. It achieved this without static typing across very large applications. The fact that this is a regular occurrence with Erlang disproves the idea that a lack of types is fundamentally unsafe.

Static types are most useful with monolithic application design. They counter your massive ball of code growing too complex and the complete lack of introspection at runtime. They attempt to handle the problem that any error crashes your entire system.

Erlang uses a different approach.

First, it uses safe datatypes. You aren’t going to crash because you chose a 32-bit integer and rolled it over (something a type system won’t actually help with). Rollover is maximally likely to corrupt user data without even raising any flags. Integer rollover has actually killed people (famously in the Therac-25).

Second, it uses all immutable data, so data sharing is safe. Also something types don’t help with. This is also a maximal risk of data corruption. Incorrectly mutating data has also killed people (also in Therac-25).

Third, it is functional. This reduces passing around giant balls of mud. Those balls are unusable disasters unless you add some types. Functional programming with immutable records means that a program can’t accidentally change the types of incoming data. Because the pattern encourages separating data and mutable state, the most common typing accidents are simply avoided.

Erlang is designed with concurrency first. This helps to keep those balls of code even smaller, further reducing the chances of typing errors. And of course, combined with immutable data, we eliminate another set of errors typing does nothing about and that have caused massive damage and probably deaths (a deadlock causing a NYC blackout leaps to mind).

Finally (I probably missed some points), Erlang is designed expecting crashes to happen. Few runtimes are capable of anything close to the elegant crash handling of BEAM. Instead of fearing crashes, you understand they’re inevitable and embrace them. This means that you are prepared not just for an occasional type error, but will also elegantly handle the null exceptions that plague most of the most common statically typed languages.
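For readers who have not run into the integer rollover mentioned in the first point, here is a minimal sketch in Rust of a one-byte counter. The function names are hypothetical and nothing here is taken from any real system's code. It only shows that a fixed-width type silently wraps unless the code asks for overflow to be reported.

```rust
/// Increments a one-byte counter the way plain wrapping arithmetic does:
/// 255 + 1 silently becomes 0.
pub fn bump_wrapping(counter: u8) -> u8 {
    counter.wrapping_add(1)
}

/// Increments the same counter but reports overflow as `None`
/// instead of wrapping.
pub fn bump_checked(counter: u8) -> Option<u8> {
    counter.checked_add(1)
}
```

The static type (`u8`) is identical in both functions; the difference is whether the overflow case is made visible to the caller. Erlang sidesteps the question by using arbitrary-precision integers.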

> Nobody code without a type system.

A small number do. Assembly languages are generally untyped. The Forth language is also untyped.

> At build time you catch bugs that would appear later at runtime. Which is more costly to fix later.

Generally agree. Programmers proficient in Haskell or Ada tend to consider types to be integral to their development process. The real question is whether this is a good tradeoff against development velocity for your given project. Neither language markets itself for rapid application development; instead, they tend to emphasise that the language aids with correctness and the ability to reason about code's behaviour.

Well no since unit tests run faster than build time.

Conway's Law means you have to fix the organization before you can fix the software. That's the real Herculean effort.

Test coverage is, in my experience, a much more important factor than typing. A codebase with great testing is much easier to aggressively refactor/change whether typed or not.

That said, a dumpster fire usually has few or no tests, so maybe we're arguing non-existent hypotheticals :|

I’ve gone back and forth over the years on whether tests are a good enough replacement for types. A few thoughts:

- Types and tests find different bugs. I’ve found new bugs by converting a project from javascript to typescript. The project in question had a 2:1 test:code ratio but as soon as the typescript compiler could read it, it spotted a couple obvious errors.

- Large test suites often make refactoring harder, not easier. If you have a clear, fixed API boundary and your tests test that boundary, then testing helps. But most refactoring also involves changing up those APIs as well - since bad APIs are often the reason you want to refactor in the first place. When you do that, you have to also rewrite all your tests. Good type systems help refactoring. Writing rust in Intellij, I can globally rename functions and types in my project, promote tuples to structs, reorder function arguments, and all sorts of other handy refactorings. My tests get updated too. And the compiler tells me immediately if I missed anything, without needing to rerun my tests.

- Reading the types is my favourite way to get up to speed on a project, or get back up to speed on something I wrote myself that I’ve forgotten. "Show me your flowchart (code) and conceal your tables (type definitions), and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)

- I find I need far fewer tests to write reliable software when I’m using a language with a good type system. Most rust code I write works correctly once it compiles. Javascript is easier to write than typescript, but it’s harder to test and debug.

So with all that, I’m personally in camp type these days for most software. I think it’s usually the right choice.

> If you have a clear, fixed API boundary and your tests test that boundary, then testing helps.

A clear, fixed API boundary is exactly what Phoenix tries to encourage with contexts. Unfortunately, a lot of developers find them hard to understand. They're simple if you read up on DDD but again, a whole host of developers won't, or don't, do that either. LiveView in particular has a really great testing library [0] where you can write what are essentially end-to-end tests that never touch even a headless browser. Since I'm always writing LiveViews, I pretty much only write LiveView tests and context tests, which gives me large coverage (also some unit tests for the odd utility function). Otherwise, it's really important when writing non-typed functions to make it really obvious what is coming in and out, which is arguably a nice forcing factor.

The number one thing people bring up when shilling types is large codebases (it's been brought up in these comments). My opinion there, which I have found is quite unpopular, is that pair programming should be far more prevalent than it is. I think the whole notion of "just stick a junior on that" is broken and I don't understand how types make that situation _that_ much better.

All said and done, I'm not actually anti-type. I mostly just find them to be incredibly noisy compared to a well-written function. I really like Ocaml where it's statically typed without needing to actually specify them.

> I mostly just find them to be incredibly noisy compared to a well-written function. I really like Ocaml where it's statically typed without needing to actually specify them.

Yeah; I haven't worked with ocaml but I've done some haskell (where you think about types so much more). Personally I don't mind rust / typescript's approach of needing types at the function boundary (function input & output types must be specified) while doing inference wherever possible inside each method. As an example, here's a very complex function in a project I'm working on chosen vaguely randomly[1]. The function diffs a run-length encoded DAG using a breadth-first search.

Visually scanning for types, there's a couple at the top of the function - both in the function definition and the BinaryHeap:

    let mut queue: BinaryHeap<(LV, DiffFlag)> = BinaryHeap::new();

But I think that's about it. Maybe there are more manually specified types in "normal" rust because most functions are smaller than that. But it doesn't feel so bad. In this case I could probably even remove the explicit type annotation for that queue definition if I wanted to, but leaving it in makes the compiler's errors better.

[1] https://github.com/josephg/diamond-types/blob/66025b99dbe390...
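To make the "types at the boundary, inference inside" style concrete, here is a small self-contained sketch. It is not code from the linked project; `largest_k` is a made-up function that also happens to use a `BinaryHeap`. The signature is the only place types are written out.

```rust
use std::collections::BinaryHeap;

/// Returns up to `k` of the largest values, in descending order.
/// Only the signature is annotated; every local type is inferred.
pub fn largest_k(values: &[u32], k: usize) -> Vec<u32> {
    // Element type is inferred as u32 from the `push` below.
    let mut heap = BinaryHeap::new();
    for &v in values {
        heap.push(v);
    }

    // Inferred as Vec<u32> from the function's return type.
    let mut out = Vec::new();
    while out.len() < k {
        match heap.pop() {
            Some(v) => out.push(v), // max-heap: largest remaining value
            None => break,          // fewer than k values available
        }
    }
    out
}
```

Calling `largest_k(&[3, 9, 1, 7], 2)` gives the two largest values, largest first. Adding `: BinaryHeap<u32>` to the `heap` binding would change nothing about the program, only how early the compiler can pin down the type when reporting errors.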

> I've done some haskell (where you think about types so much more)

You definitely still think about types in Ocaml, you just don't need to annotate due to the language design. A big part of what makes it possible is that there are no overloaded operators, eg, you can't add an int and a float without casting as the mathematical operators are different: `1 + 1` v. `1.0 +. 1.0`. While I've dabbled in both, I'm no expert in either Ocaml or Haskell, though.
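Rust makes a loosely related choice, sketched below with a hypothetical helper. Unlike OCaml it does overload `+` through traits, but it never converts between numeric types implicitly, so mixing an integer and a float needs an explicit conversion.

```rust
/// Adds an integer to a float. `a + b` alone would not compile,
/// because Rust has no implicit int-to-float conversion.
pub fn add_mixed(a: i32, b: f64) -> f64 {
    // i32 -> f64 is lossless, so `From` is available.
    f64::from(a) + b
}
```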

Really for me it's just that I've never felt the pain as I haven't worked in a big enough project, I guess. There is something that just kind of annoys me about (pseudocode): `(name : string) :: string -> "Hi, #{name}"` because, like, no shit it takes a string and returns a string! It's a death by a thousand cuts thing where I don't want to read that stuff and the compiler doesn't need to be explicitly told that in order to do static analysis.

Anyway, again it's really not the end of the world as I'm not anti-type. I just don't yearn for them in Elixir or anything. If it had a solid typing system I even might use it, but I don't yearn for them or anything.

You have some really interesting projects on your github, though! I mostly build glorified CRUD web apps! I do always get a sense that a lot of the type-talk is centred around organizational disorganization.

Have you tried doing it in elixir? It's not that bad.

I think that glue issues are especially well captured by languages with a good static type system, more so than by languages without static types, because when it comes to glueing there are many things, especially errors, that can easily be forgotten without the help of a compiler.

However, a language with an insufficient type-system indeed makes things harder than they are without it. I would count all the languages you listed into this category.

As another poster mentioned, typescript is fairly expressive. There are other (production) languages too, such as Scala or maybe D. And there are lots of academic/very-niche languages.

> It's just tiresome that everyone wants to come in and add types to things that don't need them

Well, types are there, whether you like them or not. There's a reason that you have e.g. typeof in javascript, gettype in PHP. The question is rather whether you explicitly annotate them or not. But yeah, sometimes it's not helpful to annotate types, especially if the language is incapable of expressing the correct type anyway, which is true for most programming languages.

IMO, TypeScript strikes a great balance here. I loved the way I could cast something to `any` when hacking something out, then add proper type annotations once it's ready to be productized. It also did a good job with type inference.

Disclaimer: I work at Microsoft, but not in the Developer Division

I always wanted to ask someone whose “native tongue” is untyped languages — when you reason about code, what do you think of an object, is it of specific type? Nominal, or more like structural typing that you know that it has to have this and that method?

I started programming in untyped languages, but simply can’t remember back at all, and now I can’t really imagine dealing with objects in my mental model as not having some type.

Note: this is not a rebuttal for/against dynamic typing, I do think that types are really important at boundaries, but they may not be the silver bullet - contracts may be better at some things, for example. This may be an open question.

I generally think about "types" in terms of capabilities more than shapes. When I write Ruby, I don’t generally think "this parameter must be an array". Instead, I think "this parameter must be an Enumerable". Or I think "incoming objects must behave like strings" (that is, they implement #to_str or are Strings)…although most of the time, I would really think "incoming objects must have useful string representations".

In Elixir, I do think about shapes more than capabilities (because Elixir is not OO), but with pattern matching, I can either specify "this must be a MyApp.Account struct" (which is just a fancy map) or I can specify "we will handle any map that has the keys X and Y, and Y must be a map itself".

I replicate this more formally when writing TypeScript, usually by building up type definitions and specifying those.

I’ll first say that my favorite languages are StandardML (very strongly typed), Common Lisp (not very typed), and JS (even less typed).

Look at Erlang. It has bigints, floats, Booleans, bit-strings (sequences of bits, added because they are so common in telecom), strings (not technically a primitive data type), functions, atoms, lists, tuples, and maps.

None of these look the same or act the same. People dream about seeing `123 == myMap`, but it simply isn’t a common thing because it doesn’t make sense.

The common rebuttal becomes: but how do I know if property X is a string or number when I’m using it?

If that’s your question, you are already messing up. What you really want to know is what X actually represents. Otherwise, you’re just shooting in the dark which is at least as dangerous as getting the type wrong and probably more so because a wrong type will become obvious quickly while mangling that number or string may not be caught until a much later time after serious damage has propagated throughout the data.

Let’s say you have something called `login`. Is it a number or string? If it’s a string, is it an ISO date, UTC date, or something else? Is it when they logged in or when their login expires? If it’s a number, it could be a Unix timestamp. It could also be a calculated value for how long the user has been logged in and could be days, hours, minutes, seconds, milliseconds, or something less common.

How do you know which thing is correct?

In a good codebase, you read the docstring comment on the data constructor that describes what it does. If it says “milliseconds since last login” vs “token expiration using ISO datetime format” do you have any question at all about whether it’s a number or string?
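The `login` example can also be sketched in a typed language, which shows why "number or string" is the wrong question: both hypothetical types below wrap the same `u64`, so the primitive type says nothing about what the value represents. All names and units here are made up for illustration.

```rust
/// Hypothetical: milliseconds elapsed since the user last logged in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MillisSinceLogin(pub u64);

/// Hypothetical: Unix time, in seconds, at which the login token expires.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExpiresAtUnixSecs(pub u64);

/// True once the user has been logged in for at least `limit_minutes`.
pub fn is_idle(elapsed: MillisSinceLogin, limit_minutes: u64) -> bool {
    // 60_000 milliseconds per minute.
    elapsed.0 >= limit_minutes * 60_000
}
```

Passing an `ExpiresAtUnixSecs` to `is_idle` is a compile error even though it is "just a number". The doc comments play the same role as the docstring on the data constructor described above; the wrapper type only makes the compiler check that they are respected.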

If there isn’t a docstring, you’ll be digging through that code or playing around with the responses and will see the data type anyway.

The result is that you’re forced to better understand what you’re doing which isn’t a bad thing in my opinion. There may still be mistakes, but that leads to the next point.

Dynamic languages generally make it easy to dynamically check the types of incoming data and tend to be more flexible with mistyping (especially JS). I can’t count the number of major errors from common typed languages because they make introspection hard, so programmers don’t do it and crash on malformed data or when an API suddenly changes.

It’s also worth talking about null exceptions. Many dynamic languages expect type weirdness and handle it well. This usually includes null. Most statically typed code out there has LOADS of null exceptions lurking about which only trigger in obscure cases during runtime. In this regard, you could argue that the worst type issues also happen in typed languages, but are more dangerous in those languages too.

Untyped languages also tend to use safe numbers everywhere. Infinity is mostly useless, but not usually dangerous. Bigints everywhere are slightly slower in some cases, but completely eliminate overflow errors. Most typed languages use risky numeric types, so they must also force users to think about those things.

Finally, no common typed language offers good runtime introspection like smalltalk, Common Lisp, Erlang, or even JS. I feel static types are just a crutch to make up for this deficiency.

That brings us to good static languages. Typescript offers all the benefits of normal, unsound static typing combined with all the robustness of JS’s dynamic environment. The same can be said for Coalton and Common Lisp.

StandardML offers a language that feels like a dynamic language, but still offers static checks that are actually sound and completely eliminates null exceptions. Rust (an ML in spirit if not in syntax) does the same things in environments where garbage collection and other such amenities aren’t possible.
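A small sketch of what "eliminates null exceptions" looks like in Rust: a lookup that may find nothing returns an `Option`, and the caller has to handle the `None` case before it can touch the value. The function name and the placeholder string are made up for the example.

```rust
use std::collections::HashMap;

/// Returns the upper-cased name for `id`, or a placeholder if there is
/// no such user. `HashMap::get` returns an `Option`, so there is no way
/// to reach `to_uppercase` without first handling the missing case.
pub fn display_name(users: &HashMap<u32, String>, id: u32) -> String {
    match users.get(&id) {
        Some(name) => name.to_uppercase(),
        None => String::from("<unknown>"),
    }
}
```

Dropping the `None` arm makes the `match` non-exhaustive, which the compiler rejects, so the forgotten case is caught at build time rather than in an obscure runtime path.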

I like both kinds of languages, and good examples from each work around the problems of each approach to make them (in my experience) about equal in productivity for equally skilled and experienced developers.