by munificent 632 days ago
I have such mixed feelings about dynamically typed languages. I've designed a whole pile of hobby programming languages, and dynamically typed languages are at least an order of magnitude simpler to design and for users to learn and start using.

At the same time, they inevitably seem to lead to user stories like this where a user really does know exactly what types they're working with and wants the language to know that too (for performance or correctness), and they end up jumping through all sorts of insane hoops to get the optimizer to do exactly what they want.


I think Common Lisp got it exactly right. Strongly typed dynamic language where you can optionally specify types for the extra performance/correctness if need be (especially with SBCL).

Honestly, I think weak typing is more of an issue than dynamic typing and people cry for static types when they suffer mostly from the former.
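
To make the strong-versus-weak distinction concrete, here is a minimal Python sketch. Python is dynamically but strongly typed, so adding an integer to a string raises an error at runtime instead of coercing one operand. The names `add_one` and `try_add_one` are made up for illustration.

```python
def add_one(value):
    """Add 1 to value; there are no annotations, so any object is accepted."""
    return value + 1


def try_add_one(value):
    """Return the result, or the name of the exception that was raised."""
    try:
        return add_one(value)
    except TypeError as exc:
        return type(exc).__name__


print(try_add_one(41))    # 42
print(try_add_one("41"))  # TypeError: the str is rejected, not coerced
```

A weakly typed language would typically coerce one operand and keep going, which is where the surprising results come from.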

Dynamic typing is great because it allows you to have extremely complex types for basically free. It allows for insane expressiveness. It also makes prototyping much easier and does not force you into over-specifying your types early on. In a dynamic language, most of your types are by default the most general type that would work, while static types force you to use very specific types (especially when structural typing is lacking).

If you want to allow just half the expressiveness of dynamic languages in your static language you will quickly find huge complexity with dependent types, long compile time, cryptic error messages and whatnot.

Generally, I think gradual typing is rising in popularity for good reason. It allows for quick prototyping but also to drill down on your types when you want to. Best of both worlds.
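
As a rough illustration of "drilling down" later, here is a Python sketch using optional type hints. The function names are hypothetical. Python does not enforce annotations at runtime; the assumption is that an external checker such as mypy is run over the annotated version.

```python
def parse_port_quick(raw):
    """Prototype version: no annotations, so a type checker treats it as dynamic."""
    return int(raw)


def parse_port(raw: str) -> int:
    """Tightened version: annotated, with the valid range made explicit."""
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


print(parse_port_quick("8080"))  # 8080
print(parse_port("8080"))        # 8080
```

Both versions can live in the same codebase, so only the parts that have settled down need the extra precision.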

I find much the same to be true. I'm a big fan of Racket's define/contract and Clojure's Malli/guardrails. You get one of the biggest benefits of static types (code that self-documents the expected shape of data) while enjoying all the benefits of a dynamic language (like creating types at runtime, and REPL-driven development).
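
For readers who have not used contracts, here is a minimal Python sketch of the idea. It is not the API of define/contract, Malli, or guardrails; the `contract` decorator is a hypothetical helper that checks positional argument types and the return type at call time.

```python
import functools


def contract(*arg_types, returns):
    """Hypothetical decorator: check positional argument and return types at call time."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            for arg, expected in zip(args, arg_types):
                if not isinstance(arg, expected):
                    raise TypeError(
                        f"{func.__name__}: expected {expected.__name__}, "
                        f"got {type(arg).__name__}"
                    )
            result = func(*args)
            if not isinstance(result, returns):
                raise TypeError(
                    f"{func.__name__}: expected to return {returns.__name__}, "
                    f"got {type(result).__name__}"
                )
            return result
        return wrapper
    return decorate


@contract(int, int, returns=int)
def area(width, height):
    return width * height


print(area(3, 4))  # 12
```

The decorated signature doubles as documentation of the expected data, and a call such as `area("3", 4)` fails immediately at the boundary instead of deeper in the program.
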
I wish Common Lisp had integrated type declarations with TYPEP and CHECK-TYPE, instead of punting with "consequences are undefined if the value of the declared variable is not of the declared type," i.e., sucks to be you.
No doubt you know this, but Common Lisp's standardization gave leeway to implementations, so you can choose the implementation that suits you best. Common Lisp is an umbrella for different Lisps to have some commonality. It was a process of negotiation, during a time when there were many design forks in the road, and diverse use-cases.

For example, type declarations can enable performance optimizations, compile-time type checking, runtime type checking, IDE autocompletion, etc. Or they can be ignored, if compiler simplicity is more valued. All these things have engineering tradeoffs. For example, runtime checks may have runtime costs at odds with performance optimization.

There might be higher-value improvements to Common Lisp, if higher quality code is desired.

As someone who doesn't know much about types, do SBCL type declarations provide as good type-based development experience as OCaml and Rust?

And in case I wouldn't follow your answer, what I mean is: is there something fundamentally inadequate in the way SBCL declares types? I think there is a phrase for it in CS theory.

> As someone who doesn't know much about types, do SBCL type declarations provide as good type-based development experience as OCaml and Rust?

First of all, OP was talking about strongly typed languages. Asking whether they are as good as statically typed ones like Rust and OCaml is moving the goalposts quite a bit.

Second of all, SBCL can indeed have a subsection of its code expressed in OCaml-like static types, see

https://github.com/coalton-lang/coalton

To add to the other comment, it is also important to understand the more interactive mode of developing in CL. You basically have the program always running, fixing bugs and adding features while it is running.

You simply don't have the problem of batch style programming where you have written a bunch of code and now you want to know if it works so you run a lot of static analysis on it beforehand because running it and getting it to the point and state that is relevant costs time.

In CL you don't end up with lots of code that has never been run. You have constantly run it during development and are much more confident about its behavior. So just having this interactive way of programming already leads to much more reliable software. It is not a replacement for static analysis or unit testing of course but another pillar to help you write more correct software.

I like dynamic languages too. But I don't like the idea of "optimization", and I would be super interested in a dynamic language that didn't attempt to divorce performance from correctness. The worst part about jumping through insane hoops to enchant the optimizer is that it can all go wrong with the tiniest change--a flag here, a different usage pattern there, a new version, etc., and suddenly your program doesn't do what you need it to, as though an operation taking 1000x longer is just a matter of degree.
I agree completely.

At the same time, no one wants their code to run 100x slower than it would in any typical statically typed language. Unoptimized dynamic languages are sloooooow.

RPython and Graal (and what else?) provide JIT-for-free (or at least cheap).

Of course, this really only works for code that is (a) statically polymorphic but dynamically monomorphic, and (b) has hot loops, but qualitatively that conjunction does seem like it ought to cover a lot of low-hanging fruit.
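
A small Python sketch of what condition (a) means in practice. This is only an illustration of the property, not how RPython or Graal profile code; `total` and `observed_element_types` are made-up names. The function accepts any type that supports `+`, but the workload only ever feeds it integers.

```python
from collections import Counter


def total(items):
    """Sum a non-empty sequence with `+`; accepts any element type that supports it."""
    acc = items[0]
    for item in items[1:]:
        acc = acc + item
    return acc


def observed_element_types(workload):
    """Run `total` over each input and count the element types actually seen."""
    seen = Counter()
    for items in workload:
        total(items)
        seen.update(type(item).__name__ for item in items)
    return dict(seen)


workload = [[1, 2, 3], [4, 5], [6]]
print(observed_element_types(workload))  # {'int': 6}
```

When a hot loop only ever sees one type like this, a JIT can specialize the `+` for that type and guard against anything else showing up.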

Anyone have quantitative measures?

There aren't many people looking at these JITs at the moment. Stefan Marr[1]'s group[2] is, I believe, where most of the research is currently done. A recent paper[3] compares performance of interpreters in RPython and Graal. Their baseline performance is Java, and they achieve performance close to V8, which itself is about 2x slower than Java.

My summary is you can write fast interpreters + get JIT for free, but fast JIT for dynamic languages still means 2x slower than JIT for statically typed languages (and Java definitely leaves some performance on the table due to how it represents data).

[1]: https://stefan-marr.de/

[2]: https://research.kent.ac.uk/programming-languages-systems/

[3]: https://dl.acm.org/doi/10.1145/3622808

There is only so much a compiler can do when any operation can result in a complex function call or variable sized data.
I'm happy with dynamic languages for almost everything I do and generally do not want to sacrifice flexibility, which is the price to pay for a static type system. However, certain parts of a program become more crystalline over time, whether for performance or correctness reasons, and being able to express those parts using a static type system makes a lot of sense. PreScheme [0] is an example of a statically typed Scheme that composes with the host Scheme. I'd like to see more work in this direction as Scheme already works well as a multi-paradigm language.

[0] https://prescheme.org/

Yeah, PreScheme is really interesting. I really liked the SystemCrafters exploration of it:

https://www.youtube.com/watch?v=QqKuHylIqBs

As a long-time Python and JavaScript user, I've come to the conclusion that dynamic typing is just not a good idea for anything beyond exploratory or very small projects.

The problem is that you invariably have to think about types. If you mistakenly pass a string to a function expecting an integer, you better hope that that is properly handled, otherwise you risk having type errors at runtime, or worse—no errors, and silent data corruption. That function also needs to be very explicit about this, but often the only way to do that is via documentation, which is often not good enough, or flat out wrong. All of this amounts to a lot of risk and operational burden.
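
A minimal Python sketch of the "no errors, silent data corruption" case. The function `double` is hypothetical; the point is that `*` is defined for both `int` and `str`, so the wrong argument type produces a wrong value instead of an exception.

```python
def double(quantity):
    """Intended for integers, but nothing in the code says so."""
    return quantity * 2


print(double(21))    # 42
print(double("21"))  # 2121 (string repetition, no error raised)
```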

Python's answer has historically been duck typing, which doesn't guarantee correctness so it's not a solution, and is more recently addressing it with gradual typing, which has its own issues and limitations. Primarily that if specifying types is optional, most programmers will not bother, or will just default to `any` to silence the type checker. While for JS we had to invent entirely new languages that compile to it, and we've reached the point where nobody sane would be caught working with plain JS in 2024.

Static typing, in turn, gives you a compile time safety net. It avoids a whole host of runtime issues, reduces the amount of exhaustive and mechanical tests you need to write, while also serving as explicit documentation. Code is easier to reason about and maintain, especially on large projects. You do lose some of the expressiveness and succinctness of dynamic typing, but what you gain with static typing is far more helpful than these minor benefits.

Dynamic typing with deeply nested data forces you to put type bandaids all over the code. For example you end up defining Pydantic schemas and then validating the same thing more than once since you can't guarantee that the type of a thing was not changed somewhere in the middle.

Dynamic typing forces you to test behavior which could be tested much more thoroughly by a type checker, at compile time, with zero development time.

Dynamic typing does offer much faster time to early prototyping but then drags you down with each bug.

Static typing does force some early commitments to the structure of the data but it also allows faster iteration and refactoring.

Static typing with good type inference seems the best to me.

I was strongly in the static typing camp for a long time, with Haskell, Purescript, Idris, OCaml, and Typescript, but over time I realized that the costs mostly outweigh the benefits.

> The problem is that you invariably have to think about types. If you mistakenly pass a string to a function expecting an integer, you better hope that that is properly handled, otherwise you risk having type errors at runtime, or worse—no errors, and silent data corruption.

The silent data corruption is really only a problem with weak dynamic typing, that usually automatically coerces types. A lot of dynamically typed languages still have strong typing and will immediately error out. And usually in practice you end up testing all of the code you're writing anyway, so this almost never happens in practice except when someone is refactoring without thoroughly testing what they did, which should be done anyway whether there are types or not.

> Primarily that if specifying types is optional, most programmers will not bother, or will just default to `any` to silence the type checker.

They probably do bother when it's an important module, or an edge boundary that needs to be documented with a contract, or during times of significant refactoring. And these days LLMs can generate the specs, optional types, and tests pretty easily for any sort of self-contained, modular, reasonably well written code.

So now with LLMs I think there are even better reasons to use dynamic typing. And type completion in an IDE still exists for a bunch of dynamically typed languages anyway, like JavaScript.

> The silent data corruption is really only a problem with weak dynamic typing, that usually automatically coerces types.

That's not necessarily true. A function could serialize the passed value, which would work without type conversion, and it could still result in data corruption somewhere down the line. The point is that with dynamic typing there's no guarantee of correctness. It has nothing to do with strong vs. weak typing, which incidentally I don't find helpful debating, since there's no single definition for those terms, and most languages can behave arbitrarily depending on the situation.
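
To sketch the serialization case in Python: `save_order` is a hypothetical function, and the assumption is that `quantity` is meant to be an integer. Both calls succeed without any coercion, but the stored records differ, and whoever reads them later gets a string where a number was expected.

```python
import json


def save_order(quantity):
    """Serialize an order record; any JSON-compatible value is accepted."""
    return json.dumps({"quantity": quantity})


good = save_order(3)
bad = save_order("3")  # wrong type, but serialization still succeeds

print(good)  # {"quantity": 3}
print(bad)   # {"quantity": "3"}
print(json.loads(bad)["quantity"] == 3)  # False
```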

Furthermore, you ignored my primary point of runtime type errors. These are very common in Python, and there's really no solution to them besides doing offline type checking, which as I said, has its own problems and is not a silver bullet either.

> And usually in practice you end up testing all of the code you're writing anyway, so this almost never happens in practice except when someone is refactoring without thoroughly testing what they did, which should be done anyway whether there are types or not.

Assuming you were referring to data corruption, maybe. But type errors happen very often in practice, and no amount of testing can guarantee you won't run into them. Besides, most teams I've worked with weren't disciplined enough to achieve even 100% statement coverage, let alone branch coverage, or do more sophisticated testing like fuzzing. So while type errors are close to impossible to prevent by testing, even data corruption can easily fly under the radar.
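
A contrived Python sketch of how coverage can mislead here. `format_total` is a hypothetical function. A single test that passes a currency executes every statement, so statement coverage is 100%, yet the path where the `if` is skipped still raises a `TypeError`.

```python
def format_total(amount, currency=None):
    """Format an amount for display; the bug is on the path without a currency."""
    text = amount
    if currency is not None:
        text = f"{amount:.2f} {currency}"
    return "Total: " + text


# This one call executes every statement in the function.
print(format_total(5.0, "USD"))  # Total: 5.00 USD

# The untested path: `text` is still a float, so the concatenation fails.
try:
    format_total(5.0)
except TypeError as exc:
    print("TypeError:", exc)
```

Branch coverage would flag the untested path, but only if someone measures it.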

Static typing gives you this safety net, _for free_. This alone is worth the minor inconvenience of having to specify type information, and think about types explicitly.

> They probably do bother when it's an important module, or an edge boundary that needs to be documented with a contract, or during times of significant refactoring.

This requires experience to know good practices, when to follow them, and the discipline to do so. IME very few developers are this diligent 100% of the time, and most, if given the option, will do the minimum amount of work necessary. I'm not just blaming others, I've been lazy about good practices myself many times. This is why gradual or optional typing is not a solution to these issues.

Looking at it from the other direction, most statically typed languages can do type inference. This avoids the tedium of having to be explicit all the time, while still giving you the benefits of type checking at compile time. This is a much safer solution.

> And these days LLMs can generate the specs, optional types, and tests pretty easily for any sort of self-contained, modular, reasonably well written code.

Seriously? LLMs have no place in a discussion about correctness. They're glorified autocomplete engines, which can be useful, but trusting them to give you correct output for these issues is incredibly risky. At best you would need to manually verify everything they do, and I trust myself to do a quicker job in most situations with macros and `sed`.

> And type completion in an IDE still exists for a bunch of dynamically typed languages anyway, like JavaScript.

I feel like we're talking about two different things, and you're ignoring the main issue of type errors at runtime.

I bounced off learning Haskell a few times (infinite regress to downloading category theory textbooks), but almost the moment I saw its type inference, I immediately wanted that in Lisp.

Roc is looking pretty nice (especially once your editor can paste in the inferred type like they want to do), but I still think there’s an empty space for an imperative language where type inference makes it feel as untyped (or at least as unceremonious) as (pre-annotation) Python.

Stanza (https://lbstanza.org/) is a very interesting experiment in designing for gradual types rather than retrofitting them onto a dynamic language.