Hacker News
by EdwardDiego 1783 days ago
> An incoming HTTP request? it is a plain Clojure dictionary.

I learned to code in Python. Loved it. Dynamically typed dicts up the wazoo!

Then I learned why I prefer actual types. Because then when I read code, I don't have to read the code that populates the dicts to understand what fields exist.
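
A minimal Python sketch of that difference, assuming a made-up `User` record (the names and fields are invented for illustration, not taken from any real codebase):

```python
from dataclasses import dataclass

# Dict version: the available keys are only discoverable by reading
# whatever code populated the dict.
user_as_dict = {"name": "Ada", "email": "ada@example.com"}


@dataclass(frozen=True)
class User:
    """Hypothetical record type: the fields are listed in the definition."""
    name: str
    email: str


user = User(name="Ada", email="ada@example.com")

print(user_as_dict.get("emial"))  # the typo goes unnoticed and prints None
print(user.email)                 # a typo here would raise AttributeError
```

With the dataclass, a reader (or a type checker) can see which fields exist without tracing where the value was built.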

8 comments

The two are not mutually exclusive. Clojure has namespaced keywords and specs[0] to cover that. (There is also the third-party malli, which takes a slightly different approach.)

The advantage is that maps are extensible. So you can have middleware that, e.g., checks authentication and authorization and adds keys to the map that later code can check directly. Namespacing guarantees nobody stomps on anyone else's feet. Spec/malli and friends tell you what to expect at those keys. You can sort of do the same thing in some other programming languages, but generally you're missing one of 1) typechecking 2) namespacing 3) convenience.
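
As a rough illustration of the middleware idea, here is a Python sketch rather than Clojure. The string keys `"auth/user"` and `"trace/request-id"` only imitate namespaced keywords, the token check is a placeholder, and nothing here reproduces what spec or malli do:

```python
# Hypothetical middleware sketch; keys like "auth/user" imitate Clojure's
# namespaced keywords (e.g. :auth/user) using plain strings.

def wrap_auth(handler):
    """Add an auth-related key to the request map before calling the handler."""
    def wrapped(request):
        token = request.get("headers", {}).get("authorization")
        # Placeholder check: a real system would verify the token properly.
        user = "alice" if token == "secret" else None
        return handler({**request, "auth/user": user})
    return wrapped


def wrap_request_id(handler):
    """A second, unrelated middleware that writes under its own namespace."""
    def wrapped(request):
        return handler({**request, "trace/request-id": "req-1"})
    return wrapped


def handler(request):
    # Later code reads the keys that earlier middleware added.
    if request["auth/user"] is None:
        return {"status": 401}
    return {"status": 200, "body": f"hello {request['auth/user']}"}


app = wrap_request_id(wrap_auth(handler))
response = app({"method": "GET", "headers": {"authorization": "secret"}})
print(response)
```

Each middleware returns a new map with extra keys, and the distinct prefixes keep the two from overwriting each other.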

[0]: spec-ulation keynote from a few years ago does a good job explaining the tradeoffs; https://www.youtube.com/watch?v=oyLBGkS5ICk

This is one of those self-inflicted Clojure problems. In Common Lisp you might use an alist or a plist for small things, but you'd definitely reach for CLOS classes for things that had relationships to other things and things that had greater complexity.

IIRC, the preference for complecting things via maps, and then beating back the hordes of problems with that via clojure.spec.alpha (alpha2?) is a Hickey preference. I don't recall exactly why.

No source to back this up, but my guess is that Clojure was driven by the need to interop with Java so as not to get kicked out of production. This meant absorbing the Java object model. Shipping a language with both Java objects and CLOS and making them both play nice together sounds like a nightmare.
There's a Common Lisp implementation on the JVM, called ABCL: https://www.abcl.org/ The interop is... not the best, but it's something. I've only used it for proof-of-concept stuff (e.g. how to make a Lisp module, export it as a jar that Java code can include in its pom and use without knowing it's Lisp) and for minor development experience enhancements in a giant Java codebase (e.g. change a method in Java, it gets hot-swapped in, I invoke it or an upstream method from Lisp with real data so I don't have to make an even higher upstream network request via some deep UI section).
This comment helpfully explains many of the reasons Rich had for choosing immutable, persistent, generic data structures as the core information model in clojure (instead of concrete objects / classes): https://news.ycombinator.com/item?id=28041219

Not wanting to misquote the above / Rich himself I would TLDR it to:

- flexibility of data manipulation

- resilience in the face of a changing outside world

- ease of handling partial data or a changing subset of data as it flows through your program

Please note that no one (I hope) is saying that the above things are impossible or even necessarily difficult with static typing / OOP. However, I and other Clojurists at least find the tradeoff of dynamic typing + generic maps in Clojure to be a net positive, especially when doing information-heavy programming (e.g. most business applications).

Namedtuples FTW! A de-facto immutable dict with the keys listed right there in the definition to obviate all the usage head-scratching. Then, if you need more functionality (eg factory functions to fill in sensible defaults), you can just subclass it.
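
A small sketch of that pattern, using an invented `Request` tuple and a hypothetical `get` factory for the defaults:

```python
from collections import namedtuple

# The field names are visible right in the definition.
_RequestBase = namedtuple("_RequestBase", ["method", "path", "headers"])


class Request(_RequestBase):
    """Hypothetical subclass adding a factory with sensible defaults."""
    __slots__ = ()  # keep instances as lightweight as the base tuple

    @classmethod
    def get(cls, path, headers=None):
        return cls(method="GET", path=path, headers=headers or {})


req = Request.get("/users")
print(req.method, req.path)

try:
    req.method = "POST"  # fields cannot be reassigned
except AttributeError:
    print("fields are read-only")
```

Note that only the tuple itself is immutable; a mutable value stored in a field, such as the `headers` dict here, can still be changed in place.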

TBH I've never understood the attraction of the untyped dict beyond simple one-off hackups (and even there namedtuples are preferable), because like you say you typically have no idea what's supposed to be in there.

Question: 1. Can a GET request have a non-empty request body?

2. Assuming you don’t know the answer to that question, will the type system you use be able to tell you the answer to that question?

This is a pretty simple constraint one might want (a constraint that only certain requests have a body) but already a lot of static type systems (e.g. the C type system) cannot express and check it. If you can express that constraint, is it still easy to have a single function to inspect headers on any request? What about changing that constraint in the type system when you reread the spec? Is it easy?

The point isn’t that type systems are pointless but that they are different and one should focus on what the type system can do for you, and at what cost.

Any statically-typed language with generics can express that by parameterising the request type with the body type. A bodiless request is then just Request[Nothing] (or Request[Unit] if your type system doesn't have a bottom type). Accessing the headers just requires an interface which all static languages should be able to express.
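
A sketch of that idea in Python's type-hint syntax, with an invented `Request` class. Python does not enforce these annotations at runtime, so the guarantee only holds if a static checker such as mypy is run over the code, and `None` stands in for `Unit`:

```python
from dataclasses import dataclass
from typing import Any, Dict, Generic, TypeVar

Body = TypeVar("Body")


@dataclass(frozen=True)
class Request(Generic[Body]):
    """Hypothetical request type parameterised by its body type."""
    method: str
    headers: Dict[str, str]
    body: Body


def content_type(request: Request[Any]) -> str:
    """Works for any request, whatever its body type."""
    return request.headers.get("content-type", "")


# A bodiless request: None plays the role of Unit here.
get_req: Request[None] = Request("GET", {"accept": "text/html"}, None)
post_req: Request[bytes] = Request(
    "POST", {"content-type": "application/json"}, b"{}"
)

print(repr(content_type(get_req)), repr(content_type(post_req)))
```

The header accessor ignores the body parameter entirely, which is the role the interface plays in the comment above.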
(1) note that “statically-typed language with generics” excludes a lot of statically typed languages, including C and Go (at least pre generics).

(2) this misses the meat of the question which is how to express that (eg) a GET request doesn’t come with a body and a POST request does. I suppose that you’re suggesting that one registers a url handler with a method type and that forces the handler to accept responses of a certain type. Or perhaps you are implicitly allowing for sum types (which aren’t a thing in many static type systems.)

(3) even in C++, isn’t this suggestion hard to work with? That is, isn’t it annoying to write a program which works for any request whether or not it has a body, because the type of the body must be a template parameter that adds templates to the type of every method which is generic to it? But maybe that is ok or I just don’t understand C++.

1) Looking at the TIOBE index, all the static languages I recognised on there are: C, C++, C#, Visual Basic, Go, Fortran, Swift, Delphi, Cobol, Rust, Scala, TypeScript, Kotlin, Haskell and D. Of these C and Go are the only two that don't appear to support generics so I don't think this approach excludes a lot of static languages.

2) If you want to distinguish GET and POST requests statically then you just need a type for them e.g.

    GetRequest<TBody> implements Request<TBody> { }
If you don't need to do this, then you can just add a method field and use a single type for both. Either way you don't need to use sum types, so a language like Java can express it.

3) Yes you'll have to make functions that don't care about the body type generic so this approach could become unwieldy if you have a few such properties you want to track.

How about values restricted to identifiers currently in the database table? There's always something the type system can't do.
F# has a feature called type providers that make this sort of bookkeeping between the database and the code less tedious, but even if you mess it up, static typing still gives you more safety than dynamic. If your code blew up because it should have accepted an identifier it didn’t, you know that the code has not been written to handle that case and can fix it. Alternatively, you can just choose to ignore this, and do what a dynamic language does. There is nothing stopping you from being dynamic in a static language, passing everything around as a map, etc.
A demo of a SQL type provider in action: https://youtu.be/RK3IGYNZDPA?t=2539

It requires a bit of elbow grease to make it work with a CI/CD system... but it works :D

That's nifty.
1. Yes. It's weird, but it's legal HTTP.

2. Sure. The request type has a body property.

Does “the request type has a body property” actually imply (1) though? In a language like C or C++ or Java, you could have a protocol like “body is always null on GET requests.” The question isn’t really about HTTP, that was just an easy-to-reach-for example, it is really about what having explicit types allows one to deduce about a program.
To be fair, an incoming request is, almost by definition, dynamic. It makes sense to have that as a map, since the main sensible thing to do on receipt is validation/inspection.

Granted, you may have a framework do a fair bit of that. Depends how much you want between receipt of the request and code you directly control.

Usually the approach in a statically-typed language is to transform your dynamic request into something that you know through parsing instead of validation. Here's a great article about this: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va....
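
A minimal sketch of what "parsing instead of validation" can look like, written in Python with an invented `CreateUser` payload and a hypothetical `parse_create_user` function (none of this is taken from the linked article):

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class CreateUser:
    """Hypothetical parsed form of a request payload."""
    name: str
    age: int


def parse_create_user(raw: Mapping[str, Any]) -> CreateUser:
    """Turn untyped input into a CreateUser, or raise ValueError."""
    name = raw.get("name")
    age = raw.get("age")
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        raise ValueError("age must be a non-negative integer")
    return CreateUser(name=name, age=age)


def greeting(user: CreateUser) -> str:
    # No re-checking here: the type records that parsing already happened.
    return f"{user.name} is {user.age}"


print(greeting(parse_create_user({"name": "Ada", "age": 36})))
```

The dynamic inspection happens once at the boundary; everything downstream takes `CreateUser` and never sees the raw map.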
This is the second time I've seen the link above. And while I agree with the premise, the author clearly does not understand how to properly use the `Maybe` monad (a term that does not make an appearance!).

There is little use in wrapping a call in `Maybe` to then immediately unwrap the result on the next line. Doing so isn't really using the construct... One would expect the lines following the creation of `Maybe` to bind calls through the monad.

In the end I see almost no meaningful difference between their "Paying it forward" example and simply utilizing an `if` to check the result and throw. In essence the author is using a parse and validate approach!

You might try re-reading it with some charity - the example's purpose isn't to teach the `Maybe` monad, but to remove the redundant check. To go into what `bind` does would be a diversion from the main topic (parsing vs validating).

FWIW SPJ has called this blog's author a "genius" so... I think they do know how `Maybe` works. https://gitlab.haskell.org/ghc/ghc/-/issues/18044#note_26617...

But `Maybe` is specifically designed to remove redundant checks for a value that may (or may not) be present! That's the whole point of the monad! It seems rather unfortunate this isn't highlighted (or at least illustrated) doesn't it?

I generally agree with the premise of the post.

I think you're referring to this part of the `getConfigurationDirectories` action, which has type `IO (NonEmpty FilePath)`:

    case nonEmpty configDirsList of
      Just nonEmptyConfigDirsList -> pure nonEmptyConfigDirsList
      Nothing -> throwIO $ userError "CONFIG_DIRS cannot be empty"
The "meaningful difference" you're looking for is the type of `getConfigurationDirectories`. The previous version had type `IO [FilePath]`, which _doesn't_ guarantee any configuration directories at all. It did indeed check the results and throw. But it doesn't guarantee that all the `[FilePath]` values in the program have been checked. There are neither tests nor proofs in this code. In contrast, with the revised version, you can be certain anywhere you see a `NonEmpty FilePath` it is indeed non-empty.
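
For readers who don't know Haskell, a rough Python analogue of the same idea. The `NonEmpty` class and `non_empty` function mimic the Haskell names, and `get_configuration_directories` is a simplified stand-in that takes the raw string as an argument instead of reading the environment:

```python
from dataclasses import dataclass
from typing import Generic, List, Optional, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class NonEmpty(Generic[T]):
    """Rough analogue of Haskell's NonEmpty: a head plus a tail."""
    head: T
    tail: Tuple[T, ...]


def non_empty(items: List[T]) -> Optional[NonEmpty[T]]:
    """Like Haskell's nonEmpty: None for [], otherwise a NonEmpty."""
    if not items:
        return None
    return NonEmpty(items[0], tuple(items[1:]))


def get_configuration_directories(raw: str) -> NonEmpty[str]:
    """Simplified stand-in for the article's action (assumption: raw input)."""
    dirs = non_empty([d for d in raw.split(",") if d])
    if dirs is None:
        raise ValueError("CONFIG_DIRS cannot be empty")
    return dirs


config_dirs = get_configuration_directories("/etc/app,/home/me/.app")
print(config_dirs.head)  # a head always exists, so no emptiness check is needed
```

The check happens once, and the return type carries the result: any function that accepts `NonEmpty[str]` can use `.head` without testing for an empty list again.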

The code I've quoted, which checks which case we have, is the only place that needs to handle that `Maybe`. Or maybe `main`, if we want to be more graceful. The author (I wouldn't say I know her but I know that much) does know how to chain maybes with bind, but it's not necessary in this example code.

My point is that if you are not chaining `Maybe` then the utility of employing the construct is unobserved. The entire purpose of using `Maybe` is to relieve the client from the need to make checks at every call for a value that may (or may not) exist. If you intend to immediately "break out" of the monad and (even more specifically) throw an error, you might as well just use an `if`.

I'm sure `main` could be written to "bind"/"map" `getConfigurationDirectories` with `nonEmpty`, `head`, and `initializeCache` in a way that puts the `throw` at the top-level (of course the above implementations may need to change as well). Unfortunately I'm not familiar enough with Haskell to illustrate it myself.

The purpose of Maybe is to explicitly represent the possible non-existence of a value which in Haskell is the only option since there's no null value which inhabits every type. The existence of the monad instance is convenient but it's not fundamental. The type of getConfigurationDirectories could be changed to MaybeT IO (NonEmpty FilePath) to avoid the match but I don't think it would make such a small example clearer.
Lexi absolutely understands how to properly use the Maybe monad. What you're saying to do here is the exact opposite of what this post is advocating for. You're talking about pushing the handling of the Maybe till later and the post is all about the advantages of handling it upfront and not having to worry about it anymore. You might want to read it one more time.
I understand. But what is the purpose of `Maybe`? The reason one would reach for the above construct is precisely to offload (push to later) the handling of a value that may (or may not) be present at runtime, such that a developer can write code assuming the value is always present and ignore the `Nothing` case.

Sure you can unwrap it right away, but that isn't necessary because you could also just "bind" the next function call to the monad (which is more idiomatic to the construct). You never have to worry about that value in this case because... well... that's the benefit of using `Maybe`.
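
To make the two styles under discussion concrete, here is a Python sketch using `Optional` as a stand-in for `Maybe`. The `bind`, `lookup_env`, and `first_dir` helpers are invented for illustration, and unlike `Maybe`, `Optional` cannot distinguish an absent value from a present `None`:

```python
from typing import Callable, Dict, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def bind(value: Optional[A], f: Callable[[A], Optional[B]]) -> Optional[B]:
    """Maybe-style bind: skip f when the value is absent."""
    return None if value is None else f(value)


def lookup_env(env: Dict[str, str], key: str) -> Optional[str]:
    return env.get(key)


def first_dir(raw: str) -> Optional[str]:
    dirs = [d for d in raw.split(",") if d]
    return dirs[0] if dirs else None


env = {"CONFIG_DIRS": "/etc/app,/opt/app"}

# Chained: absence propagates and nothing is unwrapped in between.
chained = bind(lookup_env(env, "CONFIG_DIRS"), first_dir)

# Unwrapped immediately: fail at the first missing value, with a specific error.
raw = lookup_env(env, "CONFIG_DIRS")
if raw is None:
    raise KeyError("CONFIG_DIRS is not set")
unwrapped = first_dir(raw)

print(chained, unwrapped)
```

Both produce the same value on the happy path. They differ in where the missing case gets handled and in how much the failure says about which step went wrong.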

I'm not super familiar with Haskell, but my sense is that the author is trying more to please the compiler (at a specific point in the program!) than simplify the logic. That is, they want a concrete value (`configDirs`) to exist in the body of `main` more than they want the cleanest representation of the problem in code.

> But what is purpose of `Maybe`?

In this case, it's to provide a better error message in case there's an empty list than `fromList` would provide.

> You never have to worry about that value in this case because... well... that's the benefit of using `Maybe`.

But you do, your entire program doesn't live in `Maybe` so at some point you have to check whether it's `Just a` or `Nothing`. Once again, the whole point of the post is to argue that getting out of the `Maybe` as close to parsing time as possible is preferable so you have a more specific type to work with after that. You also see right away what didn't parse instead of just knowing that something didn't parse, which is what would happen if you stayed in the `Maybe` monad for all your parsing.

That is a valid approach in any language. Static or not. Doesn't change my point that heavily. And it is all too possible to pick a bad parsing/binding language such that protocol changes in the request are now foot guns.
That's true, but static languages are not worse at handling dynamic data. From the same author: https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-typ....
To an extent, I agree. I'm just pointing out that this is a bit of a bad example. I want there to be dynamic inspection of input.

That said, maps as the only tool is clearly messy. And is a straw man.

> Then I learned why I prefer actual types

I do like static typing. But honestly, no other PL¹ (statically typed or otherwise) even comes close in terms of the ergonomics and joy of writing software. Nothing is quite as enjoyable for me as Clojure. Haskell is great but hard, and I'm years away from claiming I've achieved production-ready proficiency with it. I don't want to berate other languages, but I looked into OCaml, F#, Kotlin, Scala, Rust, and a few others. And none of them feels as enjoyable to me as Clojure. After so many years of programming, I finally, truly feel like I love my job. Also, I never liked Python. Maybe just a little, in the beginning. Once I got to know it, I disliked it for good.

-------

¹ I mean languages used in the industry, not counting even more "esoteric" PLs

I agree. This doesn't seem much different to saying they're all objects. You still need to know what to expect inside the dictionary.
The difference being that objects have a class where you can look to see what fields it specifies.
Java doesn’t really have a nice interface for interacting with objects in general. Clojure does have a nice interface for interacting with dictionaries. It has namespaced keywords for keys, which are much more ergonomic than typing strings, and it has lots of functions for modifying dictionaries. I think the big difference is in the philosophy of what the language thinks data is, and how the world ought to be modelled.
Sure, depending on the language. What I mean is having dictionaries doesn't mean you don't have to learn schemas.
Yeah, he mentions that later on as a drawback