by mabbo 2496 days ago
> 1. Economy of expression

Funny, this list is the same one I'd use to explain why I'm so annoyed with Clojure right now.

I inherited a mission-critical Clojure ML library my team uses for its primary business goals. It was written 4 years ago by a research scientist, who quit 3 years ago. We know what it's supposed to do. We know that it seems to do the job well. We just can't understand the code well enough to be certain of what it's doing or how it's doing it. If this thing breaks or stops working, we're screwed.

The problem is that the author, like most of us, found great joy in writing very few characters to express very big ideas. Clojure let him do that to an extreme degree. And I'm sure if you were sitting beside the author, with him explaining these dense expressions, you would be enlightened at the elegance and beauty of this language.

Sadly, I was not.

I'm sure Clojure is a lovely language. But it lets you write a Voynich manuscript that compiles.


This is true literally for every language out there. My team maintains a Java application from over a decade ago. It's an utterly incomprehensible mess that nobody understands, and it's incredibly difficult to make any changes to the project.

It's not the job of the language, but rather that of the developer to write clear and readable code.

This problem is addressed by using good design practices, code reviews, testing, and documentation.

Oh, we also own plenty of those in Java. That's the job- carefully lay the legacy to rest as you replace it with something better.

But at least the Java is overly verbose. It takes a while to read all that code. With the Clojure stuff, I'm staring at the same line for just as long and getting nowhere.

I find that I have the opposite problem myself. With Clojure, the code tends to be more declarative and it tells me what it's doing. Meanwhile, Java leaks a lot of implementation details into the code, and I find it much harder to decipher the intent.

For example if I see something like (filter even? numbers) I immediately know what is being done and why. There aren't any surprises there.

Meanwhile, when I see something like:

    List<Integer> acc = new ArrayList<Integer>();
    for (int number : numbers) {
      if (number % 2 == 0) acc.add(number);
    }
then I have to walk through that code in my head to figure out what the intent is, and whether it's actually doing what it's meant to. On top of that, I have to worry about mutability. Should a new variable like acc be used, or should the numbers list be updated in place? If it is updated in place, where else is it used, and are those places affected when I make a change?

So, there's a lot more to think about even in a completely trivial case such as the example here. In real world code that complexity quickly becomes overwhelming in my experience.
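To make the mutability point concrete, here is a minimal Java sketch of the choices described above. The class and method names (EvenFilter, evensLoop, evensStream, keepEvensInPlace) are made up for illustration and are not from any real code base. It assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; none of these names come from the thread.
public class EvenFilter {

    // The loop from the comment above, wrapped in a method.
    // It builds a new list and leaves the input untouched.
    public static List<Integer> evensLoop(List<Integer> numbers) {
        List<Integer> acc = new ArrayList<>();
        for (int number : numbers) {
            if (number % 2 == 0) acc.add(number);
        }
        return acc;
    }

    // Stream version: closer in shape to (filter even? numbers).
    // Also returns a new list and does not modify the input.
    public static List<Integer> evensStream(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    // In-place version: mutates the caller's list, so every other
    // holder of that list sees the odd numbers disappear.
    public static void keepEvensInPlace(List<Integer> numbers) {
        numbers.removeIf(n -> n % 2 != 0);
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        System.out.println(evensLoop(numbers));   // [2, 4, 6]
        System.out.println(evensStream(numbers)); // [2, 4, 6]
        System.out.println(numbers);              // [1, 2, 3, 4, 5, 6]
        keepEvensInPlace(numbers);
        System.out.println(numbers);              // [2, 4, 6]
    }
}
```

All three produce the same even numbers. The difference is only in who else can observe the change, which is the part a reader has to trace by hand in the in-place version.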

Clojure is not a particularly cryptic or difficult language, so maybe a refresher in the language and standard library might help.

Well, what is your level of proficiency in Clojure? If it was your main language, do you think you would still struggle reading what the code does?

I've inherited code bases in different languages in the past, and generally, it's the unfamiliarity with the language that makes it harder. With Clojure in particular, being so different, the unfamiliarity is even stronger.

For example, if you read the Clojure source code (https://github.com/clojure/clojure), are you similarly lost? Or when reading any random Clojure GitHub project?

Also, is this the first time you've inherited software like that? Have you ever inherited another similarly complex piece of code in your main language? Because I have as well, and it is never easy, no matter the language and familiarity.

Not trying to attack you by the way, it may be true, but I'm trying to isolate the variable of it being written in Clojure against the rest to really get a feel of the effect Clojure has over what you are experiencing.

The Clojure source code is not a good example, in my opinion. Much of it is written in a subset of Clojure, since it builds itself up bit by bit, and much of it is highly optimised Clojure code. I consider myself very proficient in Clojure, yet I struggle reading Clojure core's code.

My team is like an orphanage for legacy projects. By my last count, we have to maintain code in 8 different languages. Clojure is new to me, but I can pick up most languages (FP or otherwise) pretty quickly.

For all the rest, my team has managed to decode the intent and flow.

> ML library (...) It was written 4 years ago by a research scientist (...)

Do you believe the language is the issue, rather than the code having been written by someone not trained in software engineering practices?

Would you find untested (I'm betting this code you inherited lacks tests, from experience working w/ ML research code artifacts) verbose Algol-family code any easier to understand?

You're not wrong, there are so many sins in the creation of this beast. A test would at least give me some purchase to grip the logic from.

But we own lots of legacy stuff. It's only this one that gives me a headache.

Honestly, I wish it had bugs or failed occasionally so I could justify replacing it!

Yes, agreed. That being said, there are languages (without naming names) that lend themselves to readability by others and those that don't.

You should be annoyed with the less than ideal coding practices of your former colleague, not the language.

Didn't DJB famously use very small variable names when writing qmail in C? This is (obviously) not unique to C or Clojure. But it is something that is occasionally influenced by the language's best practices or culture, as you see in the giant names everywhere in Objective-C and the opposite in the push for minimalism in Ruby.

The names of complex functions, even with documentation, tend to be a bigger problem at scale than the names of local variables and functions or utility functions.

Language design stuff like namespacing/modules helps provide a balance between short/descriptive and so does documentation of key/complex parts as well as how you architect the larger pieces and organize code in directories.

Additionally, there are type systems, IDEs with docs/function + type signature integration, standards like JSDoc, etc. that help. So it's a mix of a lot of things besides the human.

It does look alien. There's definitely a learning curve. But since you have the REPL, it can be very effective in understanding foreign code. Run some of the functions, add tracing to it, debug printlns, use with-redefs to replace functions dynamically etc. Once you understand the data schema better, add spec to the mix.

It is a very different approach and takes time to get used to.

I'm stuck on the other side: I have millions of lines of legacy Java code with high complexity. Even with all the types and very long names, understanding the code is not easy either and the 'step into/step over' buttons in the debugger are seared into my brain. Luckily IDEA can run arbitrary Java expressions during debugging, which gives you some interactivity back.

This is why I like Python and C: it's almost impossible to add multiple layers of opaqueness with them. You can almost always deconstruct what the original person was trying to do, whether they were successful or not.

Hahahahah. The inheritors of my Python code who are not my future self would beg to differ. I have so many opaque patterns that are vital and yet purely conventional it hurts. They will miss the intention and go off and do something else to solve a similar problem, and now we have two patterns.

I work with Python in my day job. It is very possible to add tons of abstraction on top of a Python codebase. In fact, I would argue that it is very easy to do just that in Python, thanks to a combination of dynamic types and mutable state everywhere.

With Python, I find it's easy to understand what individual lines are doing, but really hard to understand what the codebase is attempting to do as a whole.

Our experiences with Python and C seem to have been very different.