by dsieger 1432 days ago
Would be curious to know what strategies other people apply in order to keep complexity down over time!
10 comments

I like a lot of your other replies. I also have a philosophy of doing net improvement every time I go in. If you put a little bit of elbow grease in every time, the net effect on your code over months is pretty nice.

But you also have to understand and internalize that it's OK to do a little bit of improvement each time. You don't have to go in, pick up a piece of code, sigh dramatically, and fix everything you can see about it. Just fix a bit. Turn some strings into enumerations or a custom type. Turn a recurring series of arguments into a single struct. Rename a deceptively named parameter or function variable to something correct and meaningful. Add a test case for what you just did, or add a test case for something even loosely related to what you just did that was not previously covered. Even just one of those is a good thing. Don't give in to the temptation to throw a 15th parameter onto a function and add another crappy if statement to the pile of the god function. Don't fix the god function all at once, just take a bit back out of it.
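
To make two of those small steps concrete, here is a minimal Python sketch of "strings into an enumeration" and "recurring arguments into a single struct". All the names (`Status`, `RetryPolicy`, `parse_status`, `delays`) are made up for illustration and not taken from any real code base.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Replaces free-form strings such as "active" or "Active"."""
    ACTIVE = "active"
    SUSPENDED = "suspended"


@dataclass(frozen=True)
class RetryPolicy:
    """Bundles three arguments that used to travel together as loose parameters."""
    attempts: int = 3
    delay_seconds: float = 0.5
    backoff_factor: float = 2.0


def parse_status(raw: str) -> Status:
    """Convert the string once, at the edge; a typo raises ValueError here."""
    return Status(raw.strip().lower())


def delays(policy: RetryPolicy) -> list[float]:
    """Return the wait before each attempt, growing by backoff_factor each time."""
    return [
        policy.delay_seconds * policy.backoff_factor ** i
        for i in range(policy.attempts)
    ]
```

Something like `delays(RetryPolicy(attempts=5))` then reads better than a call with three positional numbers, and a misspelled status fails at parse time instead of deep inside an if chain.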

If every interaction on the code base is net positive, even just a bit, over time it does slowly get nicer, and if you greenfield something with this attitude, it tends to stay pretty nice. Not necessarily pristine. Not necessarily nice in every last corner. But pretty nice. And if you do need to take out some technical debt, you'll have the metaphorical capital with which to do it; a non-trivial part of the reason why technical debt has such a bad rap is that it is taken out on code bases already bereft of technical capital, which means you're on the really bad part of the compounding costs curve to start with.

I'm not a greybeard by any stretch, but I personally get a lot of mileage out of just stopping to ask: Does the extra layer of abstraction, or extraction of code to a method, or creation of a class - does it make the code *right now* easier to understand? If yes, do it, if not, don't.

The example I keep coming back to is when I was a junior, one of the other juniors refactored the database handling code in one of our apps to use a class hierarchy. "AbstractDatabaseConnection" "DatabaseConnection" etc. And mind you this was on top of the java.sql abstractions already present.

I don't necessarily know what his end goal was, since the code still seemed pretty tightly coupled to how Java and Postgres handle connections and do SQL. One might theoretically now be able to create a testing dummy connection that responds to SQL calls and returns pre-baked data. But the functions we had were already refactored to be pure functions, and the IO was just IO with no business logic.

Anyway, all it ended up doing was making it so I never touched the database code in that app ever again. Integration testing was handled by just hooking it up to a test DB via CLI args and auto-clicking the UI. And eventually when people started side-stepping it, I took the opportunity (years later) to just go back in and replace both it and all the side-stepped code with plain ole java.sql stuff that literally anyone with two thumbs and 6 months of Java experience could understand.

So now, unless I have some really strong plan (usually backed up with a prototype I used to plan out the abstraction) for an abstraction model, I just write code, extracting things where the small-scale abstractions improve current readability, and wait for bigger patterns (and business needs) to emerge before trying to clamp down on things with big prescriptive abstraction models.

I'm a big fan of the "IO Sandwich". This is where you keep complex computation as pure functions as much as possible. And push the IO to the edges of the system. So you might have read-compute-write. This keeps the computation functions testable and composable.
In probably my favorite software-related talk[1] (certainly the one I most frequently share), this is referenced as “functional core, imperative shell”.

1: https://www.destroyallsoftware.com/talks/boundaries
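
For anyone who wants the read-compute-write shape spelled out, here is a rough Python sketch. It is only an illustration of the idea, not code from the talk; `summarize`, `run`, and the order fields are hypothetical.

```python
import json
from typing import TextIO


def summarize(orders: list[dict]) -> dict:
    """Pure core: no file, network, or clock access, just data in and data out."""
    total = sum(o["quantity"] * o["unit_price"] for o in orders)
    return {"order_count": len(orders), "total": total}


def run(source: TextIO, sink: TextIO) -> None:
    """Imperative shell: the only place that touches IO."""
    orders = json.load(source)    # read
    summary = summarize(orders)   # compute
    json.dump(summary, sink)      # write
```

In production you might call `run(sys.stdin, sys.stdout)`; in a test you can call `summarize` directly with a plain list, or pass `io.StringIO` objects to `run`.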

Does anyone know of a transcript of this talk? There is a link on the YouTube copy of the video, but it seems to be dead.
Thank you for asking. I regret posting this without looking for a transcript first, especially since my capacity for consuming video/audio content has declined as rapidly as a lot of topics I’d be interested in have embraced video. I may well contribute to transcribing it if I find some free cycles.
Yes, this is the way. In addition, often the internal and external representations of information will be different, in which case I normally prefer to keep any conversion or validation logic as close to the corresponding I/O as possible. Then all the internal computation logic only has to work with a clean and well-defined internal data model.
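
A small Python sketch of keeping conversion and validation right next to the I/O, assuming a made-up external payload with the keys `guest_name`, `checkin`, and `nights`; `Booking`, `parse_booking`, and `check_out` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Booking:
    """Internal model: typed, validated, free of wire-format quirks."""
    guest: str
    check_in: date
    nights: int


def parse_booking(raw: dict) -> Booking:
    """Boundary: convert and validate the external representation once."""
    nights = int(raw["nights"])
    if nights < 1:
        raise ValueError("nights must be at least 1")
    return Booking(
        guest=raw["guest_name"].strip(),
        check_in=date.fromisoformat(raw["checkin"]),
        nights=nights,
    )


def check_out(booking: Booking) -> date:
    """Internal logic only ever sees a well-formed Booking."""
    return booking.check_in + timedelta(days=booking.nights)
```

Everything past `parse_booking` can then skip the defensive checks, because malformed input never makes it into a `Booking`.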
For me the number one thing I try to focus on is _naming_. If something is hard to name, it's likely hard to understand or overly abstracted (misdirected). If something is easy to name, it likely follows [insert any software development "best practice" here].

What's a good name? I love the phrasing from _Elements of Clojure_ by Zachary Tellman [1]

> Names should be narrow and consistent. A *narrow* name clearly excludes things it cannot represent. A *consistent* name is easily understood by someone familiar with the surrounding code, the problem domain, and the broader [language] ecosystem.

1. https://leanpub.com/elementsofclojure/read_sample

Yeah, I find that if you can name something well then everything else falls in place much easier.

At work, for any large feature, we usually go over naming pretty extensively, and aim to be consistent in documentation, code, and discussions, so everyone knows exactly what everyone is talking about.

Trust your tooling, and your repository. It's safe to delete if you still have a record of the way the code was before. Too often I see code that doesn't need to exist because someone is afraid to remove it. Modern IDEs are excellent at showing dependent code, and Git and other source control tools are excellent at giving you freedom to remove things.

Oh, and have good testing in place to make sure you aren't breaking a required path that your IDE can't detect, obviously. No IDE in the world can detect "Oh, we still had one client on that old obsolete REST call and they are pissed".

That's what we call 'scream testing'
Unit Tests. If you can't write a unit test for it, it's too complicated and it's going to snowball quickly into a giant mess.
Unit tests, while good at promoting decoupling, can absolutely be a major driver of complexity, as they may break the code into far more units than is reasonable.
Be careful with this. Unit tests don't tell you much about the correctness of a system overall, and they rarely survive a substantial refactoring. Optimizing for unit testability can make individual classes/functions "simple" but at the expense of creating a ton of them and pushing the complexity to the interfaces and integration between them.
I love unit tests, but admit I have absolutely seen unnecessary complexity including complete classes and namespaces solely to enable testability in many cases.

It's a justifiable trade off for me, but I don't pretend that unit testing reduces complexity.

For anyone who missed it, it may be of at least slight interest to bring back this thread from 2018 about Oracle code (I once worked on it too, so I saved the comment link as soon as it was posted):

https://news.ycombinator.com/item?id=18442941

I'm not sure if you're saying so, but those are not unit tests.
Yes, it is about tests in general. I think it fits the discussion and many of the comments very well; this does not really seem to be about unit tests only. Many comments are more general in tone.

The very comment at the top of this sub-thread does not seem to limit itself to the subject of unit tests.

My experience with automated testing was great until I had to test I/O functionality: files, databases. That's when the test suite itself became too complicated.
Absolutely! For me, comprehensive testing is key to keeping things clean over time. Not sure why this didn't come to my mind when writing the article. I think I was somehow assuming that this is a necessary pre-condition anyway.
Violating single source of truth is probably the biggest offender I see.

The same conceptual state gets represented in multiple variables or derived variables, and these must stay in sync. Very brittle.
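
A tiny Python sketch of what I mean, with a hypothetical `Cart` class: only the list of prices is stored, and the count and total are computed from it instead of being kept in separate fields that need to be updated in lockstep.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Stores only the line items; everything else is derived on demand."""
    prices_cents: list[int] = field(default_factory=list)

    def add(self, price_cents: int) -> None:
        self.prices_cents.append(price_cents)

    @property
    def item_count(self) -> int:
        # Derived, so it cannot drift from the list the way a separate counter could.
        return len(self.prices_cents)

    @property
    def total_cents(self) -> int:
        # Also derived; there is no cached total to forget to update.
        return sum(self.prices_cents)
```

If recomputing ever gets too slow you can add a cache later, but at least then the duplication is a deliberate, local optimization.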

I call this "copy-paste-copy-paste-refactor": don't factor or abstract out a routine before the third time it's implemented. Until then you don't know what the actual commonalities among the uses will be, or if the callers will have so many special cases that the routine isn't really that reusable.
All dependencies should be injected (and possibly wrapped with custom interfaces, if they're libraries).

All globals should be configurable (most codebases I've seen have a ton of hidden globals).

All side effects should be isolated.

"Break any of these rules sooner than say anything outright barbarous."
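
Roughly what the three rules look like together, as a Python sketch. Every name here (`Mailer`, `Config`, `Greeter`, `RecordingMailer`) is hypothetical; the mail library behind `Mailer` is left out on purpose.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class Mailer(Protocol):
    """Custom interface wrapping whatever mail library is actually used."""
    def send(self, to: str, body: str) -> None: ...


@dataclass(frozen=True)
class Config:
    """What might otherwise be a hidden global, passed in explicitly."""
    sender_name: str = "Support"


def build_greeting(config: Config, user: str, hour: int) -> str:
    """Pure: no clock, no network, only its arguments."""
    part = "morning" if hour < 12 else "afternoon"
    return f"Good {part}, {user}. Regards, {config.sender_name}"


class Greeter:
    """All dependencies are injected through the constructor."""
    def __init__(self, mailer: Mailer, clock: Callable[[], int], config: Config):
        self._mailer = mailer
        self._clock = clock      # returns the current hour, 0-23
        self._config = config

    def greet(self, user: str, address: str) -> None:
        body = build_greeting(self._config, user, self._clock())
        self._mailer.send(address, body)  # the single, isolated side effect


class RecordingMailer:
    """Test double that records messages instead of sending them."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))
```

In a test, `Greeter(RecordingMailer(), lambda: 9, Config())` runs without a real clock or mail server, which is most of the point.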

Prioritize functional testing over unit testing, which penalizes refactoring.