Hacker News new | ask | show | jobs
Ask HN: What do you consider as “bad” code?
13 points by JZN666 1607 days ago
My opinion about this constantly changes through my programming experience. Functions longer than 20 lines of code? Pffff, an algorithm/problem can be such complex in nutshell and decomposing a function solving it will be non-relevant (see any std library of PL). Mutable global state? Pfff, anything in software is based on that, the problem is using it properly in target architecture (ex. in micro-controllers global state usually manages analog/digital output, calling a function for this adds memory and time overhead especially in real-time systems). Copy-paste? Pfff, similar code != this code solves the same problem, sometimes it's easier to include a change in "copy-paste" code than deciding how to include this in boundaries of "reusable" parts (ex. for the first time there're two structures X = (A, B) and Y = (A, B) with similar logic but different domains, and then manager decides to include specific logic for Y working with B data).

Now my general principle which distinguishes "good" code from "bad" one, is how good it resembles a model solving given problem. It can be a non-neccessary complexity of code (ex. instead of "if (isFinite(a)) { ... }" writing "if (a.toString() != "Inifinity" && a.toString() != "NaN") { ... }" when "a" is simply a number). And contrary not-enough complexity of code, relying on blind faith (famous XSS vulnerabilities, SQL injections in web apps are examples of that).

15 comments

Just my quick list

Code that relies alot on pointless abstractions (aka. bad "signal to noise ratio").

Code with lots of microdependencies to perform really trivial tasks.

Code with premature "optimizations" (usually done before actual performance measurements).

Code sacrificing readability (and thus maintainability) to be overly clever or to do trivial optimizations (usually with little to no impact on overall performance).

And of course all the different ways to mess up comments, style and versioning (Yes, I consider the versioning part of the code):

    * Uncommented code
    * Over-Commented code
    * Comment-Banners
    * Commented-Out code (git exists for a reason)
    * Inconsistent naming
    * entirely_too_long_variable_names for things that exist for 20 LOC
    * Unhelpful commit messages (my favorite: "fixed: some issues")
    * Bulk-Commits that change/fix/introduce multiple features or issues at once 
    * Missing tags
    * No semantic versioning
> Code with lots of microdependencies to perform really trivial tasks.

"trivial" measure highly relies on programmer's/developer's experience and knowledge. For first developer traversing a tree and applying some updates based on nodes conditions would be "trivial" so thee places all the code in one function. For second developer the same problem would not be such trivial, so thee creates two functions "fn listFromNodes<T, U>(t: Tree<T>, pred: (fn(e: T) -> bool), map: (fn(e: T) -> U)) -> List<U>" and "fn applyUpdates(updates List<Update>)".

> Uncommented code

Why comment a code that's mapped to problem trivially observably (ex. if it's code describing some GUI component with style fields and your task is change style of its part)? I prefer commenting code that contains complex logic (which is measured through code reviews, developers skills, calls etc.) and its effects aren't observed obviously.

> Commented-Out code (git exists for a reason)

Yes and no. Commented-out code is usually in-place saved draft of old code. If some bug is encountered with new code, you can comment-out new code and uncomment-out old code quickly to discover a bug in new code later. Git is less easy to accomplish this task.

> Inconsistent naming

Some rudiment not quickly refactored, may be involved. Example, I had a project when I had type "CardRef" which initially contained only a name (string) of card which was used as a reference for it. Then I decided to include some scores for a card to selecting cards smarter. I kept a name for type the same because it was used a lot in a project for related entities.

> Why comment a code that's mapped to problem trivially observably

"trivially observable" is not a defined measurement however. What a piece of my code does may be perfectly obvious to me, but mystifying to someone else, or even to myself in 4 months. I am not advocating for half a page of comments before each top level definition, or trivial comments like `// this requests data from server` , but a single line of comments above each function or typedef isn't too much to ask.

> If some bug is encountered with new code, you can comment-out new code and uncomment-out old code quickly

I can also check out the commit where it was last changed, stash the info, and apply it to my dev branch. Same functionality, and less cruft in the code.

I really think this list is too low level. What matters a lot more is how the data structures and architecture is designed. I can easily imagine a nightmare of a system that follows all the rules in the list. And I have worked on great solid maintainable code that didn’t follow that list at all. Also, semantic versioning doesn’t work. Developers typically have no clue whether a small change might make an update incompatible. Fixing a bug might make your code incompatible without you knowing it for example (since software using your code has fixes to fix your bug).
Basics matter.

Let's say the architecture of a house is beautiful, and its a marvel of structural engineering, but; If the floorplan is unintuitive, the stairwells alternate between 3 different parts of the building each floor, none of the rooms are labeled, all the access shafts are of different size, the original architect didn't even keep a copy of a blueprint, and some rooms are simply blocked off with a sign on the bricks saying "maybe need that later"...

...then the guy who has to rewire the cables in the damn thing later is gonna have a bad day, and the guy who has to pay him for all the extra hours is not going to be happy either.

Most of the issues you are discussing are architectual issus. Architecture is not just how the house looks. An architecture also decides on the floor plan, staircases etc. Also, houses are not a good anology for software. So I caution against drawing any conclusions from your perception on how houses are constructed and the building industry in general.
Houses are a great analogy, especially when it comes to my argument about pointless abstractions.

A house where the floorplan is decided before the needs of the tenants are even clear, will likely have to be reshaped later. The same happens in overarchitectured codebases, eg. a dependency injection for some subsystem set up which will never have more than one type to depend on, making the code less obvious, less readable, and harder to maintain.

> pointless abstractions

Of course anything that is "pointless" is indeed pointless. And "great abstractions" are indeed great. What's your point? And what has that got to do with your origial low level list?

> A house where the floorplan is decided before the needs of the tenants are even clear, will likely have to be reshaped later

Yes obviously. And a well thought out floorplan that takes into account the needs of the tenants is great. Again, what's your point? And what does that have to do with your original low level list?

entirely_too_long_variable_names -> I prefer very long variable names which are self-explaining and easy to grep. Abbreviations make code less consistent and less greppable. Also what's the point of making shorter variable names if you need a comment to explain what it's doing - or worse I need to email the author to find the answer. A variable name should be the minimal representation of the role of this variable, but without cryptic shortcuts or abbreviations.
I think the common denominator is, code that is harder to read and understand than it needs to be. (That also makes it harder to maintain and update than it needs to be.)
> Commented-Out code (git exists for a reason)

I tend to agree, but with some uneasiness.

I don’t often review the previous commit diffs when working, if I don’t know it is there why would I find it in git?

If I had to pick, Single responsibility principle is #1 for me.

If each class does one thing, anyone will eventually figure out how it’s doing it. If it’s doing multiple things, all bets are off. It’s also the exact type of code that will break something unrelated when changed.

Mind you, we don’t need classes like DictionaryConstructor and StringParameterParser, but you should - in about one sentence - be able to say what each class’ purview is.

I've seen a lot of worse code sticking strictly to single responsibility. As it is, we have a lot of classes with only one function, and that function calls another line of code. It makes sense from a theoretical perspective but YAGNI would make us thrice as efficient.

We have a class whose purpose is programmatically setting strings to be displayed. But displaying the string is done two levels further. And the strings are always static. So there's two layers of complexity and a ton of autogenerated code.

That's a problem with a particularly dogmatic (and kind of useless) misapplication of the principle. "Single Responsibility Principle" is better understood in relation to the same concept as my comment elsewhere in this thread: cohesion. The capabilities and state contained within a module (as an abstract, not strictly technical, concept) should be cohesive, that is: They belong together.

So a class with only one function may be appropriate, but it's not the logical conclusion of SRP itself. For instance, in a project you may want to draw something to the screen. It is not the concept of SRP that you should have one module per function, and have a module which draws circles and another which draws lines (that isn't to say this couldn't be a logical and sensible result, but it probably isn't). Instead you have a module (likely with submodules) which is responsible for drawing to the screen, period. You pass it a description of a shape (however that may be done: data objects, objects which can "draw" themselves in some abstract sense, dedicated procedures/methods for each shape that take shape parameters) and it draws to the screen. This is in contrast to spreading the screen drawing routines across a many thousands or millions of lines codebase.

The other thing SRP discourages is, say, that screen drawing module also being your input module. They could appear to be in the same module from a high enough level, but ideally only because a high level module has submodules that contain the separate responsibilities. For instance, ncurses does both output and input, but hopefully (I've never looked at the internals) it's divisible into two sections (each section composed of some number of modules, probably) that clearly are responsible for either input or output, but never (or rarely, and only where it is necessary due to some underlying technical reason) both.

You can always find examples of code using good ideas badly. Following the SOLID principles are more likely to improve your code than not. Assuming you understand how to use it of course.
often poor code is a symptom of poor data structure design. having the wrong data structure can make your code heinously complex. conversely designing the data structure correctly can make your code simple and clear, because it’s obvious how to manipulate the structure to your code.

i also see a lot of code where people effectively throw 30 responsibilities into the same class. Really difficult to update. strictly enforcing single responsibility solves a lot of problems.

Related to a comment I made in another thread: incoherent code. This is code where the organization is arbitrary, not considered. It's not even one of the many possible reasonable and logical organizations of the code.

As a great (but terrible) example, a project at my first job was a real-time system and the dev for it (I was in testing back then) had decided that since he only had 100 ms slots per task, but had the ability to use up to 10 tasks (or whatever the limit was), he'd distribute some logic across all 10 tasks, rather than making one task responsible for that logic entirely. So the system could have had (and ended up having) a reasonable dataflow, DAG structure relating the tasks (input task, fan-out to processing tasks, fan-in to output task), but instead had a much more complicated internal structure than necessary. It was incomprehensible because of its incoherence. Some of the processing tasks were 2-deep or had another layer of fan-out/fan-in, ultimately. But they should have been done that way to begin with. One of the worst bits of this is that it forced an ordering on the execution of the processing tasks that was not, strictly, necessary except for this distributed logic. Imagine a flow more like this:

  In -> Task A -> Out
          |
          v
     -> Task B ->
          |
          v
     -> Task C ->
Versus this:

  In -> Task A -> Out
     -> Task B ->
     -> Task C ->
     -> Task D ->
(With some of those tasks being composed of multiple tasks, but in a more coherent way now.)
These are some properties I see bad code having

  * insecure
  * doesn't work
  * lack of checks and validation leading to unhandled states
  * inconsistent style and method
  * difficult to understand for wrong reasons
  * spaghetti with too much coupling
  * poorly optimized and inefficient
These properties can be inherited from model/organization itself except "difficult to understand for wrong reasons" and "doesn't work" perhaps. So that's why I point to "respecting to model", a more generic take imho.
I ask myself. “How hard will this be to change?” Or, “does this promote a pattern which will evolve into something difficult to change?”

Bad code to me is code that is difficult to modify or extend. “Difficult” to me implies a higher likely hood of regressions or time spent working.

Probably the worst thing of all is the “wrong abstraction”. Concepts that have been shoehorned into other abstractions - bloating method signatures and variables - obfuscating the intent of the names. This issue becomes cancer that causes a ton of other problems.

Great code reads closer to English language. The bottom of the call stack has high level abstract language and the top of the call stack is highly specific. With module on the stack maintaining a consistent degree of abstract language.

Pfff

Didn’t believe in this cargo from the beginning. Instead I’m watching the world change. When you work with code and understand what it does and all of the implications, these rules are irrelevant, because you have your own sense of when it’s time to (…).

To me bad code is basically two things: (1) negligent/inconsistent formatting, (2) when you read it you can’t grasp what it does. Not because of complex algorithms, but because it is an abomination consisting of tens of files, 5 meaningful lines each, and logic mostly smeared across imports.

My latest favorite was an abstract factory of a single server constructed from specific implementations of property accessors which combine into a hierarchical non-documented method-structure. All it does is listening on a tcp port, regex-parsing messages like

  command; key:value; key:value; …; msg:…
replying “command OK” and calling back .on(“command”, (data_structure_class) => {}). Too bad it parses it wrongly when msg has “;” in it, which is okay according to the spec. 17 files, countless boilerplate lines only to fail in a trivial task, which is basically a one-pager.
After years of experience in the field, I don’t longer ask myself what’s good or bad code, I ask myself what’s good and bad modules. It’s all about modularity and decomposition: if you get to decompose a system in parts that fit together and fit well, then that’s a good system. Whether or not the module itself contains good or bad code, that’s of less importance.
Can you absorb or re-absorb what it is doing rapidly by looking it over for a few seconds?

If not, there's something wrong... if you can't divine the author's intention at a glance, you and anyone else looking at it will find it hard to spot bugs that deviate from that intention.

This kinda reminds me of Steve Yegge's writings about different kinds of programmers.

His main taxonomy is that there are programmers who prefer mistakes are caught as soon as possible, vs programmers who don't mind fixing things as mistakes are found.

Similarly, he says one extreme style is "I have full understanding of the domain and my code, I want as much code on the screen at once" ("expert") as opposite to "I want as much context/documentation, to make the code on screen as understandable as possible" ("newbie").

There's an ambiguity in the standard of "how rapidly can the code be understood?". By who? With what familiarity with the codebase, and with the domain?

By who? By those who are working on it (now and in the future).

Let's start with the author. If the author can't understand it in six months, that's bad.

Then, the author's co-workers, now and in the future. The author may be a genius who never forgets anything, but will his successor be as good? If the successor can't understand it, that's bad code.

> Can you absorb or re-absorb what it is doing rapidly by looking it over for a few seconds?

No. Not only because code is essentially poorly structured, but also because I'm not fluent in subject behind a code. Example is operating systems: when I look in Linux's source code for example, I can't understand which problem is solved by each routine. But does that make Linux source code "poor structured"?

You are too inexperienced to work on the code or understand its context, even if the code is perfect?

Then you are the problem detected by your being unable to absorb what it is doing.

Code without use case tests that are solid, fast, and cover all use cases. I am happy to work with any code written in any style as long as the tests are solid. Tests done well are basically a machine checkable specification of what the software is supposed to do. It allows you to jump in and make large changes without breaking anything. I am maintaining a very large system written in C++ and I routinely refactor and make system wide changes without fear. I haven’t had a bug in production for 5+ years. Imagine how productive you will be when you never have to spend time fixing urgent production bugs.
Code that does not look like what it's meant to do. The idea is that one business change means one code change. Everything else revolves around this. It's the awkwardly named things. It's the vagueness that you can't write tests for.

A subset of this is that the code should look like the language was written for it. Unsuitable use of design patterns is probably the biggest violation. And often you need to put a lot of thought into understanding both the language and the business domain.

I mean, the primary question is: "Does it meet its use case?"

Now, this does include the uses of the developers working on it throughout its lifestyle, but what makes code 'good' when it comes to a bank's system vs. a SaaS startup vs. a script somebody writes for themselves and themselves alone can be completely different.

Trying to set out concrete rules for what makes something good practice is difficult when it's something as varied as coding.

People come with all sorts of rules, but honestly I think sometimes you just know. I've seen code that breaks a lot of the rules people are going to list, that I've felt in practice was fine. This is what I've inherited. Coding style is often a bikeshed, but sometimes it needs more shedding.

https://ibb.co/JCmcsnF

This looks like one of my points: "non-neccessary complexity". Why using "echo" statements if PHP is already designed as procedural web-templating language? If it's not a function that dynamically renders template based on some params.
> non-neccessary complexity with regards to given model (problem, tools)
1. When it fails at being a function abstraction; though this is a language issue, that is, you can't write good code in most languages.

2. Incorrect, including such issues such as insecure, not filling the specification.

* and decomposing a function in smaller functions