Hacker News
by zahlman 2 days ago
> a "simple" and "beautiful" function that was mangled into incomprehensibility over the years, and where if a more expressive type signature had been written from the start, it would have restricted the damage caused over time.

...Can you give a concrete example? I've been programming literally since the 80s and that doesn't ring true at all for me.

3 comments

> ...Can you give a concrete example? I've been programming literally since the 80s and that doesn't ring true at all for me.

Even this week I stumbled upon legacy code that started off with a clean function, void DoSomething(Foo). Then a few years passed and someone started using Foo to handle two scenarios, let's call them Left and Right. They could have simply introduced two new types, FooLeft and FooRight. But no. Instead they kept Foo, added a few extra optional fields, and extended DoSomething(Foo) to

    DoSomething(Foo foo, bool isLeft, bool isRight)
This took place during the mid 2010s.

Where have you been during all this time?
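
A rough Python sketch of the two designs described above, for readers who want to see them side by side. Every field name and return value here is made up for illustration; only Foo, FooLeft, FooRight and the two flags come from the comment.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Hypothetical "after" state: one type, optional fields, two boolean flags.
@dataclass
class Foo:
    name: str
    left_margin: Optional[int] = None   # only meaningful in the Left scenario
    right_margin: Optional[int] = None  # only meaningful in the Right scenario


def do_something(foo: Foo, is_left: bool, is_right: bool) -> str:
    # The signature allows (True, True) and (False, False); only a runtime
    # check can reject them.
    if is_left and not is_right:
        return f"left:{foo.name}:{foo.left_margin}"
    if is_right and not is_left:
        return f"right:{foo.name}:{foo.right_margin}"
    raise ValueError("exactly one of is_left / is_right must be set")


# The alternative: one type per scenario, so the invalid combinations
# cannot be expressed at all.
@dataclass
class FooLeft:
    name: str
    left_margin: int


@dataclass
class FooRight:
    name: str
    right_margin: int


def do_something_typed(foo: Union[FooLeft, FooRight]) -> str:
    if isinstance(foo, FooLeft):
        return f"left:{foo.name}:{foo.left_margin}"
    return f"right:{foo.name}:{foo.right_margin}"
```

With the flag version, four flag combinations exist and two of them are errors; with the two-type version the scenario travels with the value.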

Have you been in static types the whole time? It's a really, really common failure state in dynamic programming languages, the Everything Function, that started out as something simple, but then someone added a flag to make it also do this other thing, and then you need a flag to only do that other thing sometimes, and someone needed to operate on multiple things so they made a string parameter also optionally an array, and later someone allowed it to also be an object with this one method, or maybe another method if it's present because some other team implemented that before the first one and can't switch now, and before you know it it's a free-for-all of people adding flags and options and type analysis and if statements and you have a complete mess. Especially if this function is shared by many disparate teams, each of whom isn't "allowed" to break the others, though a single team can fail this way plenty fine.
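
To make the shape concrete, here is a small invented Python example of what such a function can look like after a few rounds of this. The name notify, its parameters, and the two method names are all hypothetical; the comments mark which "round" of accretion each branch stands for.

```python
def notify(target, urgent=False, only_urgent=False):
    """Hypothetical Everything Function after several rounds of accretion."""
    # Originally `target` was a single string. Then it could also be a list.
    if isinstance(target, str):
        targets = [target]
    elif isinstance(target, (list, tuple)):
        targets = list(target)
    # Later: "also an object with this one method"...
    elif hasattr(target, "get_address"):
        targets = [target.get_address()]
    # ...or another method, because another team implemented that first.
    elif hasattr(target, "address"):
        targets = [target.address()]
    else:
        raise TypeError(f"unsupported target: {target!r}")

    # A flag to only do the other thing sometimes.
    if only_urgent and not urgent:
        return []
    # A flag to make it also do this other thing.
    prefix = "URGENT " if urgent else ""
    return [f"{prefix}to {t}" for t in targets]
```

Four input shapes times two flags is already sixteen call patterns, and nothing in the signature says which of them anyone actually relies on.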

You can still do this in static languages, but they do push back a bit more because you don't get the flexibility that dynamic languages offer when it comes to accepting a huge variety of different input types.

I've torn a few of these apart over the years. Never fun. Haven't tried with AI but suspect that would only be a quantitative change rather than a qualitative change. The fundamental problem with fixing these is lack of information about the exponential complexity of possible call mechanisms and the AI will have the exact same information problems I will, just faster.

Edit: One of them that I tore apart ended up being two entirely separate functions slammed together into one by historical contingency. I don't just mean that I broke the functionality down into multiple functions, that's a basic tool of how you tear these down and is nothing of note. I mean that one of the "everything functions" I tore down had two distinct calling patterns that were distinct functions that not only shouldn't have been festooned with so many options, but never should have been one function at all because they weren't even conceptually the same thing or even particularly related.

Think of it as two stages of a straight-line process, that were just jammed together because of the fact they got called at similar times, and the original writers weren't clear on the unrelated nature of the tasks and nobody was able to see it through all the obfuscation until I sat down, very deliberately, and I realized this as I was tearing it apart. I don't remember the details, I tend to remember things very conceptually and thus I have a hard time remembering the details of functions with no conceptual purity, but you can get close by thinking of the function as validating incoming parameters, and then applying the parameters to a database. And people were so confused that despite the fact this function, when tickled correctly, could do it all in one shot, sometimes, kinda, with some caveats, there were places where this function was called first to validate (with flags to shut off the application), and then to apply (with flags to shut off the validation). And to be clear, I mean, I did not realize it either even from my contact with the function over the years. It was only when I sat down with it for hours and systematically tore it down that I figured that out.
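
A minimal sketch of that validate-then-apply shape, assuming a plain dict as a stand-in for the database and an invented "id" check as the validation. None of the names come from the real code, which I don't have.

```python
def validate_and_apply(params, db, skip_validate=False, skip_apply=False):
    """Hypothetical combined function: two unrelated stages behind two flags."""
    if not skip_validate:
        if "id" not in params:
            raise ValueError("missing id")
    if not skip_apply:
        db[params["id"]] = params
    return params


# The same two stages as the separate functions they always were.
def validate(params):
    if "id" not in params:
        raise ValueError("missing id")
    return params


def apply_params(params, db):
    db[params["id"]] = params


# The confused calling pattern: the combined function is called twice,
# each time with a flag that switches the other half off.
db_old = {}
validate_and_apply({"id": 1}, db_old, skip_apply=True)     # validate only
validate_and_apply({"id": 1}, db_old, skip_validate=True)  # apply only

# The straight-line version of the same thing.
db_new = {}
apply_params(validate({"id": 1}), db_new)
```

Both paths leave the stand-in database in the same state; the second one just says what it is doing.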

> Have you been in static types the whole time?

In fact, Python has been my most-used language for 15+ years and I rarely use annotations.

> the Everything Function

This doesn't happen in my own code; 10 lines is unusually long for me. In others' code, it comes across that the problem is more to do with not properly splitting up the task (a lot of the time, a reluctance to extract loop bodies, in particular). The case logic does have to go somewhere and it isn't always practical to hide it with polymorphism (which has its own problems; I write my own classes quite a bit less than average I would say, and especially avoid inheritance). It's often better when you have one thing that lays out the case logic explicitly while delegating all the actual work.
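
As a sketch of what "one thing that lays out the case logic while delegating the work" can look like (the event dict and handler names are invented for the example):

```python
def on_created(event: dict) -> str:
    return f"created {event['id']}"


def on_deleted(event: dict) -> str:
    return f"deleted {event['id']}"


def on_unknown(event: dict) -> str:
    return f"ignored {event['kind']}"


def handle_event(event: dict) -> str:
    # All the case logic lives here; no branch does any real work itself.
    kind = event["kind"]
    if kind == "created":
        return on_created(event)
    if kind == "deleted":
        return on_deleted(event)
    return on_unknown(event)
```

The branching is visible in one short function, and each handler can be read and tested without knowing about the others.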

But aside from Everything Functions, a lot of code bases have more of a problem with the Everything Class that just contains way too much state and still doesn't neatly refactor the work away (and where there are passing attempts to extract a few lines, they often end up in a "method" that doesn't actually touch `self`).

> I mean that one of the "everything functions" I tore down had two distinct calling patterns that were distinct functions that not only shouldn't have been festooned with so many options, but never should have been one function at all because they weren't even conceptually the same thing or even particularly related.

Yeah, sounds like you work in unusually unpleasant circumstances.

But I don't really see how a lack of type expression leads to this kind of thing. The default assumption for the type of a parameter, in untyped Python, should be: "an object that supports the operations currently used with it in the existing code". Going beyond that is like adding additional methods to a class that hasn't actually been written, and needs to be well considered.
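
For anyone who does use annotations, one way to write that default assumption down is typing.Protocol (Python 3.8+). This is only a sketch: first_line, Readable and FakeSource are invented names, and the protocol lists exactly the one operation the function body uses today.

```python
import io
from typing import Protocol


class Readable(Protocol):
    # The only operation first_line() currently uses on its argument.
    def read(self) -> str: ...


def first_line(source: Readable) -> str:
    text = source.read()
    return text.splitlines()[0] if text else ""


class FakeSource:
    # Unrelated to io.StringIO by inheritance; it just supports read().
    def read(self) -> str:
        return "alpha\nbeta"
```

Both io.StringIO("x\ny") and FakeSource() are accepted here. Accepting anything beyond that would mean adding a method to the protocol, which makes the "needs to be well considered" step an explicit edit rather than a silent one.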

A lot of people on HN don't seem to like dynamic typing. I think it's more that it's not for them, and that's fine. There will always be people with different mental models.

I can't, as my employer owns the code, not me, but there are several examples in one of the Ruby codebases I unfortunately maintain where I can see this degeneration happen via the git history. A small 8-line method with just two parameters slowly grows in complexity over time, until one day one of the original parameters supports two different shapes. Later on it's not that easy to understand which shape it should have in the specific conditional branch you're trying to fix, and the last person to touch that code left the company 4 years ago.

The fault, of course, ultimately lies with the people who wrote and approved this nonsense, but types, or at least type hints, help to avoid this issue.

Can you point to it? That doesn't sound like "the language forced all this extra baggage on it due to 'safety'" so much as the developers kept adding functionality to the function without rethinking if and how they should.

My point was not about the safety of the code, it was about the expressiveness, which is also what the comment I replied to was about. If the parameter has an explicit type (instead of no type, as is normal in Ruby, or `void*`, which is the C equivalent), it forces the developer to consider the design of the function, instead of taking the path of least resistance because they're inexperienced/incompetent/a large language model/burnt-out to the point where even the thought of opening the file makes them feel the not-anxiety of burnout/<insert reason here>.

> instead of no type, as is normal in Ruby, or `void*`, which is the C equivalent

“void *” is not the equivalent of “no type” from Ruby. “void *” says “I operate on raw memory”. It says exactly the same thing as “byte *”.

For sure you should generally not write a function that accepts a “void *” and then internally casts it to some concrete pointer type and operates on that type, but the problem there is the internal behavior, not the choice of byte vs void pointer.

Forcing developers to consider more and more is harmful, though. You're arguing to put all of the forethought upfront, when you have the least context and least understanding of what can go wrong, and carrying that complexity forward rather than starting simple and refactoring over time.

You’re really citing a mess in a Ruby code base caused by lack of typing as evidence for why void * is problematic in C/C++?

These are so wildly different cases that the comparison isn’t meaningful. This is like saying you should wear a helmet while playing tennis because sometimes helmets save bicyclists’ lives.

> You’re really citing a mess in a Ruby code base caused by lack of typing as evidence for why void * is problematic in C/C++?

If you read GP's post you'll understand it exemplifies exactly the issue that the likes of (void *) present in C.

I mean, read the message, particularly this:

> later on it's not that easy to understand which shape it should have in the specific conditional branch you're trying to fix

That is exactly the purpose of void *. By design. It's a pointer to an unspecified type. The unspecified type is exactly why this thing is used.

> later on it's not that easy to understand which shape it should have in the specific conditional branch you're trying to fix

This is not idiomatic C. I have no doubt that someone (likely many someones) has written a function that takes a void * and then internally does some insane half-baked dynamic typing. But I’ve never seen it and it’s not common.

You also cannot fix this behavior by changing the pointer type. The type of the pointer is essentially meaningless in this case.

> That is exactly the purpose of void *. By design. It's a pointer to an unspecified type. The unspecified type is exactly why this thing is used.

This is also the purpose of byte * in the examples. Coercing an arbitrary pointer from void to byte doesn’t accomplish anything. It’s lipstick on a pig at best.