Hacker News new | ask | show | jobs
by mattsears 100 days ago
What is it about large untyped codebases that make it a nightmare?
3 comments

If you make a change to the return types of a function for example you have to manually find all of the different references to that function and fix the code to handle the change. Since there are no compile time errors it's hard to know that you got everything and haven't just caused a bug.
Yes, and the downsides cascade. Because making any change is inherently risky you're kind of forced not to make changes, and instead pile on. So technical debt just grows, and the code becomes harder and harder to reason about. I have this same problem in PHP although it's mostly solved in PHP 8. But touching legacy code is incredibly involved.
Especially with duck-typing, you might also assume that a function that previously returned true-false will work if it now returns a String or nil. Semantically they’re similar, but String conveys more information (did something, here’s details vs did(n’t) do something).

But if someone is actually relying on literal true/false instead of truthiness, you now have a bug.

I say this as a Ruby evangelist and apologist, who deeply loves the language and who’s used it professionally and still uses it for virtually all of my personal projects.

The best perspective I've seen is that statically typed enforcement is basically a unit test done at compile time.
Alan Kay's argument against static typing was it was too limited and didn't capture the domain logic of the sort of types you actually use at a higher level. So you leave it up to the objects to figure out how to handle messages. Given Ruby is a kind of spiritual ancestor of Smalltalk.
the problem is that nobody listened to Alan Kay and writes dynamic code the way they'd write static code but without the types.

I always liked Rich Hickey's point, that you should program on the inside the way you program on the outside. Over the wire you don't rely on types and make sure the entire internet is in type check harmony, it's on you to verify what you get, and that was what Alan Kay thought objects should do.

That's why I always find these complaints a bit puzzling. Yes in a dynamic language like Ruby, Python, Clojure, Smalltalk you can't impose global meaning, but you're not supposed to. If you have to edit countless of existing code just because some sender changed that's an indication you've ignored the principle of letting the recipient interpret the message. It shouldn't matter what someone else puts in a map, only what you take out of it, same way you don't care if the contents of the post truck change as long as your package is in it.

That's a terrible solution because then you need a bunch of extra parsing and validation code in every recipient object. This becomes impractical once the code base grows to a certain size and ultimately defeats any possible benefit that might have initially been gained with dynamic typing.
>then you need a bunch of extra parsing and validation code in every recipient object.

that's not a big deal, when we exchange generic information across networks we parse information all the time, in most use cases that's not an expensive operation. The gain is that this results in proper encapsulation, because the flipside of imposing meaning globally is that your entire codebase is one entangled ball, and as you scale a complex system, that tends to cost you more and more.

In the case of the OP where a program "breaks" and has to be recompiled every time some signature propagates through the entire system that is significant cost. Again if you think of a large scale computer network as an analog to a program, what costs more, parsing an input or rebooting and editing the entire system every time we add a field somewhere to a data structure, most consumers of that data don't care about?

this is how we got micro-services, which are nothing else but ways to introduce late binding and dynamism into static environments.

Is that a common issue? I guess I'm having a hard time imagining a scenario that would (a) come up often and (b) be a pain to fix.
Anything can return anything and you only realize it at runtime is a massive headache. When you can't keep the entire code base in your head it becomes a liability.

I never used Ruby, but Python code bases love mixing in strings that are actually enums and overloading functions that accept all kinds of types. You just had to hope that the documentation was correct to avoid a crash.

Java 1.7 to Python feels very freeing from all the boilerplate. Kotlin, or any other modern language with a well designed standard library, to Python just feels like a bunch of extra mental work and test to write to just avoid brackets.

When CI/CD becomes your compiler you are in a tough spot
It makes coming up to speed on an existing codebase a slog because you have to trace through everything back to its source. Oh, and because there are magic methods and properties galore, your normal introspection tools in e.g. RubyMine get frequently stymied.