Hacker News
by mattbillenstein 2514 days ago
Large Perl codebases are terrible - let it die.
3 comments

But the small perl codebases I have are great.

I have a cron job that runs every five minutes: 30 lines that use one CPAN module, no bugs in years. The code has the kind of hackiness that's okay when it fits on one screen, and it fits on one screen precisely because of that hackiness.

It's very hard to know which codebases will grow to be large - every codebase starts out small, and by the time you realise it's going to be big, rewriting into another language is expensive. Better to use a language that can scale to small codebases and large, that can accommodate hackiness and cleanness alike.
Nicely put, but I notice that you avoid names and numbers.

How expensive can it be to rewrite a hundred or a thousand lines, if a perl script keeps growing? And what are these languages (you use plural so I do too) that are optimal for every case?

> How expensive can it be to rewrite a hundred or a thousand lines, if a perl script keeps growing?

A hundred lines, or a thousand lines, is not so bad. But where do you draw your line in the sand that you're going to stick to? If you don't insist right at the start on coding to your maintainability standards, why would you make that change when your script went from 30 lines to 100, or 100 to 500 (no doubt always driven by an urgent functional need)? It's always easier to add one more change to the existing codebase; the only point where you'd rewrite is when the codebase becomes literally unmaintainable because no-one understands it, and at that point migrating it is extremely expensive.

> And what are these languages (you use plural so I do too) that are optimal for every case?

I'm not saying optimal (I doubt that's possible); I'm saying pick a language that's adequate for every case. I'm partial to Scala, which was explicitly designed as a "scalable language" that would work for large and small codebases; other ML-family languages (e.g. OCaml) are similar. In my book any language with the ML feature set is more than adequate for large codebases, and any language that has a REPL/interpreter and doesn't require Java-like explicit types is adequate for scripting tasks. (Side note: I've always found it odd that Perl didn't have a first-class REPL, since IME that's one of the biggest natural advantages of scripting languages.) If you want to start from the scripting end, Python, Ruby or even Tcl offer a level of consistency and least-surprise that lets them scale to larger codebases than Perl, though I wouldn't really want to use any of them on a truly large codebase.

Next question: Code written in which of these languages tends to avoid needing a major redesign when a thirty-line script unexpectedly grows to much, much more?
I've seen plenty of Scala codebases grow without ever needing an overall redesign, and don't see why it would be different for any other language. Of course the end state is very different from the start state, and no doubt pieces end up getting rewritten many times over. But being able to incrementally evolve the codebase while keeping it all working is much easier and cheaper than having to rewrite in one go. In my experience you only need a "redesign" when you did too much design up front in the first place; if you always keep code changes driven by concrete use cases and avoid premature generalisation, you end up with a codebase that won't impede future changes any more than it has to.
Perl more than any other language I know depends on the self-restraint of the programmer to write readable code and not be overly clever with their syntax. I’ve worked on some large Perl codebases in the past that were as readable and maintainable as, say, Python.
Yeah, I've seen Perl programs so broken down into simple procedural steps that non-coders were able to go in, understand what the code was doing and even make some of the changes they needed after a little trial and error, but also other programs so concise and/or confusing that I'd rather rewrite them than try to figure out how they work.
That is not normal.
All the "big" Perl codebases I've worked with were as readable as I'd have expected any "big" codebase to be, if not better. What was your experience like?
25k lines of spaghetti garbage - all in one file.
Sorry you had to deal with that. Do you think it would have been easier to deal with 25k lines of spaghetti garbage in a file named .py?
I think it actually probably would have been - not an ideal way to write code, but Python is certainly easier to read than your average Perl.
I agree that they are terrible, but not all programming is large codebases. I think Perl is good for its originally intended use as a high-level scripting language.