by forgotmypw17 1380 days ago
With function signatures and state variables added in 5.010, I consider Perl feature-complete and have not really missed anything from it for as long as I've been writing Perl.

What I do appreciate that's missing from many other languages and systems is the extreme commitment to backwards compatibility. The knowledge that the next minor release won't break existing scripts is underrated, IMO.

Solo project with ~23K LOC of Perl and counting here. Bless you, Larry Wall and Perl maintainers. Keep it up!

11 comments

>Bless you, Larry

I see what you did there.

Edit: for those downvoting me because they don't understand my comment, it refers to the fact that bless is the built-in Perl function that associates a reference with a class, turning it into an object.
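
For anyone who hasn't seen it, here is a minimal sketch of what bless does. The `Counter` class and its `increment` method are made up purely for illustration:

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    my $self = { count => 0 };     # a plain hash reference
    return bless $self, $class;    # from here on it is a Counter object
}

sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $counter = Counter->new;
$counter->increment for 1 .. 3;
print ref($counter), " ", $counter->{count}, "\n";    # prints "Counter 3"
```

Without the bless call, `$self` would stay an ordinary hash reference and `$counter->increment` would fail at method lookup.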

If you're downvoting me because you think it's a stupid and off topic comment, that is perfectly valid and acceptable to me.

Thanks for thinking me more clever than I may be.

It was a simple blessing, and perhaps a reference to this interview with Larry.

To me, the language creator's faith is one more reason to use it.

https://slashdot.org/story/28204

My suspicion (not in a negative way as I'm Christian as well) is that the function was named so because of Larry's faith.
One of my criteria for good tools is that they scale well from the smallest possible use case to absolutely massive. Git meets this criterion for example.

Anyway, I manage a 250 kLOC code base written over 20 years, which is in surprisingly good shape considering its age and how many people have touched it. The last time we upgraded Perl, for the first time in a decade and across the addition of many features and major internal changes (e.g. Unicode, optimisation), the total number of lines of code we needed to change was at most 50. And there was very little fiddling with the underlying CPAN libraries.

Back to the point. Throwaway script - perfect candidate. Code capable of running the money pump for a billion dollar company. Also just as fine as any other similarly capable environment, better than some, trickier to manage the team than others.

"Perl makes easy things easy and hard things possible."
Getting strings to have the right encodings should be easy. On the last Perl codebase I touched it's proven impossible for all practical intents and purposes.
It's markedly easier than with Python, though. Here's a short script that will recode a file with mixed iso-8859-1 and utf8 data into proper utf8:

    #!/usr/bin/perl
    use strict;
    use warnings;
    
    use Encode qw( decode FB_QUIET );
    
    binmode STDIN, ':bytes';
    binmode STDOUT, ':encoding(UTF-8)';
    
    my $out;
    
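    while ( <> ) {
        $out = '';
        # With FB_QUIET, decode() returns what it could decode and leaves
        # the unprocessed remainder in its source argument ($_ here).
        while ( length ) {
            $out .= decode( "utf-8", $_, FB_QUIET );
            # Stopped on a byte that is not valid UTF-8: consume that one
            # byte as iso-8859-1, then resume UTF-8 decoding.
            $out .= decode( "iso-8859-1", substr( $_, 0, 1 ), FB_QUIET ) if length;
        }
        print $out;
    }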
Thanks for posting the happily ignorant code snippet that I have been waiting for.

The problem is that Perl internally encodes strings as sequences of numbers. Not even sequences of bytes, but sequences of numbers that could either be codepoints or bytes resulting from the encoding of such a sequence of codepoints. ...as a developer you are perfectly free to make this assumption any way you please at any given point in your codebase. It's not even clear that any one of those two is particularly "preferred" at large or a best practice or anything like that.

To make things worse, there is no way to know which is which, i.e. a string itself is happily ignorant about the assumptions that people will/should make about it. And Perl will happily concatenate strings making different kinds of assumptions, or double- or triple-encode them as you please, or decode something that hasn't been encoded in the first place.

This leads to jumbles of numbers that aren't anything in particular. They simply work well enough for sloppy programmers to not realize when they are making mistakes, but badly enough to almost guarantee that encoding errors will crop up on users' screens regularly.
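
A minimal sketch of the kind of silent double-encoding described above, using only the core Encode module. The variable names are mine, and the sample word is just an example:

```perl
use strict;
use warnings;
use Encode qw( encode decode );

my $chars = "h\x{F4}pital";               # 7 code points ("h", o-circumflex, "pital")
my $bytes = encode( 'UTF-8', $chars );    # 8 bytes: the UTF-8 encoding of that text
my $twice = encode( 'UTF-8', $bytes );    # 10 bytes: encoded again, nothing complains

printf "%d %d %d\n", length $chars, length $bytes, length $twice;    # prints "7 8 10"

# Characters and bytes concatenate without any warning, and the result
# is neither a clean character string nor a clean byte string.
my $mixed = $chars . $bytes;
printf "%d\n", length $mixed;             # prints "15"
```

Nothing in `$bytes` records that it has already been encoded, so the second `encode` call treats each byte as a code point and encodes it again.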

Now, given that this is how the language works, be my guest jumping into a 100k loc Perl codebase that dozens of programmers have touched over a decade, passing around and munging together strings not just within their own codebase, but also using strings stored to and retrieved from elsewhere, in some case places where no one knows anymore where they initially came from or where they will ultimately go to.

> Thanks for posting the happily ignorant code snippet that I have been waiting for.

Thank you for being so civil. IMO displaying a badly encoded string beats crashing on a runtime error most of the time. I'd rather see "hôpital" than "Error 500", if you will. Maybe don't think your personal assumptions carry any validity outside of your own choices, preferences, or uses.

I can imagine the difficulty of working with a huge codebase lacking refactoring and maybe even predating UTF-8, but where would you be if it had been written in Python 2.5 originally?

What we need from a programming language is to make medium complexity things, at worst, medium difficulty.

I don’t care about hard problems, and easy problems.

Erlang/OTP does medium difficulty things, i.e. very large applications with good fault tolerance and QoS, really well.

But it's a very different niche. Perl and Ruby scale to mid sized applications quite well, but above that fault tolerance and QoS become hard.

I think with languages like Perl, Ruby and Python you just need a static, compiled language to migrate to at a certain scale, preferably with similar features. Kotlin and Scala seem to be currently the best options for Ruby, Python and OO Perl. For procedural Perl maybe Golang.
Rust gets large amounts of inspiration from perl so don't forget that one.
Python has a number of similar, static, compiled languages that are embeddable in Python code (notably Cython and taichi).
> What I do appreciate that's missing from many other languages and systems is the extreme commitment to backwards compatibility. The knowledge that the next minor release won't break existing scripts is underrated, IMO.

I don't write much Perl these days and haven't for some time, but it's still what I might reach for if I were tasked with writing something suitable for a scripting language that had to run with ~0 operational or upgrade/maintenance budget for a decade or more in environments I wouldn't necessarily be able to control or influence, or else [bad thing will happen].

Perl needs backwards compatibility given the `write once, read never` nature of the syntax.

Imagine trying to upgrade this to some new syntax...

https://github.com/schwern/AAAAAAA

The fact that it's possible to craft deliberately obfuscated code in a language doesn't actually tell you much about the language.
My guess is that for many people it's hard to write non-obfuscated Perl code. It has so many operators and ways of doing things that walking into someone else's code may feel like having to learn everything from scratch.

(Higher Order Perl and such notwithstanding.)

In all honesty I have read more unreadable code in Python than in Perl. And I have written nearly equal amounts of both over my career.

You can write bad code in any language. Bad variable names (sometimes single-letter names), functions running pages long, duplicate code, algorithmically inefficient code, no error handling, master try/catch statements, OO abuse, functions with unpredictable side effects, etc.

Early internet saw a flood of newbie programmers, and therefore a flood of badly written code too.

For that matter you also see badly written C/C++ code from those days. How do you think C++ got its reputation for being too bloated beyond practical use?

That looks more like write-only naming... but the syntax itself isn't bad at all.

Here's the main module: https://github.com/schwern/AAAAAAA/blob/aaaaaa/aaa/AAAAAAAAA...

gross. Look at all of those non-'A's
I’ve had very limited contact with Perl, but for scripting purposes it does seem like the best option, so I intend to sit down to it and learn it better. It looks like a great next step after sed and awk (perl -pe). I love the ability to write terse scripts, reasonable speed for a scripting language and, as you said, backwards-compatibility (such a stark contrast compared to Python).
The Modern Perl book is great, it explains a lot of the Perl-isms to a modern audience.

http://modernperlbooks.com/ is where you can read it for free.

I wrote ebooks on CLI one-liners featuring grep/sed/awk/perl/ruby/coreutils/etc. These are free to read online: https://github.com/learnbyexample/scripting_course#ebooks

Plenty of examples and exercises.

The "Learning Perl" book from O'Reilly is the best book to do this (and imho is one of the best written programming books).
Learning Perl is a classic of the form---the best intro programming book that I've ever seen, at least for smart readers who've done some kind of programming (any kind, with any language) in the past.

The contrast to Learning Python is noteworthy. The latter book is useful, too, but it's about ten times bigger and much less focused on introducing a language.

I find Mark Lutz's writing style extremely tedious, having forced myself to wade through "Learning Python".
Learning Perl is a great book, but anyone new should make sure to get a copy of the 7th edition. It originally came out in '93. The 6th edition is 12 years old now.
Yup, I'd almost recommend it to a non-programmer.
I would argue that Ruby has almost all the strengths and conciseness of Perl, with more coherency, if you want a Perl-like fluidity and terseness. I personally like Python for anything that becomes more than 100 lines of bash.
Due to poor coding practices (eg monkeypatching) and a weaker testing culture (by default Ruby does not run unit tests when installing libraries) I've found Ruby to be substantially less reliable than Perl.

However the world has moved on to Python. So I curse every time I again have to look up how subprocess works for what I'd do in Perl with backticks.

The documentation for that says, "Specifically, Windows is not supported."

My use case was for scripting git operations. And the list of target environments included Windows.

So no, that wouldn't have worked for me.

To be fair to Python, subprocess is not simply a substitute for backticks in Perl or Ruby. It's supposed to protect you from some of the more obscure problems which can happen with shell expansion.
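
For comparison, a rough sketch of the same trade-off on the Perl side: backticks go through the shell, while the list form of a pipe open runs the program directly. It assumes a Unix-like system with an `echo` executable on the PATH, and the argument string is made up for the example:

```perl
use strict;
use warnings;

# Backticks hand the whole string to the shell, which expands metacharacters.
my $via_shell = `echo hello`;
chomp $via_shell;

# List-form pipe open passes each argument as-is, with no shell in between.
my $arg = 'hello; echo injected';    # a shell would treat this as two commands
open( my $fh, '-|', 'echo', $arg ) or die "cannot run echo: $!";
my $direct = <$fh>;
close $fh;
chomp $direct;

print "$via_shell\n$direct\n";
```

In the second call the semicolon reaches `echo` as literal text, so nothing after it is executed as a command.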
Check out plumbum: https://plumbum.readthedocs.io/en/latest/

It supports mac, Linux and windows.

TIL (as a light Perl user, mostly as an alternative to awk) that CPAN runs tests on install. Is there any other language package manager that runs tests by default?
that is super dangerous, just like some other dangerous parts of perl where it can run code during the compilation phase
A lot of languages allow the running of arbitrary code on install, so it is not particularly dangerous that Perl allowed it.

However you have no idea how many bugs got caught because the test run uncovered platform specific bugs. This is exactly what gave Perl a good name for being portable. Doubly so given that Perl always did this, starting back in the 1980s.

Does Ruby come pre-installed on virtually every Unix-like system out there?
No, and if it is, who knows what version it is.

I think that's the main thing preventing one from using ruby like this. It is otherwise preferable in pretty much every way.

Perl is kind of pre-installed on virtually every Unix-like system for what are at this point historical/legacy reasons. It is unlikely any other language can ever achieve this at this point.

Indeed. As much as I would like to use Ruby or Raku for this type of stuff, I keep coming back to Perl because it's simply... there.

Sure, I could probably install Ruby on any machine I want it, but it's not just technical availability. Socially, Perl serves as a quite obvious Schelling point. I don't have to convince four other people to learn Ruby, because Perl is what everyone would gravitate to even in isolation, again because it's just... there.

(That said in recent years I've had to switch to Python for some things aimed at a younger audience. Oh well.)

AFAIR, Red Hat stopped including Perl by default since RHEL8 (
RHEL 9.0 provides the following dynamic programming languages:

    Node.js 16
    Perl 5.32
    PHP 8.0
    Python 3.9
    Ruby 3.0
Just a data point — Ruby, Perl and Python have all been deprecated [0] in macOS.

[0] https://developer.apple.com/documentation/macos-release-note...

If it's not installed already, Ruby will be installed ASAP on any machine I use.
You're confusing perl with ruby there.
For that, there is Next Generation Shell. Also works like a charm even before 100 lines. For example when structured data is needed. Some other advantages over bash are error handling, automatic command line arguments parsing (similar to Raku, btw), standard library with functions like warn(), log(), debug(), retry(), etc that you have likely implemented hundreds of times in your bash scripts.

Disclosure: I'm the author.

What is your project? Raku has some cool features and interesting syntax, I took a look at it before the rename, but find myself reaching for Python for basically everything.
Raku is a really nice language. I especially like the features that improve safety of the code: gradual typing, subsets, PRE and POST conditions, and defined/undefined checks. Junctions and multimethods are also nice. I have an article - not finished, and probably won't ever finish it, unfortunately - showcasing these features: https://klibert.pl/statics/raku/writeup.html

I think Raku deserves way more attention. Its implementation is suboptimal in many ways and it's improving slowly due to a very small core team. With more exposure, more people would come, and hopefully some of them would consider contributing, accelerating the pace of development.

Link is in my profile.
While it's not my favorite language, I have to admit that function signatures go a long way towards making Perl feel somewhat normal to work with. I have to imagine they are pretty big boons for IDEs.
Needs static typing.

I'll let myself out now.

When was the last time this caused a bug for you? My experience of moving from JavaScript to TypeScript is that it takes significantly longer to write many generic things because the types can’t really express the intended use as well. I will certainly admit that types help a lot at the boundaries between systems, or for catching errors introduced when changing code, but it’s not always a clear win for types, given how much more verbose and less expressive the language becomes as a result.
We actually have science on specifically plain ECMAScript versus TypeScript, where the software written in the latter has 15 % fewer bugs. https://blog.acolyer.org/2017/09/19/to-type-or-not-to-type-q...

But of course, the study does not account for whether increased development time cancels out the revenue gained from a lower bug count. (And this will be a difficult problem in general, due to first-to-market effects etc.)

This is my experience as well. I have a side project in its prototyping state and tried to use typescript in it. The result was exactly what I was afraid of - the first half of a weekend spent on fine-tuning tsconfig and tsserver integration, the second half on type acrobatics and investigating wrong narrowing issues. No code was written that day. I have a decent experience with both typed and untyped languages to see pros and cons of typing and am consciously choosing untyped for the “development” phase.
I agree with the first part (I generally hate the devops situation with JS, TS, nodejs, modules, etc). But I don't understand the part about type acrobatics. TypeScript's typing is robust to the point of being Turing complete, so you can express things that are generally not expressible in your typical typed language. And if TypeScript cannot pin down your types, it is probably a code smell. But you can revert to "any" at any time anyway, if you feel compelled to do some unidiomatic js trickery.
I’ve run into situations where a pretty obvious “if (typeof + condition)” chain over-narrowed a type so that further else-ifs were inferred as “never”. The logic was sound and triple-checked, I only struggled with convincing tsc that it’s okay. Sure, it was my fault somewhere in my types and not in TypeScript, but progress doesn’t care which part of a development setup fails or eats too much time in research.

> unidiomatic js trickery

  type Foo = undefined | false | Unit | Foo[] | FooObject
This was a part of the issue, afaiu. FooObject being “partial” either fell through obvious typeof guards, or removed other essential types from a branch, depending on what I tried. I ~understand why the issue persisted, but had no clear way to tell tsc what I meant there. The prospect of meeting a similar issue in a much more complex case feels unpleasant.

While types make intents formal (which is a pro), they require you to specify irrelevant edge cases. Dynamic typing serves as “code is law”, and when you meet an edge case in the wild, it’s much easier to explain it than to formalize it.

I also remember many cases of prototyping in other typed languages and it never felt “focused on the job” to me there either, even when (or despite?) a type system wasn’t turing-complete.

Also, my best hope for TypeScript was that it would allow me to create type-only “header files”, which would serve as a source of truth and a sort of auto-validated documentation. It turned out that forward declarations are not first-class citizens in TS, and it was Sunday evening already, so I gave up.

It's got static types. A scalar is a scalar is a scalar and never an array. Of course, scalar is a very broad type.
I make up for this by writing short subprocedures and many sanity checks.
Backwards compatibility for languages, libraries, environments, compilers, etc. is super important. Without it no language or library will be successful long term, because people will simply switch to more modern offerings instead, or not upgrade to newer versions. Making things “deprecated” is really bad. Just don’t. An API is forever. So act accordingly.
Signatures were added in 5.020, and [edited:] were considered experimental until 5.036.
> and are still considered experimental

No, 5.36 stabilised them.

https://perldoc.perl.org/perldelta#Core-Enhancements

> The 5.36 bundle enables the signatures feature. Introduced in Perl version 5.20.0, and modified several times since, the subroutine signatures feature is now no longer considered experimental. It is now considered a stable language feature and no longer prints a warning.

Oh, that's a bummer. I was kind of excited to use those, but since I never know when I'm going to get a call from the CentOS 7 guys I have to stay away from features newer than 5.016. Basically just new enough to get semi-sane UTF support.
Ah! Looks like I need to upgrade, then -- I was reading from "perldoc feature" in my install, which was still on 5.34.
Thanks for the correction. I don't use them anymore (instead doing a few lines of sanity-check boilerplate in each sub) so that's why I made that mistake.
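
A rough sketch of what a few lines of per-sub sanity-check boilerplate might look like. This is a guess at the general pattern, not the commenter's actual code, and the `scale` function is hypothetical:

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical example function with hand-written argument checks.
sub scale {
    @_ == 2 or croak 'scale() expects exactly 2 arguments, got ' . scalar @_;
    my ( $value, $factor ) = @_;
    defined($value) && defined($factor)
        or croak 'scale() arguments must be defined';
    return $value * $factor;
}

print scale( 2, 5 ), "\n";    # prints 10

# A call with the wrong number of arguments now dies instead of
# silently computing with undef.
my $ok = eval { scale(2); 1 };
print "rejected: $@" unless $ok;
```

This runs on any Perl 5 that ships Carp, which is the appeal when the target might be an old system Perl.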
Are Perl function signatures the equivalent of PHP/Python type hints?
This may shock you, but: historically, Perl functions did not have declared arguments. Like, at all. The function just got arguments implicitly passed to it as an array (@_), and parsing that into individual variables was up to the function -- so most functions would start like this:

    sub myFunction {
        my ($self, $arg1, $arg2) = @_;
        ...
    }
Or, in older code, you might even see this:

    sub myFunction {
        my $self = shift; # implicitly get the first value from @_
        my $arg1 = shift;
        my $arg2 = shift;
        ...
    }
Notably, this didn't even check the number of arguments you passed. If you passed too many or too few arguments to a function which worked this way, the extra arguments would silently be ignored, or missing arguments would silently show up as undef.

Function signatures turn this into something that'll look much more familiar to users of other languages:

    sub myFunction ($self, $arg1, $arg2) {
        ...
    }
which includes checks to make sure that all the expected arguments (and no more) were passed.
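
A small runnable sketch of that argument-count check, assuming Perl 5.20 or newer. The `area` function is made up for the example, and the `no warnings` line only matters on versions before 5.36, where signatures still warned as experimental:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';    # only needed before 5.36

# Hypothetical function declared with a signature.
sub area ( $width, $height ) {
    return $width * $height;
}

print area( 3, 4 ), "\n";    # prints 12

# Too few (or too many) arguments is a runtime error rather than a
# silent undef; the exact message text varies between Perl versions.
my $ok = eval { area(3); 1 };
print "error: $@" unless $ok;
```

On 5.36 and later, `use v5.36;` enables signatures along with strict and warnings in one line.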
> If you passed too many or too few arguments to a function which worked this way, the extra arguments would silently be ignored, or missing arguments would silently show up as undef.

This part probably won't shock anyone given the prevalence of JS today. Which is kinda sad, given that it's 35 years since Perl was first a thing.

Lol there were places, perhaps times, where Perl subroutines would "shift" as they go along, to create a prose effect. Sometimes you'd get a comment about it, but mostly not. Maddening for any longer function.
Anyone remember smart match? Oh dear - a real low point for Perl.
I used to think you had to be a hashref to be blessed but The Damian put me straight on that.
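
To illustrate the point, a short sketch with objects built on an array reference and a scalar reference. The `Point` and `Flag` classes are invented for this example:

```perl
use strict;
use warnings;

package Point;    # hypothetical class backed by an array reference
sub new   { my ( $class, $x, $y ) = @_; return bless [ $x, $y ], $class }
sub get_x { return $_[0][0] }
sub get_y { return $_[0][1] }

package Flag;     # hypothetical class backed by a scalar reference
sub new    { my ($class) = @_; my $state = 0; return bless \$state, $class }
sub set    { ${ $_[0] } = 1; return }
sub is_set { return ${ $_[0] } }

package main;
use Scalar::Util qw( reftype );

my $point = Point->new( 3, 4 );
my $flag  = Flag->new;
$flag->set;

printf "%s is a blessed %s\n", ref($point), reftype($point);    # Point is a blessed ARRAY
printf "%s is a blessed %s\n", ref($flag),  reftype($flag);     # Flag is a blessed SCALAR
```

Any reference can be blessed; hash references are just the most common choice because named fields are convenient.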
> What I do appreciate that's missing from many other languages and systems is the extreme commitment to backwards compatibility.

Is that even something they're doing on purpose? Didn't they completely botch their attempt at a new major version?

When Python moved from Python 2 to Python 3, introducing breaking changes, there was enough velocity and acceleration in the Python ecosystem for people to be willing to take the pain.

With Perl it's inertia. So even if the language designers and package maintainers wanted to make breaking changes, they couldn't, because people would just stop updating. All that's left to do is to drag out the process of this ecosystem rotting away.

It doesn't really matter to me whether it is on purpose or not.

All I know is that I can depend on my Perl scripts running a year from now, and I cannot do that with Python.

It feels alive and well to me in that sense, while the stability and dependability of Python does seem to rot pretty quickly.

> When Python moved from Python 2 to Python 3, introducing breaking changes, there was enough velocity and acceleration in the Python ecosystem for people to be willing to take the pain.

Paint me unconvinced. Python2 is still a thing, 14 years later.