Hacker News new | ask | show | jobs
Perl: When DWIM Doesn't (bits.shutterstock.com)
27 points by douglashunter 5255 days ago
6 comments

re: open() creating temporary file if file arg is undef

Fortunately both IO::File (core module) & Path::Class don't do this:

  use IO::File;
  my $fh = IO::File->new( $config->{file_path}, 'r' ) 
             or die "can't open $config->{file_path}: $!";
  
  
  use Path::Class qw<file>;
  my $fh = file( $config->{file_path} )->openr;
Both above spot the undef gotcha. NB. And ->openr() throws an exception.

ref: https://metacpan.org/module/IO::File | https://metacpan.org/module/Path::Class

> Both above spot the undef gotcha.

The builtin open() even tries to spot when it's not intentional. The documentation puts "undef" in quotations in the section where it introduces the feature (suggesting it has to be the literal), and the implementation goes to some effort to agree with it. For example, the following code dies:

  my $config = { file_paht => "/opt/config" };
  my $path = $config->{file_path};

  open (my $fh, "<", $path)
    or  die "can't open $path: $!";
In the case in the article it's a bug in the implementation, combined with a typo, which introduces the little known feature.

Using Path::Class obfuscates the open, to my eyes. Using IO::File is a nice suggestion. Your lock_keys suggestion below will probably be our update to the code for future protection, though.

re: Path::Class - Each to their own :)

However you can take your original full example of this:

  open (my $fh, "<", $config->{file_path})
    or  die "can't open $config->{file_path}: $!";

  my $data;
  {
    local $/ = undef;
    $data = <$fh>;
  }
And turn it into just this with Path::Class

  use Path::Class 'file';
  
  my $data = file( $config->{file_path} )->slurp
  
When dealing with lots of files/directories especially across different platforms I find Path::Class a god send.

NB. One good habit I currently do have is using Path::Class more ubiquitously :)

!!! Supplementary for future readers of this comment thread !!!

The pragma autodie also spots this problem!

So....

    use strict;
    use warnings;
    use autodie;

    our $config;
    $config->{file_paht} = "somefile";

    open my $fh, '<', $config->{file_path};
Will throw an exception at open(). NB. Bizarrely the error is Use of uninitialized value $file in sprintf... but hey the problem is still caught :)
Maybe they should use classes and objects instead of hashes. I.e.

    $session->ab_variation_overrides();
This would prevent them from mistyping ab_variation_overrides everywhere they need it... But of course, it's a lot easier to solve everything with hashes in Perl. Been there, done that!
Another solution could be to lock the keys of $config after its been configured:

  use Hash::Util 'lock_keys';

  our $config;
  $config->{file_paht} = "/opt/app/data_file";
  lock_keys( %$config );
Now any attempt to use an undefined key will throw an error. So on the open() it would now throw error: Attempt to access disallowed key 'file_path' in a restricted hash at...
This is what Kris Arnold (https://twitter.com/wka), one of the other hackers here at Shutterstock recommended. I like it.
Me to, especially as it covers both problems you've given in the post.

Now I just need to get into the habit of using Hash::Util::lock_keys more often myself :)

While that can be helpful, that comes with its own set of problems. It's harder to say $session->ab_variation_overrides ||= {inital => "values"}. I think it's possible, but I've been warned off by the huge EXPERIMENTAL warning around the lvalue subroutine return documentation [1] and the fact that it's been labelled "experimental" for a long time now, so I don't dare put it in production code, or use it where I can't count on someone knowing everything in those docs. I've been in tight loops where the cost of a function call was noticeable vs. the hash lookup. You can't avoid the fact that lots of other libraries and existing code expects hashes in one form or another (the calling convention that essentially puts a hash in a list is just another variant of that).

You're not "wrong", but it doesn't solve very much of the problem in practice. Still, it can be useful where appropriate, and there are libraries that use such approaches for safety; here's one example off the top of my head [2].

This is where I really prefer Python over Perl; Perl nominally has bullet-point feature compatibility with Python in most respects, but where in Perl operator overloading, tying, and the various other ways of overloading things always come with caveat lists a mile long and often go together poorly, in Python they Just Work, and actually can be used together. In this case, properties. They work. I've even made them dance and sing before with nonstandard behaviors and they still work.

[1]: http://perldoc.perl.org/perlsub.html#Lvalue-subroutines

[2]: http://search.cpan.org/~mlehmann/JSON-XS-2.32/XS.pm

Well, that's embarrassing. I fat-fingered those while I was transcribing code into the article. "Surprise Two" was supposed to be about the autovivification surprise, not me mistyping things (article updated).

Even still, your point about using objects and encapsulation stands. If the session were being assigned to via setters, and read from with getters, the grep aliasing wouldn't do its damage.

Cheers.

I basically gave up on perl after discovering this:

  foo foreach @bar;

  sub foo {
    s/foo/bar/
    print;
  }
This clobbers the content of @bar due to mutating the $_ variable. Which is a useful feature. But imagine that foo, instead of directly mutating the value passed to it, calls into a complex set of other code. Now you have a bomb where working code can break in crazy ways if any of it gets changed to modify $_.

Having to audit code for this kind of thing is a real pain. I still maintain existing perl projects, but won't be starting any new ones.

I think it's a pretty weak reason to give up on a language because you were using a global variable and expecting it to automatically be treated as a local variable within your function.

As others have mentioned, you are using a global variable ($_) inside a function. This is generally considered a bad practice.

The standard perl idiom of getting your function arguments via shift would make this a non-issue.

This is a feature, not a bug. And this is not _requirement_, it is up to you to use it. You are always free to pass parameters the way it would not happen..
This is not much different than mutating a global variable inside a function.

I would write your code as:

    foo($_) foreach @bar;

    sub foo {
        my $var = shift;
        $var =~ s/foo/bar/;
        print $var;
    }
You can also fix this is to localize $_ inside the function, if you don't want to pass $_ as a parameter.

    foo foreach @bar;

    sub foo {
        my $var = $_;
        $var =~ s/foo/bar/;
        print $var;
    }
You can even use $_ instead of $var (although it looks a little weird assigning $_ to itself):

    foo foreach @bar;

    sub foo {
        my $_ = $_;
        s/foo/bar/;
        print;
    }
You're funny. If this really pushed you over the edge, you were born -on- the edge, and didn't give Perl a fair shake.
Joey's example is bad (or, at best, fragile) Perl, but he's also written a fair amount of useful and popular code in Perl too.
Yes thanks.. to be clear, I have programmed in perl since 1995.
Both OPs examples would be spotted by unit tests and finding actual bug after that is quite trivial.

So I agree, you can trigger some features unpredictably, but this just shows power behind tool and remind you to be well self-organized =)

Frankly, the first one is a clear explanation of why you should have a config object not a hash.

Don't use a freeform data structure for data that isn't actually freeform if you want the perl5 VM to be able to catch errors for you.

Shutterstock are legendary round the perl community for being a hive of badly written ancient code that never gets significant refactoring effort though; combine this with a significant tendency not to use libraries due to NIH syndrome and their experience of gotchas becomes more an inevitability than bad luck.

Or: You might love your programming language, but if your relationship with it tends towards the abusive, the problems end up cutting both ways.

(though it's only in the past half decade that the perl community as a whole has seriously moved to trying to make the sensible techniques standard - and any project started during the dot-com era is almost certainly a hive of badly written now-ancient code; there's similarly timed spikes of awful PHP and then rails code due to their popularity spikes since)

> Shutterstock are legendary round the perl community for being a hive of badly written ancient code that never gets significant refactoring effort though

I like to think that we're legendary for choosing a sane upgrade path to a services based approach built on Modern Perl. For our help debugging Dancer's (https://github.com/sukria/Dancer/issues/642) and Feersum's (https://github.com/stash/Feersum/issues/12) header parsing troubles. For our continuous sponsorship of YAPC (http://yapcna.org/sponsorship/our-sponsors).

Heh, we're legends in my own mind! [grin]

   use strict;
   use warnings;
I love Perl. I wrote it professionally for 10 years. I have great friends who still do. The community is full of smart, fantastic people.

Five years ago, however, I learned Python for a job and never looked back. In terms of programming languages they're practically the same thing. But Python is easier to read, easier to write, and easier to hire for. And stuff like this (dereferencing, "array context," worrying about how to create my objects) makes me sigh.

Perl, I love you, but I'm not looking back.

(apologies for the language flame war bait)

Assuming you're not a troll, you have bad timing who left just before Moose and dependable grammar extensions. I've been doing some Python lately and regularly scream when I lack usable anonymous funs (>1 instruction) or can't find a lib I know exist on CPAN, etc.

ymmv.

Not a troll, real human, but I guess I was unconsciously trolling. Oh well.