by saba2008 1582 days ago
Most other languages would require `eval`, with the incorrect code held in a syntactically correct string literal.

But because Perl is parsed as it is executed, incorrect code raises a syntax error only when it is reached by execution.


Right, the beauty of this is that it's a syntax error he's produced. Sure, any language can write code to produce a runtime error under any condition at all. And a lot of languages have features that allow them to run code at compile time, and similarly produce errors then too. But those languages don't allow you to ruin the syntax of the language. I guess it's something that requires macros. Would it be safe to say that any language with macros can do this? Or is there something even more interesting with Perl going on here?
Common Lisp is another language in which this should be possible. Common Lisp allows you to run arbitrary code at compile-time, and that code is allowed to modify the language syntax (*READTABLE*, SET-MACRO-CHARACTER, etc). So code could make itself syntactically invalid on Fridays by changing the language syntax depending on the day of the week.

This goes beyond mere Lisp macros, in that ordinary Lisp macro invocations still look like Lisp lists, while with this you can make arbitrary changes to the syntax. You could even make Common Lisp look like Pascal (if you really wanted to).

The designers of Scheme intentionally left this feature out (which was also found in some of Common Lisp’s ancestors, such as Maclisp), but some Scheme implementations/descendants included it anyway (as an extension), such as Racket, Guile and Chicken Scheme.

I can do something like this with macros in Pike:

The following code produces a syntax error if it is compiled less than 10 seconds after a full minute.

    #define T __TIME__  
    #if T[6]-48
      #define X 1
    #else
      #define X /
    #endif

    void main()
    {
      write("%d, %c, %O\n", X, T[6], T);
    }
Bash parses as it runs, so something as simple as this works:

  if [ $(date +%u) -ne 5 ]
  then
    exit
  fi
  (
Bash even reads the file as it goes, so if you run a "long-running" script (a sleep is enough), edit far enough down, and write the file again, the previously started bash will end up running the new content once it gets up to reading where the change happened.
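
A minimal sketch of the parse-as-you-go behaviour, assuming bash is on the PATH. The temporary file and the variable names are made up for illustration. The inner script exits before bash ever reads the unbalanced parenthesis, so no syntax error should be reported:

```shell
# Hypothetical demo: the last line of the inner script is a syntax error
# that bash never gets around to reading.
demo=$(mktemp)
cat > "$demo" <<'EOF'
echo "reached the top"
exit 0
(
EOF

out=$(bash "$demo" 2>&1)   # capture stdout and stderr together
status=$?                  # exit status of the inner script
rm -f "$demo"

echo "output: $out"
echo "exit status: $status"
```

Moving the `(` above the `exit 0` line makes the same run fail, which is the behaviour the date check relies on.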
You can exploit it to distinguish whether a script is being `curl | bash`'ed.

Add `sleep 1` and detect the pause on the server. If a pause is detected, serve the attack payload. If not, somebody is careful enough to download and audit, so serve just the harmless script.

You could check syntax of the whole file (even the unreachable parts) with the -n option.
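
A minimal sketch of that check, assuming bash is on the PATH (the temporary file and variable names are illustrative). A normal run never reaches the unbalanced parenthesis, while `bash -n` reads the whole file without executing it and should reject it:

```shell
# Hypothetical demo: compare a normal run with a syntax-only check (-n).
demo=$(mktemp)
cat > "$demo" <<'EOF'
exit 0
(
EOF

bash "$demo"
run_status=$?                  # normal run: exits before reading the "("

bash -n "$demo" 2>/dev/null
check_status=$?                # -n parses everything, so this is non-zero

rm -f "$demo"
echo "run: $run_status, check (-n): $check_status"
```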

But that's not bulletproof; consider this code (adapted from <https://hal.archives-ouvertes.fr/hal-01513750/document>):

  if [ $(date +%u) -eq 5 ]
  then
      alias maybe=''
  else
      alias maybe=:
  fi
  maybe for x in; do :; done
"sh -n" always reports a syntax error, even though the script's syntax is correct on Fridays.
It is formally proven that Perl can't be parsed!

https://www.perlmonks.org/index.pl?node_id=663393

It can be externally parsed with ambiguities. The article is written by someone who authored such a parsing toolkit about three years later. Despite what you may have learnt, an ambiguous parse tree is still a useful thing to have: we can build tools that take it into account, and most existing tools can be modified in a straightforward fashion to make use of the extra nodes.

The real Perl parser disambiguates with heuristics and run-time hints.

There exists an unambiguous subset of Perl syntax that is expressible with a BNF grammar, and as such is amenable to all parsers. http://p3rl.org/standard#DESCRIPTION

Yeah, I always found the linked post to not be all that serious about it anyway, just a funny, satisfying hacker thing.
It has been said that "the Perl compiler is based on yacc, lex, smoke, and mirrors."
>But because Perl is parsed as it is executed, incorrect code raises syntax error only when it is reached by execution.

Sorry, but you're dead wrong. Perl is not parsed as it is executed, which can be verified easily by writing a program with a syntax error at the end, and seeing that it doesn't run code at the top. Try it with the following program.

    print "Hello, world\n";
    This line raises a syntax error before the previous line tries to execute.
What is going on is that BEGIN blocks are special: they are executed as soon as they are parsed. With them we can interleave parsing and execution.

In this case we're assigning to a symbol. And then the parsing of the final line is dependent on whether or not that symbol has a prototype. See https://perldoc.perl.org/perlsub#Prototypes for what Perl prototypes are, and to see why they would affect parsing.

>incorrect code raises syntax error only when it is reached by execution.

That's not generally true for Perl. The BEGIN block is used to get in that state here. "Some incorrect code raises syntax error only when it is reached" is true.

It's generating this on Fridays:

  &f() / 1;
And this on other days:

  f(/1;#/+);
If you run the same code without wrapping the assignment to *f in a BEGIN block, it isn't incorrect code. It evaluates as:

  'f' / 1;
What happens if you run this code in a loop Thursday night just before Friday?

Does the parse that happens on Thursday take precedence or is it reparsed every single time through the loop?

The BEGIN block only runs once. That is:

  while (1) {
    BEGIN {print "hello\n"}
    sleep 1;
  }
Will only print "hello" one time.

You could loop inside the BEGIN block and then drop out of the loop at some point. If you dropped out on Friday, after the code assigning *f, it would run correctly. So:

  use Time::Piece;  # provides localtime->wdayname
  BEGIN {
    sleep 86400;
    *f = (localtime->wdayname eq 'Fri')
      ? sub() {}
      : sub {};
  }
  f/1;#/+
Would run correctly if you started it on Thursday.
Perl is not parsed as it is executed. It is compiled to opcodes then run. The BEGIN block, however, explicitly runs code at compile time. It is literally compiling different code on different days, then attempting to run it.
Technically not opcodes, but internal data structures. A serializer was written for that, to permit ".pmc files".
Oh, hey, Randal. Thanks for the correction.

Looking forward to the next time we can have steaks and Scotch.

Ruby has that latter quality in many scenarios. For example:

    p 'Friday!' if Time.now.wday==5 || h