by JohnDeHope 1590 days ago
Right, the beauty of this is that it's a syntax error he's produced. Sure, any language can run code that produces a runtime error under any condition at all. And a lot of languages have features that let them run code at compile time, and similarly produce errors then too. But those languages don't allow you to ruin the syntax of the language. I guess it's something that requires macros. Would it be safe to say that any language with macros can do this? Or is there something even more interesting going on with Perl here?

Common Lisp is another language in which this should be possible. Common Lisp allows you to run arbitrary code at compile-time, and that code is allowed to modify the language syntax (*READTABLE*, SET-MACRO-CHARACTER, etc). So code could make itself syntactically invalid on Fridays by changing the language syntax depending on the day of the week.

This goes beyond mere Lisp macros: ordinary Lisp macro invocations still look like Lisp lists, while with this you can make arbitrary changes to the syntax. You could even make Common Lisp look like Pascal (if you really wanted to).

This feature, which was also found in some of Common Lisp’s ancestors such as Maclisp, was intentionally left out by the designers of Scheme, but some Scheme implementations and descendants, such as Racket, Guile and Chicken Scheme, include it anyway as an extension.

I can do something like this with macros in Pike.

The following code produces a syntax error if it is compiled less than 10 seconds after a full minute:

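    // __TIME__ expands to a string like "12:34:56"; T[6] is the tens digit of the seconds.
    #define T __TIME__
    // '0' is 48, so this condition is false during the first 10 seconds of a minute.
    #if T[6]-48
      #define X 1
    #else
      // A bare "/" turns the write() call below into a syntax error.
      #define X /
    #endif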

    void main()
    {
      write("%d, %c, %O\n", X, T[6], T);
    }
Bash parses as it runs, so something as simple as this works:

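  # date +%u prints the weekday number (1 = Monday), so 5 is Friday.
  if [ $(date +%u) -ne 5 ]
  then
    exit
  fi
  # Only reached on Fridays: the unclosed "(" is a syntax error.
  (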
Bash even reads the file as it goes, so if you run a "long-running" script (a sleep is enough), edit far enough down, and write the file again, the previously started bash will end up running the new content once it gets up to reading where the change happened.
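A minimal sketch of that behaviour, assuming bash is on the PATH and the script is small enough to be read in one go. The temporary file and the messages are made up for illustration. The generated script appends a line to itself while it is running, and bash picks that line up when it goes back to the file for the next command:

```shell
#!/usr/bin/env bash
# Hypothetical demo: build a tiny script that appends a command to itself.
demo=$(mktemp)
cat > "$demo" <<'EOF'
echo "original line"
echo 'echo "appended while running"' >> "$0"
EOF

# Run it with a fresh bash; the appended line did not exist when bash started.
output=$(bash "$demo")
printf '%s\n' "$output"

rm -f "$demo"
```

If the appended message shows up in the output, the running bash has read content that was written after it started, which is the same effect as editing the file in another window.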
You can exploit this to distinguish whether a script is being `curl | bash`'ed.

Add `sleep 1` and look for the pause on the server. If a pause is detected, serve the attack payload. If not, somebody is careful enough to download and audit, so serve just the harmless script.

You could check syntax of the whole file (even the unreachable parts) with the -n option.
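A rough sketch of what -n buys you, assuming bash is available. The demo file is hypothetical, and an unconditional `exit 0` stands in for the weekday test so the result does not depend on the day it is run:

```shell
#!/usr/bin/env bash
# Hypothetical demo file: a valid command followed by an unclosed "(".
demo=$(mktemp)
printf '%s\n' 'exit 0' '(' > "$demo"

# Normal run: bash executes "exit 0" before it ever parses the "(" line.
bash "$demo"
run_status=$?

# Syntax check only: -n parses the whole file without executing anything.
if bash -n "$demo" 2>/dev/null; then
  check_status=0
else
  check_status=$?
fi

echo "run: $run_status, check: $check_status"
rm -f "$demo"
```

The normal run exits cleanly because the broken line is never reached, while the -n pass reports a non-zero status for the same file.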

But that's not bulletproof; consider this code (adapted from <https://hal.archives-ouvertes.fr/hal-01513750/document>):

  if [ $(date +%u) -eq 5 ]
  then
      alias maybe=''
  else
      alias maybe=:
  fi
  maybe for x in; do :; done
"sh -n" always reports a syntax error, even though the script syntax is correct on Fridays.
It is formally proven that Perl can't be parsed!

https://www.perlmonks.org/index.pl?node_id=663393

It can be parsed externally, with ambiguities. The article was written by someone who authored such a parsing toolkit about three years later. Despite what you may have learnt, an ambiguous parse tree is still a useful thing to have: we can build tools that take the ambiguity into account, and most existing tools can be modified in a straightforward fashion to make use of the extra nodes.

The real Perl parser disambiguates with heuristics and run-time hints.

There exists an unambiguous subset of Perl syntax that is expressible with a BNF grammar, and as such is amenable to all parsers. http://p3rl.org/standard#DESCRIPTION

Yeah, I always found the linked post to not be all that serious about it anyway, just a funny, satisfying hacker thing.
It has been said that "the Perl compiler is based on yacc, lex, smoke, and mirrors."