by Rangi42 4072 days ago
This is a great idea! It didn't behave the way I expected with some URLs as input, though:

    http://example1.org/path/index.html
    http://www.example2.org/path/index.html
    http://www.example3.org/
    https://www.example4.org/a/b/c/d/e/f/g/hijklmnop
The pattern I gave was:

    example1.org, path/index.html
So I expected to get:

    example1.org, path/index.html
    www.example2.org, path/index.html
    www.example3.org, 
    www.example4.org, a/b/c/d/e/f/g/hijklmnop
Instead, I got:

    example1.org, path/index.html
    wwwexample.2, org/path.index
    wwwexample.3, org/.
    wwwexample.4, org/a.b
A few feature requests: allow downloading the output as a text file; show a pseudo-code formula of how transformy interpreted the transformation, like "s/.+:\/\/(.+?)\/(.*)/\1, \2/"; and add support for common arbitrary transformations like "November"↔"NOV"↔"11", or "2"↔"2nd".
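As a rough sketch of what that pseudo-code formula would do, here is the same substitution written as a Python regex and applied to the four URLs above. The names `URL_PATTERN` and `transform` are made up for illustration, and this is not how transformy itself works.

```python
import re

# The formula s/.+:\/\/(.+?)\/(.*)/\1, \2/ from the comment, as a Python regex.
# Group 1 is the host (shortest run up to the first "/"); group 2 is the rest.
URL_PATTERN = re.compile(r".+://(.+?)/(.*)")

def transform(line):
    """Rewrite 'scheme://host/path' as 'host, path'; other lines pass through unchanged."""
    return URL_PATTERN.sub(r"\1, \2", line)

urls = [
    "http://example1.org/path/index.html",
    "http://www.example2.org/path/index.html",
    "http://www.example3.org/",
    "https://www.example4.org/a/b/c/d/e/f/g/hijklmnop",
]

for url in urls:
    print(transform(url))
```

Under these assumptions the output matches the expected list, including the empty path after "www.example3.org, ".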
2 comments

I think it's trying to be too magical. At this point it either seems to work, or something triggers its pattern matching wrong and it's really hard to figure out what or why. I think giving up a little simplicity in exchange for more control is worthwhile. For example, if the parts of the example that are formatting were distinguished from the parts that are data to match, it wouldn't be much more complicated, but the intent would be much clearer.

For example, if the rules were: example content must be contained within braces, and any braces within the example content need to be escaped, it's clear. At that point, your example becomes:

  {example1.org}, {path/index.html}
It would still probably just return "wwwexample.4, g/hijklmnop" for the last example, though, because it's ambiguous whether you want just the end of the URL or the whole thing. Allowing regex markup for more explicit matching would make it clearer still, but your example still causes problems until you go all the way to positive lookbehind assertions. At that point, if I need to learn all that, I might as well just use perl:

  # perl -pe 's{.*https?://([^/]+)(/\S*).*}{$1, $2}' /tmp/foo
  example1.org, /path/index.html
  www.example2.org, /path/index.html
  www.example3.org, /
  www.example4.org, /a/b/c/d/e/f/g/hijklmnop
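To make the brace rule above concrete, here is a minimal sketch of how a brace-marked example could be split into data fields and literal formatting. The backslash escape and the `split_template` helper are assumptions for illustration only; nothing here reflects how transformy is implemented.

```python
import re

# A field is {...}; inside it, a backslash escapes the next character
# (the escape syntax is an assumption, the comment only says braces "need to be escaped").
FIELD = re.compile(r"\{((?:[^{}\\]|\\.)*)\}")

def split_template(template):
    """Return a list of ('field', text) and ('literal', text) parts, in order."""
    parts = []
    pos = 0
    for match in FIELD.finditer(template):
        if match.start() > pos:
            # Text between fields is formatting, not data to match.
            parts.append(("literal", template[pos:match.start()]))
        # Undo the backslash escapes inside the example content.
        parts.append(("field", re.sub(r"\\(.)", r"\1", match.group(1))))
        pos = match.end()
    if pos < len(template):
        parts.append(("literal", template[pos:]))
    return parts

print(split_template("{example1.org}, {path/index.html}"))
```

This only separates formatting from data; it does nothing about the ambiguity over which part of the URL a field should capture.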
TXR language:

   @(repeat)
   @proto://@domain/@path
   @(do (put-line `@domain, @path`))
   @(end)