Hacker News
by ninkendo 420 days ago
Writing one correctly is a pretty complicated task if you’re trying to write a simple tutorial… off the top of my head, you’d need:

    (
      (
      25[0-5] # 250-255
      |
      2[0-4][0-9] # 200-249
      |
      1[0-9]{2} # 100-199
      |
      [1-9][0-9] # 10-99
      |
      [0-9] # 0-9
      )
      \.
    ){3}
    (
      25[0-5] # 250-255
      |
      2[0-4][0-9] # 200-249
      |
      1[0-9]{2} # 100-199
      |
      [1-9][0-9] # 10-99
      |
      [0-9] # 0-9
    )
    
… but without all the nice white space and comments, unless you’re willing to discuss regex engines that let you do multi-line/commented literals like that… I think ruby does, not sure what other languages.

The problem is that “an integer from 0-255” is surprisingly complicated to express in a regex. And that’s not even accounting for IP addresses that don’t use dots (which are legal as an argument to most software that connects to an IP address), as other commenters have pointed out.
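
To make the dotless form concrete: an IPv4 address is a 32-bit integer, with each dotted octet being one byte of it. Here is a rough Python sketch (Python is my pick for illustration, not something from the comment above) showing that 127.0.0.1 and 2130706433 are the same address. Whether a particular tool accepts the integer spelling depends on its own parser.

```python
import ipaddress

# 127.0.0.1 packed into a single 32-bit integer: each octet is one byte.
packed = (127 << 24) | (0 << 16) | (0 << 8) | 1

# The standard library accepts the integer form directly.
addr = ipaddress.IPv4Address(packed)

print(packed)  # 2130706433
print(addr)    # 127.0.0.1
```

A dotted-quad regex would never see an address written this way, which is one more reason not to treat the regex as the final word on validity.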

2 comments

Regex can be good, but you need to be willing to bail out when it’s not appropriate.

For something like locating IP addresses in text, using a regex to identify candidates is a great idea. But as you show, you don’t want to implement the full validation in it. Use regex to find dotted digit groups, but validate the actual numeric values as a separate step afterwards.
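
A minimal sketch of that two-step approach in Python, assuming plain dotted-quad IPv4 only. The names `CANDIDATE` and `find_ipv4` are made up for this example, and the loose pattern is just one reasonable choice.

```python
import re

# Loose pattern: four groups of 1-3 digits separated by dots.
# It deliberately does NOT check the 0-255 range.
CANDIDATE = re.compile(r"\b[0-9]{1,3}(?:\.[0-9]{1,3}){3}\b")

def find_ipv4(text):
    """Return dotted-quad strings in text whose octets are all 0-255."""
    found = []
    for match in CANDIDATE.finditer(text):
        octets = match.group().split(".")
        # The numeric check happens in plain code instead of in the regex.
        if all(int(octet) <= 255 for octet in octets):
            found.append(match.group())
    return found

print(find_ipv4("ok 10.0.0.1, bad 999.1.1.1, ok 192.168.1.254."))
# ['10.0.0.1', '192.168.1.254']
```

The regex stays short enough to read at a glance, and the range rule lives where it is easy to change (for example, to reject leading zeros).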

> I think ruby does, not sure what other languages.

You're right that Ruby has it. Perl also has /x, of course (since most of Ruby's regex syntax was "inspired" directly by Perl's), and so does Python (re.VERBOSE). Otherwise, yeah, it's disappointingly rare.
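
For reference, here is roughly how the commented pattern from the parent comment could look with Python's re.VERBOSE. This is a sketch: the names `OCTET` and `IPV4` are mine, and I've used non-capturing groups and `fullmatch`, which the original did not specify.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a comment.
OCTET = r"""
    (?:
        25[0-5]        # 250-255
      | 2[0-4][0-9]    # 200-249
      | 1[0-9]{2}      # 100-199
      | [1-9][0-9]     # 10-99
      | [0-9]          # 0-9
    )
"""

# Three "octet plus dot" groups followed by a final octet.
IPV4 = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}", re.VERBOSE)

print(bool(IPV4.fullmatch("192.168.1.254")))  # True
print(bool(IPV4.fullmatch("256.0.0.1")))      # False
```

Defining the octet once and interpolating it also avoids repeating the whole alternation twice.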

.NET also supports verbose regex (RegexOptions.IgnorePatternWhitespace).