Hacker News
Randexp.js: Create random strings that match a given regular expression (github.com)
82 points by benwithem 3941 days ago
9 comments

Warp [1] implements this, and also allows you to generate a list of all possible values matching a particular regex (the number of possibilities of course grows quite quickly for many regexes!).

[1] https://itunes.apple.com/us/app/warp/id973942134?mt=12
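To put a number on "grows quite quickly", here is a rough sketch in Python. It assumes a pattern of the form `[a-z]{n}` (my own example, not something taken from Warp), where the count is simply 26 to the power n:

```python
def count_fixed_length(alphabet_size: int, length: int) -> int:
    """Number of strings of exactly `length` characters over an alphabet of the given size."""
    return alphabet_size ** length


# Hypothetical pattern family [a-z]{n}: 26 choices per position.
counts = {n: count_fixed_length(26, n) for n in (1, 2, 4, 8)}

for n, total in counts.items():
    print(f"[a-z]{{{n}}} matches {total:,} strings")
```

Anything with an unbounded quantifier such as `*` or `+` has no finite count at all, so a tool that lists every match presumably has to cap the repetition somewhere.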

Exrex does the same in Python and much more:

   - Generating all matching strings
   - Generating a random matching string
   - Counting the number of matching strings
   - Simplification of regular expressions
https://github.com/asciimoo/exrex (AGPL licensed)
That's cool, thanks. I discovered sre_parse thanks to that link, which is cool too.
A few years ago I made a Node package that does the same thing: https://github.com/arcanis/pxeger

I mainly used it to generate test mail addresses. For example, the following will generate 36 test email addresses:

    (blue|red|yellow|green).(oak|cedar|willow)@(yahoo.co.uk|google.com|example.org)
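
The 36 comes from 4 x 3 x 3 alternatives. As a sanity check, here is a minimal Python sketch that enumerates the same combinations with `itertools.product`. It does not use pxeger, and it assumes every `.` in the pattern is meant as a literal dot (unescaped, a regex `.` would match any character):

```python
from itertools import product

# The three alternation groups from the pattern above.
colors = ["blue", "red", "yellow", "green"]
trees = ["oak", "cedar", "willow"]
domains = ["yahoo.co.uk", "google.com", "example.org"]

# Assumption: dots are literal, so each address is color.tree@domain.
addresses = [f"{c}.{t}@{d}" for c, t, d in product(colors, trees, domains)]

print(len(addresses))  # 4 * 3 * 3 = 36
print(addresses[0])    # blue.oak@yahoo.co.uk
```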
Do you think it’s safe to generate ticket numbers with such tools? I'm currently using this PHP lib to do it: https://github.com/icomefromthenet/ReverseRegex and since we do not sell thousands of tickets a day I think it should be okay to use it for this purpose. What do you think?
It's practical, as long as you check that each new value is not a duplicate. The lib info does not state how distinct the values will be, or how it uses the RNG.

Depending on what you really need (output string format, amount of numbers, non-additive links), you should find a hash function that fits. You can always start by looking up how YouTube URL hashes work.

My new favourite password generator is now http://fent.github.io/randexp.js/#r=%5B%5E%5Cs%5D%7B16%7D&i
I'd be surprised if it had the amount of entropy you need. I haven't looked at the source, but JavaScript in a browser is not a good source of random numbers.
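
For comparison, a minimal sketch of the same idea (16 non-whitespace characters) drawn from a cryptographically secure source, using Python's standard `secrets` module. The alphabet here is my own assumption: printable ASCII without whitespace, which is narrower than what `[^\s]{16}` allows.

```python
import secrets
import string

# Assumed alphabet: ASCII letters, digits and punctuation (no whitespace).
# This is a subset of what the regex class [^\s] would accept.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 16) -> str:
    """Build a password by picking each character with the OS-backed CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = random_password()
print(password)
```

With 94 symbols per position, 16 characters give roughly 105 bits of entropy, provided the underlying generator is sound.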
Safari loved that.
Would be great to integrate it into a live regex editor such as http://www.regexr.com/
So what happens when you pass something like `(?!.?|(..+?)\\1+)`?

Or does it not support lookahead?

What's the practical use for this...? Input validation?
Offhand, I think it could be useful for ensuring your regex is doing what you want it to do. Perhaps I'd write a regex to validate an xxx-xxx-xxxx telephone number:

    \d{3}\-\d{3}\-\d{3}
Oops! I made a typo! That last 3 should be a 4. If I were tired, I could very easily see myself doing something like this. If I could put it into a tool and see 100 things which match it, I'd see very quickly that I messed something up.
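
A toy version of that check in Python, to show the idea. The `sample_digits_pattern` helper is hypothetical and only understands `\d{n}` groups and backslash-escaped literals; it is nowhere near a general generator like randexp.js:

```python
import random
import re


def sample_digits_pattern(pattern: str, rng: random.Random) -> str:
    # Hypothetical mini-generator: handles only digit groups with a fixed
    # repeat count and backslash-escaped literal characters.
    out = []
    for digit_count, literal in re.findall(r"\\d\{(\d+)\}|\\(.)", pattern):
        if digit_count:
            out.append("".join(rng.choice("0123456789") for _ in range(int(digit_count))))
        else:
            out.append(literal)
    return "".join(out)


typo_pattern = r"\d{3}\-\d{3}\-\d{3}"      # the regex with the typo
intended_pattern = r"\d{3}-\d{3}-\d{4}"    # what was actually meant

rng = random.Random(0)  # seeded so runs are repeatable
samples = [sample_digits_pattern(typo_pattern, rng) for _ in range(5)]

for s in samples:
    # Every sample ends in three digits, so none fits the intended format.
    print(s, bool(re.fullmatch(intended_pattern, s)))
```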
Agreed. I see this as a sort of fuzz tester for your assumptions.
According to the readme:

"Regular expressions are used in every language, every programmer is familiar with them. Regex can be used to easily express complex strings. What better way to generate a random string than with a tool you can easily use to express the string any way you want?"

An example where this is useful (I guess) is generating data for tests. You can easily define an email regexp to generate random "valid" values for each model.

For testing you'd want to generate strings non-uniformly, i.e. you want to hit edge cases rather than test 1 billion similar 20-character e-mails.
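
One way to get that bias, as a rough sketch: pick the length first and overweight the boundaries. The names, the 50% edge weight, and the 1 to 64 range (the usual limit for the local part of an address) are all my own assumptions for illustration, not anything randexp.js does.

```python
import random
import string

LOCAL_CHARS = string.ascii_lowercase + string.digits


def pick_length(min_len: int, max_len: int, rng: random.Random,
                edge_weight: float = 0.5) -> int:
    # With probability edge_weight return one of the two boundary lengths,
    # otherwise a uniformly chosen length in [min_len, max_len].
    if rng.random() < edge_weight:
        return rng.choice([min_len, max_len])
    return rng.randint(min_len, max_len)


def sample_local_part(rng: random.Random, min_len: int = 1, max_len: int = 64) -> str:
    # Assumed range 1..64 for the part before the "@".
    length = pick_length(min_len, max_len, rng)
    return "".join(rng.choice(LOCAL_CHARS) for _ in range(length))


rng = random.Random(42)
lengths = [len(sample_local_part(rng)) for _ in range(1000)]
edge_hits = sum(1 for n in lengths if n in (1, 64))
print(f"{edge_hits} of {len(lengths)} samples landed on a boundary length")
```

A uniform pick over 1 to 64 would land on a boundary about 3% of the time, so the weighting makes the shortest and longest cases show up far more often.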
That's true. What I meant was that this tool is good to support testing the same way factories work: generating random valid emails, phones, addresses, etc., for each record.
But don't you generate the test input from what your test ends up accepting? I don't see the point.
That's true if your validation exists in a vacuum. But if you have a picky black box backend and the input validation you quickly made up allows strings that the backend chokes on, this is the way to validate the validation regex.
Testing.
If anyone misses a demo: http://jsfiddle.net/n3dgjLo5/
This is great for regex crossword generators!