Hacker News new | ask | show | jobs
by eru 3268 days ago
There's also redgrep (https://github.com/google/redgrep) that supports intersection and complements of regular expressions.

I am toying the idea of writing a little game where player A thinks of a regular expression, and player B tries to guess. If B guesses right, they win. If B guesses wrong, A has to provide a false positive and a false negative (if they exist), and B gets to guess again.

Can you think of ways to automate the roles of A and/or B?

2 comments

In computer science academia, this kind of game is called grammar induction (of which inferring regular expressions is a special case).

A classic algorithm for inferring regular expressions was given by Angluin: https://people.eecs.berkeley.edu/~dawnsong/teaching/s10/pape...

(This isn't quite the same setup as you're thinking of but there are a ton of variations on the basic idea)

Thanks. I had figured out that grammar induction was the right word to look for a while ago. (But took me a bit to find it.) I know the paper you linked to, but yes, it's not quite the right setup.
There's a conference on grammar induction called ICGI; might wanna browse through the proceedings to see if there's anything closer.
Thanks!

I'm basically interested in the equivalent of the "guessing game" for regular expressions. (See eg https://stackoverflow.com/questions/5440688/the-guess-the-nu... for a rational number solution.)

With a fixed guesser, that would encode all regular expressions / finite automata as sequences of binary digits. (But in a interestingly different way from just serializing the table for a DFA, or writing down the regular expression in ASCII characters.)

So I do AI research on something pretty related to the guessing game -- I'll shoot you an email.
I think AutomataDotNet can do all that :

Automation of regular expression generation, it seems easy : use RE fragments and aggregate them, or walk the type hierarchy of the RE AST and generate them randomly.

B needs to guess A's RE so we need to generate examples of strings belonging to it to gives hints : this is exactly the use case of AutomataDotNet.

Also if B guess a RE that is equivalent to A's RE it seems unfair to not attribute a win, so we need to tell if 2 RE belong to the same equivalence class. AutomataDotNet does have a AreEquivalent method.

You can automate the generation of false positive and a false negative with the method Minus to creates an automaton that accepts A-B or B-A and generate an example.

Thanks, I'll have a look.

You are right about the equivalence classes: for that you want to talk about the corresponding DFA (which have a unique normal form in the shape of the minimum DFA).

I am not sure about the rest of what you are saying: in general even just minimizing regular expressions is EXP-SPACE complete, if I remember right.

Yes, generation of false negative and false positive ain't so hard---theory agrees with you. But automating the guesser is, as far as I know.

I was talking about generating hints to give an opportunity to the RE guesser to find the secret RE. You seem to talk about automating the RE generation based on the pattern of the hints, yeah I didn't thought about that, the other answer seem to talk about it.