Hacker News new | ask | show | jobs
by somacert 2668 days ago
Scary, that is almost the exact code I came up with.

The funny thing is balanced parens can be recognized with with a state machine it is just ugly.

        state_data = {
                '0p':{
                        '(':'1p',
                        ')':'fail',
                        'END':'pass',
                        'OTHER':'0p',
                        },
                '1p':{
                        '(':'2p',
                        ')':'1p',
                        'END': 'fail',
                        'OTHER':'1p',
                        },
                '2p':{
                        '(':'3p',
                        ')':'2p',
                        'END': 'fail',
                        'OTHER':'2p',
                        }

                 ...
            }
Just make sure the state machine has as many states as you need paren levels.

I admit the moment that the math vernacular appears my brain seems to shut down, however after reading the proof that balanced parens cannot be described several times. It seems to depend on a couple things.

1 The string is allowed to be infinite the state machine not.

2 "But state(B, start, p) is the same state as state(B, start, q)!" why? the strings are defined to be different I see no reason they would be in the same state.

EDIT added analysis of article proof.

1 comments

The proof rest on feeding a nested parenthesis string to a hypothetical DFA called “B.” Since these are nested parentheses, a string that is recognized will be of the form `pp’`, where `p` is a string of open parentheses, and `p’` is a string of closed parentheses the same length as `p`.

e.g. (((((((((())))))))))

In that example, `p` is length ten. After reading `p`, the recognizer must be in one of its states. It can’t have halted, and it can’t have recognized.

Let’s say that the recognizer has `n` states. Consider the strings (), (()), ((())), ...

After (, the machine is in some state. Let’s give that a number, 0.

If we start all over and give it ((, it must be in another state, state 1, or we must accept that two different strings end up in the same state. Then we can start all over and give it (((, and we must end up in state 2. And so we can go, adding a parenthesis each time, and the machine must end up in a different state after reading a different number of opening parentheses.

Eventually we reach state n-1, after starting from scratch and giving it n opening parentheses. We must now know that all the strings from 0 to n-1 in length lead to n different states in an n-state DFA.

So what happens when we give it n+1 opening parentheses?

It must end in one of its states, but we already know that the strings with one through n opening parentheses have ended up in each of its n states. So the string with n+1 opening parentheses must end up in the same states as one of the shorter strings with opening parentheses, we just don’t know which.

Back to the proof. The shorter of the two strings is p from the proof. The longer is q.