| HN Mirror

The proof rest on feeding a nested parenthesis string to a hypothetical DFA called “B.” Since these are nested parentheses, a string that is recognized will be of the form `pp’`, where `p` is a string of open parentheses, and `p’` is a string of closed parentheses the same length as `p`.

e.g. (((((((((())))))))))

In that example, `p` is length ten. After reading `p`, the recognizer must be in one of its states. It can’t have halted, and it can’t have recognized.

Let’s say that the recognizer has `n` states. Consider the strings (), (()), ((())), ...

After (, the machine is in some state. Let’s give that a number, 0.

If we start all over and give it ((, it must be in another state, state 1, or we must accept that two different strings end up in the same state. Then we can start all over and give it (((, and we must end up in state 2. And so we can go, adding a parenthesis each time, and the machine must end up in a different state after reading a different number of opening parentheses.

Eventually we reach state n-1, after starting from scratch and giving it n opening parentheses. We must now know that all the strings from 0 to n-1 in length lead to n different states in an n-state DFA.

So what happens when we give it n+1 opening parentheses?

It must end in one of its states, but we already know that the strings with one through n opening parentheses have ended up in each of its n states. So the string with n+1 opening parentheses must end up in the same states as one of the shorter strings with opening parentheses, we just don’t know which.

Back to the proof. The shorter of the two strings is p from the proof. The longer is q.