It's a great first step to hiring. But I'm surprised that he didn't have any university graduates pass--the problem itself is a freshman-level bozo test used to apply the concept of a stack. The rest (log in to ssh and install a CGI script) is a set of useful skills that any competent programmer could (hopefully) teach themselves within the timespan of the problem.
If you only have 3 out of 30 candidates pass the bozo test, that limits your choices when it comes to testing them on having any real talent.
As you and the original author both know, you don't need to use a stack data structure if you're just matching a single type of parens. (Your algorithm is the server-side solution employed by the author.) Though if you used {([])} with rules against them side-straddling like ([)], the stack would be necessary to distinguish what type of matching character you're using.
But look at what you have: an integer (n) that counts the number of unmatched open parens, that is not allowed to drop below 0, and that can only be incremented and decremented. Your solution there is essentially a pushdown automaton, with n representing your stack of open parentheses (and the value of n representing the size of the stack). If you approached the problem thinking of it as a stack problem, you'd probably end up optimizing to your solution when you realized that actually stacking parentheses could be efficiently simulated with an integer. But the restrictions you've placed on that integer effectively simulate the restrictions of pushing and popping parens off a stack. You're even going through the string once without backtracking.
I'm not sure if it's useful or just comp-sci wankery to think of it this way, but there's an equivalence between that solution and the freshman stack solution after all.
Which is interesting, as a proper regular expression is incapable of testing for a parens match. The set of strings with properly matched parens is a context-free language, not a regular language. But modern "regex" engines are capable of matching non-regular languages, and iterative regex-based substitutions (as his client-side code does--his server-side code is equivalent to the stack-based approach) can approximate the effect.
If you only have 3 out of 30 candidates pass the bozo test, that limits your choices when it comes to testing them on having any real talent.