The generalizing power of deep convolutional neural nets is much greater, but (1) that doesn't really matter if you don't have anything similar in your training dataset, and (2) CNNs are used for face _recognition_, and for that you still have to detect the face first. And in detection, VJ is still king.
Granted, my domain knowledge is more aligned with the recognition side, but I'm pretty sure Viola-Jones has been replaced as 'king' by more robust approaches to detection. Here is a paper published just last year at CVPR: