Indeed, CA can be represented by simple combinations of Boolean functions, and therefore also by a NN, which is itself a combination of similar nonlinear functions.
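To make that concrete, here is a minimal sketch, assuming "CA" means an elementary (one-dimensional, two-state, three-neighbour) cellular automaton. Rule 110 is just an example I picked, and the names `rule_table`, `boolean_rule110` and `nn_rule` are made up for illustration. It writes the update rule once as a Boolean formula and once as a tiny two-layer network with threshold activations, hand-wired rather than trained:

```python
from itertools import product

RULE = 110  # assumed example rule; any value in 0..255 works


def rule_table(rule):
    """Map each (left, center, right) neighbourhood to the next cell state."""
    return {(l, c, r): (rule >> (l * 4 + c * 2 + r)) & 1
            for l, c, r in product((0, 1), repeat=3)}


def boolean_rule110(l, c, r):
    """Rule 110 as a Boolean formula: (c XOR r) OR (NOT l AND c)."""
    return (c ^ r) | ((1 - l) & c)


def step(z):
    """Heaviside threshold activation."""
    return 1 if z > 0 else 0


def nn_rule(l, c, r, rule=RULE):
    """Two-layer threshold network with one hidden unit per neighbourhood."""
    x = (l, c, r)
    out = 0.0
    for pattern, bit in rule_table(rule).items():
        weights = [2 * p - 1 for p in pattern]  # +1 where pattern is 1, -1 where 0
        bias = 0.5 - sum(pattern)               # unit fires only on an exact match
        hidden = step(sum(w * xi for w, xi in zip(weights, x)) + bias)
        out += bit * hidden                     # output weight is the rule bit
    return step(out - 0.5)


if __name__ == "__main__":
    for l, c, r in product((0, 1), repeat=3):
        print((l, c, r), boolean_rule110(l, c, r), nn_rule(l, c, r))
```

The hidden layer is simply a lookup over the eight possible neighbourhoods, so the same wiring covers every elementary rule, not only 110. This only shows that such a network exists; it says nothing about how easily one could be learned by gradient descent.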
What do you mean by 'the best'? Deeper architectures are popular because they are quite easy to train. They work well in practice on many tasks (especially vision), but they have their limits.
Infinitely wide networks are a newly active field that has recently shown some promising results, both theoretically [1, 2] and empirically [3].