What about Keras? It was originally billed as the Torch API in Python, so it shares the same API design inspiration. I've used both, but I generally still use Keras because I don't see enough of a difference in PyTorch to switch.
You're right that the graph does need to be compiled. But you can still do everything interactively through the REPL, so I don't see the difference in practice.
Also, is there any reason to think that Theano or TensorFlow make useful optimizations during compilation that don't get made in Torch because it is more strictly dynamic? Anecdotally, PyTorch seems quite fast, but I'm wondering.
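To make the question concrete, here is a toy sketch in plain Python of the kind of thing I mean. It is not Theano, TensorFlow, or PyTorch code, and every name in it (`Const`, `Var`, `Op`, `fold`, `compile_graph`, `evaluate`) is made up for illustration. The assumption is that a compile step sees the whole graph up front and can rewrite it once (here, simple constant folding), while a strictly dynamic approach just executes each op as it is encountered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    kind: str  # "add" or "mul"
    left: "Node"
    right: "Node"


Node = Union[Const, Var, Op]

_FUNCS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Walk the graph as given and execute every op (the 'dynamic' analogue)."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    return _FUNCS[node.kind](evaluate(node.left, env), evaluate(node.right, env))


def fold(node: Node) -> Node:
    """Constant folding: collapse any subtree that contains no variables."""
    if isinstance(node, Op):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(_FUNCS[node.kind](left.value, right.value))
        return Op(node.kind, left, right)
    return node


def count_ops(node: Node) -> int:
    """Number of ops that would run on each evaluation of this graph."""
    if isinstance(node, Op):
        return 1 + count_ops(node.left) + count_ops(node.right)
    return 0


def compile_graph(node: Node) -> Tuple[Callable[[Dict[str, float]], float], Node]:
    """Optimize once, then return a callable that reuses the optimized graph."""
    optimized = fold(node)
    return (lambda env: evaluate(optimized, env)), optimized


# x * (2 * 3) + (1 + 4)
graph = Op(
    "add",
    Op("mul", Var("x"), Op("mul", Const(2.0), Const(3.0))),
    Op("add", Const(1.0), Const(4.0)),
)
run, optimized = compile_graph(graph)

print(count_ops(graph), count_ops(optimized))
print(evaluate(graph, {"x": 2.0}), run({"x": 2.0}))
```

Both paths give the same answer, but the compiled one runs fewer ops per call. Whether the real frameworks get a meaningful speedup from this sort of rewrite is exactly what I'm unsure about.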