I recommend starting with ECMA-35, which explains the actual structure of escape sequences. Otherwise it gets lost that ESC [ on its own is the entire escape sequence: it is the alternative 7-bit form of CSI, and it is just one instance of a whole mechanism of escape sequences built from intermediate and final bytes. What CSI then introduces is a control sequence, which is a separate thing from the escape sequence.
The important part is that the escape prefix is just an alternate way to represent each of the 32 C1 control characters with a pair of 7-bit characters.
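To make that mapping concrete, here is a minimal sketch in Python. The function names are my own, made up for illustration. It assumes the usual correspondence in which a C1 control at 0x80 to 0x9F is written as ESC followed by the byte 0x40 lower, so that the second byte falls in 0x40 to 0x5F.

```python
ESC = 0x1B  # the ESCAPE character

def c1_to_escape_pair(c1: int) -> bytes:
    """Return the 7-bit two-byte form (ESC + final byte) of a C1 control."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError(f"not a C1 control character: {c1:#04x}")
    # Subtracting 0x40 moves 0x80..0x9F down to 0x40..0x5F.
    return bytes([ESC, c1 - 0x40])

def escape_pair_to_c1(pair: bytes) -> int:
    """Return the 8-bit C1 control represented by a two-byte ESC pair."""
    if len(pair) != 2 or pair[0] != ESC or not 0x40 <= pair[1] <= 0x5F:
        raise ValueError(f"not a 7-bit C1 representation: {pair!r}")
    return pair[1] + 0x40

# All 32 pairs, keyed by their 8-bit C1 code.
ALL_PAIRS = {c1: c1_to_escape_pair(c1) for c1 in range(0x80, 0xA0)}
```

With this, c1_to_escape_pair(0x9B) gives the two bytes ESC [ for CSI, and escape_pair_to_c1 goes the other way. Nothing here parses the control sequence that follows CSI; it only covers the two-byte escape sequence itself.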