It's not. What I have in mind is using the TLS handshake to key an ESP SA pair and set its policy. Why? Because ESP is much, much simpler to implement in silicon than TCP+TLS.
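To make the keying idea concrete, here is a rough sketch of how one secret exported from a finished TLS handshake could be split into keys for the two directions of an ESP SA pair. Everything specific in it is my assumption, not an existing protocol: the `exported_secret` input (imagine it came from a TLS exporter), the label strings, and the `derive_sa_pair` helper are all hypothetical. The 32-byte key plus 4-byte salt layout is what AES-GCM ESP expects as keying material; the derivation itself is plain HKDF-Expand with HMAC-SHA256.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) using HMAC-SHA256."""
    out = b""
    block = b""
    counter = 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def derive_sa_pair(exported_secret: bytes, is_client: bool) -> dict:
    """Derive keying material for both directions of an ESP SA pair.

    Hypothetical scheme: the labels below are made up for illustration.
    Each direction gets 36 bytes: a 32-byte AES key and a 4-byte GCM salt.
    """
    def one_direction(label: bytes) -> dict:
        material = hkdf_expand(exported_secret, label, 36)
        return {"key": material[:32], "salt": material[32:]}

    client_to_server = one_direction(b"example esp sa client-to-server")
    server_to_client = one_direction(b"example esp sa server-to-client")
    if is_client:
        return {"outbound": client_to_server, "inbound": server_to_client}
    return {"outbound": server_to_client, "inbound": client_to_server}
```

Both peers run the same derivation on the same secret, so the client's outbound SA matches the server's inbound SA and vice versa. SPI selection and the policy half of the idea are left out here.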
ESP is stateless per packet if using IPv6 (no fragmentation by routers), or even if using IPv4 (hand fragmented packets to the host; PMTUD should make fragmentation unnecessary the vast majority of the time). Statelessness makes HW offload easy to implement.
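A minimal sketch of the per-packet decision described above, for the IPv4 case. It only looks at the fixed IPv4 header and the first 8 bytes of ESP (SPI and sequence number), which is the point: no reassembly buffer, no stream state. The function names and the "offload" / "host" / "pass" verdict strings are made up for illustration, and a real device would also do an SA lookup and anti-replay check that I'm leaving out.

```python
import struct

ESP_PROTO = 50  # IP protocol number assigned to ESP


def classify_ipv4(packet: bytes) -> str:
    """Decide what to do with a raw IPv4 packet using header fields only.

    Returns "offload" for an unfragmented ESP packet, "host" for anything
    the fast path should not touch (fragments, malformed packets), and
    "pass" for non-ESP traffic.
    """
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return "host"
    ihl = (packet[0] & 0x0F) * 4  # header length in bytes
    if ihl < 20:
        return "host"
    if packet[9] != ESP_PROTO:
        return "pass"
    flags_frag = struct.unpack_from("!H", packet, 6)[0]
    more_fragments = bool(flags_frag & 0x2000)
    frag_offset = flags_frag & 0x1FFF
    if more_fragments or frag_offset:
        return "host"  # fragmented: let the host reassemble
    if len(packet) < ihl + 8:
        return "host"  # too short to hold SPI + sequence number
    return "offload"


def esp_spi_seq(packet: bytes) -> tuple:
    """Read the SPI and sequence number from an unfragmented ESP packet."""
    ihl = (packet[0] & 0x0F) * 4
    return struct.unpack_from("!II", packet, ihl)
```

Every branch depends only on bytes in the current packet, so the same checks map to a handful of comparators in hardware.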