It requires access to hardware counters you don't normally have in EC2, and at a privilege level I wouldn't want to enable in a multi-user compute system.
Ho, ho. There are assorted free performance tools that do, though -- at least on POWER and ARM64, for various HPC-focussed ones. I don't know much about VTune, but it's not clear to me what it does that I can't do with other tools on x86_64, and those other tools let me measure serial and communication metrics together.