The issue with attention is essentially that it relates every token of the input sequence to every other token. The need to do that makes intuitive sense no matter how much one understands about the internals of a transformer. The naive way to do it boils down to matrix multiplications whose cost grows quadratically with sequence length, and a lot more people understand the performance issues those imply.
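To make the cost concrete, here is a rough sketch of the naive approach in plain Python, assuming single-head scaled dot-product attention with no masking or batching. The names `naive_attention` and `score_count` are made up for illustration and do not come from any library.

```python
import math

def naive_attention(queries, keys, values):
    """Illustrative single-head scaled dot-product attention.

    Each argument is a list of n float vectors (one per token).
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        # One score per (query, key) pair, so n * n dot products overall.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Softmax over this row; subtracting the max keeps exp() stable.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is a weighted average of all value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

def score_count(n):
    """Number of pairwise scores the naive approach computes for n tokens."""
    return n * n
```

Doubling the sequence length quadruples the number of scores: `score_count(1000)` is 1,000,000 and `score_count(2000)` is 4,000,000.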
If you can understand attention and transformers, how can you not understand that population numbers can rise, reach a peak, fall, and then level out (all without any genocidal actions)?
How can you claim that it is "absurdism" to imagine something that can be seen in data across the plant and animal kingdoms?