Right but... we've been cropping images in web applications since... y'know, pretty much ever. Using ML to do this was always pretty ridiculous overkill. Give the users an image cropper, and be done with it.
I can't see why this is overkill. You're eliminating a step from the image posting process, and making it so users don't have to crop an image twice (once for the full image, and a second time for the preview). Manual cropping makes sense when you're writing a CMS or blogging platform like WordPress, but for Twitter it adds some friction.
So, previously, the preview was just cropped in the center. But this made some images look funny, since people's faces would get chopped off.
Coming up with a workable solution to this with ML is not especially hard. You can get things like face detection off the shelf, maybe just tell your autocropper, "crop closer to the face" and have a demo within a couple days (and then much more effort to productionize it). From there, you can start introducing ML models to improve on your basic face detection. (I'm not counting face detection as ML.)
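To make the "crop closer to the face" idea concrete, here's a rough sketch of what I mean. It's not Twitter's actual implementation, just an illustration. The `crop_window` function and the `(left, top, width, height)` box format are my own assumptions, and the face box is assumed to come from whatever off-the-shelf detector you plug in. With no face, it falls back to the old center crop.

```python
from typing import Optional, Tuple

# Hypothetical box format for this sketch: (left, top, width, height) in pixels.
Box = Tuple[int, int, int, int]


def crop_window(image_w: int, image_h: int,
                crop_w: int, crop_h: int,
                face: Optional[Box] = None) -> Box:
    """Pick a preview crop, centered on the face if one was detected."""
    # The crop can never be larger than the image itself.
    crop_w = min(crop_w, image_w)
    crop_h = min(crop_h, image_h)

    if face is None:
        # No face found: plain center crop.
        cx, cy = image_w / 2, image_h / 2
    else:
        # Aim the crop at the center of the detected face box.
        f_left, f_top, f_w, f_h = face
        cx, cy = f_left + f_w / 2, f_top + f_h / 2

    left = int(round(cx - crop_w / 2))
    top = int(round(cy - crop_h / 2))

    # Clamp so the window stays inside the image bounds.
    left = max(0, min(left, image_w - crop_w))
    top = max(0, min(top, image_h - crop_h))
    return (left, top, crop_w, crop_h)


# A tall portrait with the face near the top: the center crop would miss it.
print(crop_window(1000, 2000, 1000, 500))                             # center crop
print(crop_window(1000, 2000, 1000, 500, face=(400, 300, 200, 200)))  # face-aware
```

That's the couple-of-days demo version. Multiple faces, no faces but an obvious subject, and so on are where the later modeling work would come in.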
This is not a case where some massive ML model is being brought in to save two seconds of your time. This is a very natural and obvious application of ML, at a company which already does ML at scale, in a way that sounds like it has a good chance at improving the appearance of the site without introducing additional friction.
Instagram gets around this by encouraging everyone to take square photos.
I don’t think anyone is saying “I will always prefer to crop every photo and everyone else should too”. I think the point is closer to, if I may borrow a Simpsons line, “I liked your half-assed underparenting a lot more than your half-assed overparenting”. It’s actually impressive that Twitter didn’t object “but I was using my whole ass”, which is basically their default trope when they address user complaints.