It's not too hard to build a decent detector. At one of my previous gigs (a niche social network) we trained a fairly OK NN to detect all sorts of undesired behaviours and flag content/accounts for moderation. We didn't even need that much data or training time for it to get OK.
In the case of X, scale is the issue. Running the detector on every message, or even on every posting account once a month, might be very expensive. This might be the main reasoning behind the deliberation: make bots a little bit more expensive and use that to finance running the detector.
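To make the scale argument concrete, here's a back-of-envelope sketch. The function names and every number you'd plug in are hypothetical; I don't have real figures for X's volume or inference costs, so treat it as a way to reason about the trade-off rather than an estimate.

```python
def monthly_detector_cost(items_per_month: float, cost_per_1k_inferences: float) -> float:
    """Cost of scoring every item once, given a (hypothetical) price per 1,000 inferences."""
    return items_per_month / 1000 * cost_per_1k_inferences


def break_even_fee(monthly_cost: float, paying_accounts: int) -> float:
    """Per-account monthly fee that would exactly cover the detector's running cost."""
    if paying_accounts <= 0:
        raise ValueError("paying_accounts must be positive")
    return monthly_cost / paying_accounts


if __name__ == "__main__":
    # Placeholder inputs, not real data: 1M scored items, $0.50 per 1k inferences,
    # spread over 1,000 paying accounts.
    cost = monthly_detector_cost(1_000_000, 0.5)
    print(cost, break_even_fee(cost, 1000))
```

The point is only that cost grows linearly with the number of items scored, so scoring per account instead of per message, or charging a small per-account fee, changes the picture a lot.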
Likely not. Here's a paper that provides a decent argument that you just can't be sure that two Twitter accounts represent two different real people.
«We posit the Ghost Trilemma, that there are three key properties of identity -- sentience, location, and uniqueness -- that cannot be simultaneously verified in a fully-decentralized setting. […] We sketch a proof of this trilemma and outline options […]» https://cs.paperswithcode.com/paper/sok-the-ghost-trilemma