I don't think you need to go that deep. This technology is literally dehumanizing: it's replacing individual human aspects of someone's voice with a computer-generated facsimile.
By that same argument, taken naively, film and video are dehumanizing, but not deplorably so: certainly the intensity of emotion and experience in film is far lower than in, say, immersive theater, but we may be more comfortable with this modality, and we also benefit from the economies of scale.
Similarly, a call center worker may not care about having their accent heard. They want to get their numbers up without struggling with a customer who isn't familiar with their accent, and they enjoy the ease of speaking in their own accent rather than having to use one that distant customers are accustomed to. Likewise, a customer probably just wants their problem fixed, without the effort of getting accustomed to an accent that they rarely encounter. This meets your definition of dehumanizing but, analogous to the former scenario, perhaps not deplorably so.