Multiple encoding vendors won't give you much. They're just different people writing different shell scripts that run ffmpeg (and not donating anything to the original project).
I guess it lets you avoid a little price lock-in, but you could also just write your own shell script.
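To make "your own script" concrete, here is a minimal sketch, in Python rather than shell so the command construction can be checked without running anything. The function names, the 720p default, and the CRF/bitrate values are my own assumptions for illustration, not what any vendor actually runs; the ffmpeg flags themselves are standard ones.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, height=720, crf=23):
    """Build an ffmpeg argument list for an H.264/AAC transcode.

    The defaults (720p, CRF 23, 128k audio) are illustrative assumptions.
    """
    return [
        "ffmpeg", "-y",                    # overwrite the output if it exists
        "-i", src,                         # input file
        "-vf", f"scale=-2:{height}",       # resize, keep aspect ratio, even width
        "-c:v", "libx264",                 # H.264 video
        "-crf", str(crf),                  # constant-quality setting
        "-preset", "medium",               # speed/size trade-off
        "-c:a", "aac", "-b:a", "128k",     # AAC audio at 128 kbit/s
        dst,
    ]


def transcode(src, dst, **opts):
    """Run ffmpeg and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_ffmpeg_cmd(src, dst, **opts), check=True)
```

Something like `transcode("in.mov", "out.mp4", height=480)` would be the whole "encoding service", assuming ffmpeg is on the PATH. Queueing, retries and storage are extra, but those exist with a vendor too.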
Do you need 20 microservices for that? If you choose to swap a video vendor, I guess you need to write wrapper API calls for the new vendor?
Even if the new vendor provides that out of the box, NYT amortizes those costs one way or the other...
It may be more or less, depending on how you count them, but what I'm really trying to figure out is whether the over-engineering and the extra abstraction layers are justified, in this case and in general.
My experience so far is the opposite: microservices, orchestration, etc. hide the complexity but don't remove it. So if the shit hits the fan, you still have to look under the hood, and with microservices, probably under 10 different hoods, or a hood under the hood of a hood :)