GPUs are generally terrible for video encoding. I think you're confusing the GPU with a dedicated "hardware encoder" ASIC (which may sit on the same die as other silicon components such as a CPU or a GPU).
An off-the-shelf GPU can encode / decode thousands of channels of video and audio in real-time.
There are some cloud providers that offer this as a service, allowing the endpoints of, e.g., a video call to use different audio and video codecs.
This is quite useful, e.g., when some people join the call with audio only, via a cell phone in a different country using a different audio standard (or a land line, etc.). Or when somebody joins the video call from a laptop tethered to a phone on a train. Or for switching video codecs depending on whether somebody is sharing their desktop or using a webcam to record their face.
Each client can pick the codecs that are the best fit for its current situation (content, bandwidth, latency, etc.), and a cloud server transcodes the video from everyone else in the meeting to that client's format.
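To make the routing concrete, here is a rough sketch of how such a server might work out which streams need transcoding. Everything in it is made up for illustration (the `Client` type, the `transcode_plan` function, the codec labels); it is not how any particular provider does it, and it only builds the list of jobs, it does not touch any media.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Client:
    # Hypothetical description of one participant's chosen codecs.
    name: str
    audio_codec: str
    video_codec: Optional[str]  # None means an audio-only participant


# (sender, receiver, kind, source codec, target codec)
Job = Tuple[str, str, str, str, str]


def transcode_plan(clients: List[Client]) -> List[Job]:
    """List every sender-to-receiver stream whose codec differs from the receiver's."""
    jobs: List[Job] = []
    for sender in clients:
        for receiver in clients:
            if sender is receiver:
                continue
            # Audio always flows, so transcode whenever the codecs differ.
            if sender.audio_codec != receiver.audio_codec:
                jobs.append((sender.name, receiver.name, "audio",
                             sender.audio_codec, receiver.audio_codec))
            # Video only flows when both sides have video at all.
            if (sender.video_codec and receiver.video_codec
                    and sender.video_codec != receiver.video_codec):
                jobs.append((sender.name, receiver.name, "video",
                             sender.video_codec, receiver.video_codec))
    return jobs


if __name__ == "__main__":
    # Illustrative codec labels only.
    meeting = [
        Client("laptop_on_train", "opus", "h264"),
        Client("cell_phone", "g711", None),
        Client("desktop_share", "opus", "av1"),
    ]
    for job in transcode_plan(meeting):
        print(job)
```

Streams whose codecs already match are simply forwarded; only the mismatched pairs become transcoding jobs, which is the part the server would have to run in real time.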
And that point is wrong, since there are some cloud providers using normal GPU cores for exactly this.
NVEnc supports only a handful of codecs; there is no real-time audio, nothing.
All of this is implemented in CUDA, using normal CUDA implementations of audio and video codecs, and running on normal GPUs using normal GPU cores. In real time. Supporting thousands of audio and video channels concurrently.
Also, even for NVEnc and NVDec themselves, in some of the GPUs they do not use any specialized hardware and use normal GPU cores instead (e.g. see the older GM20x GPUs).
None of the ones I know are. They are all proprietary.
I'm not sure they are for sale either (probably for the right price), since they sell these "as a service", which pays better. I also don't think these are on sale for "small" customers.