Apple will keep up with anything that is SOTA, just with a bit of a lag, so expect them to be better soon if they aren't already.
Word of warning from someone who built an SDK that filled a processing gap Apple had (6DOF monocular SLAM) [1]: Apple will eventually make your technology obsolete, and their version will be way better. See: ARKit.
We open-sourced it once ARKit came out because there was no way to monetize it further.
Whisper is a game changer in terms of accuracy. It makes Zoom, YouTube, Office/Azure, Descript, and Otter.ai transcription look like jokes in comparison.
The step change in transcription accuracy here is significant enough to cross an important threshold for usefulness.
[1] https://github.com/Pair3D/PairSDK