What does "inferencing models at scale" mean?
The models produced have to be fast and efficient to support capacity/cost, which has some detriment to quality/accuracy.