No dude, the latest versions of the models it routes to are markedly poorer in performance than their predecessors.
I’m observing a law that states: There appears to be a direct relationship between model performance and cost, such that whenever a company claims to have reduced inference costs, customers immediately notice a corresponding decline in model performance.
I’m observing a law that states: There appears to be a direct relationship between model performance and cost, such that whenever a company claims to have reduced inference costs, customers immediately notice a corresponding decline in model performance.