by fpdavis
1013 days ago
That is a very high volume, especially for being in development. Here are a few things we have done:
* Use local machine learning models wherever possible.
* Summarize and consolidate calls whenever possible (i.e. reduce token sizes using language analytics).
* Log all calls/responses so it is possible to reuse them and/or to train your ML models. This can cut down on duplicate calls.
* Monitor your API call logs to make sure the system isn't making calls it shouldn't.
* Throttle your calls by introducing delays/bottlenecks in the user interface (by far my least favorite).
* Charge more for your services to decrease demand.
* Contact your account rep and see what options they have to offer with a higher price tier.
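
To illustrate the logging and reuse point above, here is a minimal sketch in Python, not what we actually run. It assumes your API call can be wrapped as a plain function from prompt to response; `CachedClient`, `call_api`, and `log_path` are hypothetical names made up for this example. Prompts are normalized (lowercased, whitespace collapsed) before hashing, which is also an assumption and only makes sense if your prompts are not case-sensitive.

```python
import hashlib
import json


class CachedClient:
    """Logs every call and reuses the stored response for repeated prompts."""

    def __init__(self, call_api, log_path=None):
        # call_api is a hypothetical stand-in: any function prompt -> response text.
        self.call_api = call_api
        self.log_path = log_path  # optional JSON Lines file for later review/training
        self.cache = {}
        self.api_calls = 0  # number of calls that actually reached the API

    @staticmethod
    def _key(prompt):
        # Assumption: prompts differing only in case/whitespace are duplicates.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt):
        key = self._key(prompt)
        cache_hit = key in self.cache
        if not cache_hit:
            self.cache[key] = self.call_api(prompt)
            self.api_calls += 1
        response = self.cache[key]
        self._log(prompt, response, cache_hit)
        return response

    def _log(self, prompt, response, cache_hit):
        if self.log_path is None:
            return
        record = {"prompt": prompt, "response": response, "cache_hit": cache_hit}
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
```

Usage would look something like `client = CachedClient(my_api_function, log_path="calls.jsonl")` followed by `client.ask(prompt)`. The same log file doubles as the thing you monitor for calls the system shouldn't be making. An in-memory dict is only good for a single process; for anything real you would likely swap in a shared store.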