by simonw
462 days ago
> If you have the resources to fine tune, you have the resources to run inference on fine tuned model.

I don't think that's true. I can fine tune a model by renting a few A100s for a few hours, total cost in the double digit dollars. It's a one-time cost. Running inference with the resulting model for a production application could cost single digit dollars per hour, which adds up to hundreds or even thousands of dollars a month on an ongoing basis.
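To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The GPU count, hours, and hourly rates are made-up placeholders for illustration, not quoted prices, and the function names are hypothetical.

```python
def one_time_finetune_cost(gpus: int, hours: float, usd_per_gpu_hour: float) -> float:
    """One-off cost of renting GPUs for a single fine-tuning run."""
    return gpus * hours * usd_per_gpu_hour


def monthly_inference_cost(usd_per_hour: float, hours_per_day: float = 24, days: int = 30) -> float:
    """Recurring cost of keeping an inference server up all month."""
    return usd_per_hour * hours_per_day * days


# Assumed figures: 4 rented GPUs for 3 hours at $2 per GPU-hour,
# then an always-on inference box at $2 per hour.
finetune = one_time_finetune_cost(gpus=4, hours=3, usd_per_gpu_hour=2.0)
serving = monthly_inference_cost(usd_per_hour=2.0)

print(f"fine-tune once: ${finetune:.2f}")
print(f"serve per month: ${serving:.2f}")
```

Under these assumed rates the fine-tune is a one-time charge in the tens of dollars, while always-on serving recurs every month at well over a thousand.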
|
That may or may not be true for use cases that require asynchronous, bulk inference _and_ some task-specific post-training.
FWIW, my approach to tasks like the above is to:
1. start with an off-the-shelf LM API until
2. you figure out (using evals that capture product intent) what the failure modes are (there always are some), and then
3. post-train against those (using the evals)
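
A minimal sketch of what step 2 could look like, assuming exact-match scoring and hand-assigned failure-mode tags. `EvalCase`, `run_evals`, and `stub_model` are hypothetical names, and the stub stands in for a real off-the-shelf LM API call.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected: str
    tag: str  # hypothetical failure-mode label, e.g. "dates"


def run_evals(model: Callable[[str], str], cases: list[EvalCase]):
    """Return the failing cases and a count of failures per tag."""
    # Exact match is an assumption; real evals would score product intent.
    failures = [c for c in cases if model(c.prompt).strip() != c.expected]
    return failures, Counter(c.tag for c in failures)


def stub_model(prompt: str) -> str:
    # Placeholder for an off-the-shelf LM API call.
    return prompt.upper()


cases = [
    EvalCase("abc", "ABC", "casing"),
    EvalCase("1 jan", "2024-01-01", "dates"),
    EvalCase("2 feb", "2024-02-02", "dates"),
]

failures, by_tag = run_evals(stub_model, cases)
print(by_tag.most_common())  # failure modes, most frequent first
```

The failing cases grouped by tag are what you would then post-train against in step 3, rerunning the same evals afterwards to check that the failure counts drop.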