|
|
|
|
|
by gwd
106 days ago
|
|
OK, so a while back I set up a workflow to do language tagging. There were 6-8 stages in the pipeline where it would go out to an LLM and come back. Each one has its own prompt that has to be tweaked to get it to give decent results. I was only doing it for a smallish batch (150 short conversations) and only for private use; but I definitely wouldn't switch models without doing another informal round of quality assessment and prompt tweaking. If this were something I was using in production there would be a whole different level of testing and quality required before switching to a different model. |
|