by bpp
702 days ago
I work in AI product eng for a larger company. The honest answer is that with good RAG and few-shot prompting, we can treat genuinely incorrect output as a serious and reproducible bug. This means that when we call LLMs in production, wrong answers show up at about the same rate as any other kind of product engineering bug.