by lmeyerov
1159 days ago
Prompt testing, especially for Q/A pairs where there are multiple right answers, has been bugging me a lot. The article is reasonable, but it also shows a big gap in tooling: the techniques there feel closer to linting & typing than testing once you do more interesting prompts. They don't check the interesting parts.

Can you elaborate a bit more on what those interesting parts are?
It could just be a limitation of computation.