by apinstein
83 days ago
I'm playing around with building something similar of my own and am facing the question you pose: how can you tell whether your prompt process works? I feel like the outputs of an SDLC process are much more high-level than what evals can assess, but I'm no eval expert. How would you benchmark this?