by uaas 231 days ago
I am curious, what’s the point of re-running these interactions on a UI?

Reproduction, I suppose. I would like the same thing as OP, too.

LLM outputs are qualitative; they can't really be scored automatically, and prompt enhancements tend to multiply bugs. A change can solve one problem but introduce a new one. It's practical just to re-run the interactions and check them manually.