Hacker News
by vladholubiev 1163 days ago
Does anybody know of a similar playground, but for evaluating a single model by comparing different prompts at different temperatures?

I am building this myself using Streamlit, but was wondering whether there is a ready-made solution.