by pamelafox 304 days ago
I ran bulk evaluations on a RAG scenario and wrote up the results. I found some interesting differences: gpt-5 loves lists, smart quotes, and admitting it doesn't know.