Hacker News
by syndacks 132 days ago
How do people evaluate creative writing and emotional intelligence in LLMs? Most benchmarks seem to focus on reasoning or correctness, which feels orthogonal to those qualities. I’ve been playing with Kimi K2.5 and it feels much stronger on voice and emotional grounding, but I don’t know how to measure that beyond human judgment.
2 comments

I am trying! https://mafia-arena.com

I just don't have enough funding to run a ton of tests.