Hacker News
IRL 25: Evaluating Language Models on Life's Curveballs (alignedhq.ai)
4 points by pmmucsd 688 days ago