by robrenaud 888 days ago
The TinyStories[1] paper has an interesting solution for how to evaluate stories. They ask GPT-4 to grade them on grammar, consistency, and creativity.

This seems like it would be extremely hard to do automatically, though.
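
For a rough idea of what the GPT-4 grading step could look like, here is a minimal sketch. It is not the paper's actual prompt or code: the 1 to 10 scale, the "name: score" reply format, and the names build_grading_prompt, parse_grades, grade_story, and ask_model are all assumptions made for illustration. ask_model stands in for whatever client sends a prompt to the grader model and returns its reply as text.

```python
import re
from typing import Callable, Dict

# The three criteria named in the comment above.
CRITERIA = ("grammar", "consistency", "creativity")


def build_grading_prompt(story: str) -> str:
    """Build a prompt asking for one integer score per criterion.

    The 1-10 scale and the reply format are assumptions, not the paper's.
    """
    lines = [
        "Grade the following story on each criterion from 1 to 10.",
        "Reply with one line per criterion, formatted as 'name: score'.",
        "Criteria: " + ", ".join(CRITERIA),
        "",
        "Story:",
        story,
    ]
    return "\n".join(lines)


def parse_grades(reply: str) -> Dict[str, int]:
    """Extract 'name: score' pairs; raise if any criterion is missing."""
    grades = {}
    for name in CRITERIA:
        match = re.search(rf"{name}\s*:\s*(\d+)", reply, flags=re.IGNORECASE)
        if match is None:
            raise ValueError(f"missing score for {name!r}")
        grades[name] = int(match.group(1))
    return grades


def grade_story(story: str, ask_model: Callable[[str], str]) -> Dict[str, int]:
    """Send the prompt through ask_model (hypothetical) and parse the reply."""
    return parse_grades(ask_model(build_grading_prompt(story)))
```

Passing the model call in as a plain function keeps the sketch independent of any particular API client, and it means the parsing can be checked with a canned reply before spending money on real calls.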

[1] https://arxiv.org/pdf/2305.07759.pdf