Hacker News
by CrypticShift 873 days ago
Someone should make a censorship/alignment (whatever you want to call it) benchmark for LLMs.
1 comment

https://tatsu-lab.github.io/alpaca_eval/

Such a leaderboard exists: the AlpacaEval Leaderboard ranks LLMs on their ability to follow user instructions.