Hacker News
by thomastay 907 days ago
I'd recommend anyone who's interested in testing chatbots check out https://chat.lmsys.org/

It lets you send the same prompt to two randomly selected chatbots and compare their answers. Best of all, your votes are used to rank LLMs on a public leaderboard, which helps AI researchers.
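
For anyone curious how head-to-head votes can turn into a ranking, here's a minimal sketch assuming an Elo-style update. I'm not claiming this is exactly what the leaderboard does; the K-factor, the starting rating, and the model names below are all made up for illustration.

```python
from collections import defaultdict

K = 32          # assumed update step size (hypothetical choice)
START = 1000.0  # assumed starting rating for every model (hypothetical)


def expected_score(r_a, r_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings, winner, loser, k=K):
    """Apply one vote in which `winner` was preferred over `loser`."""
    e_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - e_win)  # bigger upset means a bigger rating shift
    ratings[winner] += delta
    ratings[loser] -= delta


def rank(votes):
    """Replay (winner, loser) votes in order and return models sorted by rating."""
    ratings = defaultdict(lambda: START)
    for winner, loser in votes:
        update(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-c", "model-b")]
leaderboard = rank(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Each vote moves the same number of points from the loser to the winner, so the total stays constant, and beating a higher-rated model earns more than beating a lower-rated one.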

Here's a prompt I was playing with, which basically only Claude 2 and GPT-4 answer well:

  How many legs do ten platypuses have, if eleven of them are legless? Platypuses have 3 legs. Walk it through step by step