Hacker News new | ask | show | jobs
by botro 559 days ago
I made https://aimodelreview.com/ to compare the outputs of LLMs over a variety of prompts and categories, allowing a side by side comparison between them. I ran each prompt 4 times for different temperature values and that's available as a toggle.

I was going to add reviews on each model but ran out of steam. Some users have messaged me saying the comparisons are still helpful to them in getting a sense of how different models respond to the same prompt and how temperature affects the same models output on the same prompt.

2 comments

Hey, this is pretty insightful! Wonder if, in the course of researching to build this website you reached any conclusions as to what’s the AI assistant currently ahead.
I can confirm, it's still very helful, thank you!