by nirga 705 days ago
I tend to find classic NLP metrics more predictable and stable than "LLM as a judge" metrics, so I'd try to see if you can rely on them more.

We've written a couple of blog posts about some of them: https://www.traceloop.com/blog
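To make "predictable and stable" concrete, here is a rough sketch of one classic overlap metric, a ROUGE-1-style unigram F1. The comment doesn't name specific metrics, so this choice and the function name `unigram_f1` are assumptions for illustration only. Whitespace tokenization is a simplification.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate and a reference (ROUGE-1-style).

    Hypothetical helper for illustration; tokenizes by lowercasing and
    splitting on whitespace, which real implementations usually refine.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection: each shared token counts min(count_cand, count_ref) times.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

The score is a pure function of the two strings, so the same inputs always give the same number. That is the kind of stability a sampled LLM judge doesn't guarantee.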

1 comments

For your blog, can I offer a big downvote for the massive AI-generated cover image thing? It's a trend for normies, but for developers it's absolutely meaningless. Give us info density pls.
Roger that! I like them though (am I a normie then?)