by typpo 889 days ago
Congrats on the launch!

I've been interested in automatic test set generation because I find that the chore of writing tests is one of the reasons people shy away from evals. I recently landed eval test set generation for promptfoo (https://github.com/typpo/promptfoo), but it is non-RAG, so it is simpler than your implementation.

Was also eyeballing this paper https://arxiv.org/abs/2401.03038, which outlines a method for generating asserts from prompt version history that may also be useful for these eval tools.

1 comment

Thanks! I've been following promptfoo, so I'm glad to see you here. In addition to automatic evals, I think every engineer and PM using LLMs should be looking at as many real responses as they can _every day_, and promptfoo is a great way to do that.