| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by resiros 872 days ago
	Shameless promotion, I might have the tool for that :) https://github.com/agenta-ai/agenta We're building a platform for evaluating prompts (and more complex LLM workflows). From what we've seen from users, the results for prompts are highly stochastic. It's hard to make generalizations. For example, a user building a sales assistant discovered that by simply changing the order of the sentences in the prompt, the accuracy improved significantly.