| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by krawfy 1095 days ago
	For now, we just aggregate those across the models / prompts / templates you're evaluating so that you can get an aggregate score. You can export to CSV, JSON, MongoDB, or Markdown files, and we're working on more persistence features so that you can get a history of which models / prompts / templates you gave the best scores to, and keep track of your manual evaluations over time.