Benchmarking code LLMs using user preference

	Benchmarking code LLMs using user preference (arena.glaive.ai)
	2 points by sahil_chaudhary 1006 days ago

1 comment

This is neat. I feel like it should be blind by default though.