| HN Mirror



	by padolsey 357 days ago
	Agreed! FWIW, I am attempting to create an open-source wiki/watchdog eval platform, weval.org, so we can all keep an eye on LLMs, their biases, and their general competencies without relying on the AI providers marking their own homework. I really believe this needs to exist to express our needs and hold model creators to account, especially as model drift and manipulation become a risk.