Hacker News new | ask | show | jobs
by th0ma5 358 days ago
People should be demanding consistency and traceability from the model vendors checked by some tool perhaps like this. This may tell you when the vendor changed something but there is otherwise no recourse?
1 comments

Agreed! FWIW I am attempting to create an open-source wiki/watchdog eval platform -- weval.org -- , so we can all keep an eye on LLMs, their biases, and their general competencies without relyong in the AI providers marking their own homework. I really believe this needs to exist to express our needs and hold model creators to account. Especially as model drift and manipulation becomes a risk.