|
|
|
|
|
by jampekka
2 hours ago
|
|
I think there's gonna be (or perhaps already is) a huge demand for evaling individual systems. Many countries are starting to adopt some criteria for LLM usage for public use, and I doubt govs are gonna develop in-house knowhow for this. These will likely form some kinds of "independent auditor" models, as the system provider has too strong conflicts of intetest. It's probably not gonna be exactly glorious work, but designing expert evals settings and collecting and crunching the data for quality assurance and control is going to be needed. |
|