Hacker News
by
neelm
1049 days ago
Something like this is going to be needed to evaluate models effectively. Evaluation should be integrated into automated pipelines/workflows that can scale across models and datasets.
1 comment
krawfy
1049 days ago
Thanks Neel! We totally agree that automated evals will become an essential part of production LLM systems.