Hacker News
by
zhisbug
1173 days ago
But it is indeed difficult to evaluate chatbots and LLMs, especially considering that most of them have seen Internet data at least once.
1 comment
zhisbug
1173 days ago
And I do think this is a good effort.