by israrkhan 319 days ago
That series of questions will measure only one particular area. I am concerned about destroying model capabilities in some other area that I do not pay attention to, and having no way of knowing.
1 comment

Isn’t that a general problem with LLMs? The only way to know how good a model is at something is to test it.