by EvgeniyZh
677 days ago
Given the high scores, I guess it was an easy one. I took the longer one and got the following:

> You scored 28/100. The best language model, gpt-4, scored 32/100. The unigram model, which just picks the most common word without reading the prompt, scored 28/100.

Assuming difficulty averages out at N=100, a small test where the LLM scores above ~5 is "easy".