by e12e
677 days ago
you: 8/15
gpt-4o: 2/15
gpt-4: 4/15
gpt-4o-mini: 4/15
llama-2-7b: 5/15
llama-3-8b: 5/15
mistral-7b: 6/15
unigram: 5/15
> You scored 8/15. The best language model, mistral-7b, scored 6/15. The unigram model, which just picks the most common word without reading the prompt, scored 5/15.

(In 120 seconds, I think; I didn't copy that part.)

Interesting that the results differ this much between runs (for the LLMs). Surely someone did better than me on their first run?

Edit: I wonder if the human scores correlate with the age of the HN account?