by tylermarques
129 days ago
In the same vein, we recently released v0.1 of our humor benchmark [1]. We use human answers from a Cards Against Humanity-style game called Bad Cards [2] as ground truth for what is funny. The models choose a card from a hand of 3-6 cards, so it is not quite de novo joke creation.

[1] https://goodstartlabs.com/leaderboards/lol-arena
[2] https://bad.cards/
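
For anyone curious what "pick a card, compare to the human answer" could look like in code, here is a rough sketch. It is not the benchmark's actual scoring, which the comment does not describe. The `Round` type, the `agreement_rate` function, and the `first_card` baseline are all hypothetical names made up for illustration, and plain agreement with a single human pick is an assumed metric.

```python
from dataclasses import dataclass


@dataclass
class Round:
    """One hypothetical game round (field names are assumptions)."""
    prompt: str       # the prompt card
    hand: list[str]   # the 3-6 candidate answer cards
    human_pick: str   # the card the human answer selected (ground truth)


def agreement_rate(rounds, choose):
    """Fraction of rounds where choose(prompt, hand) matches the human pick.

    A pick that is not in the hand counts as a miss.
    """
    if not rounds:
        return 0.0
    hits = 0
    for r in rounds:
        pick = choose(r.prompt, r.hand)
        if pick in r.hand and pick == r.human_pick:
            hits += 1
    return hits / len(rounds)


def first_card(prompt, hand):
    """Trivial baseline: always play the first card in the hand."""
    return hand[0]
```

A model would plug in as the `choose` callable. With hands of 3-6 cards, a uniformly random picker would land somewhere between 1/6 and 1/3 agreement under this assumed metric, which gives a floor to compare against.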