by SomaticPirate
53 days ago
This seems to be testing the models on LeetCode-style prompts that also require the model to implement TCP calls to send the results. Interesting, but probably not an apples-to-apples comparison. The fact that only Grok qualified for the first one seems suspect.