by edg5000
5 days ago
Wow, looks like you've found a massive flaw indeed. I was skeptical about the results because in my experience both recent GPT and Opus models are strong. Everything else is B or C tier. This is just artisanal vibe testing though. It's very hard to eval them properly.