Hacker News
by SparkyMcUnicorn 286 days ago
When it comes to "real-world development scenarios" they claim to be closer to Sonnet 4.

This is the data for that claim: https://huggingface.co/datasets/zai-org/CC-Bench-trajectorie...