Hacker News
by filterfiber 908 days ago
Their second sentence is, at least, the most honest response I've seen so far: "averaged across 4 diverse customer tasks, fine-tunes based on our new model are _slightly_ stronger than GPT-4, as measured by GPT-4 itself."