Hacker News
by potatoman22 616 days ago
I wonder how much of the performance gain can be attributed to their improved dataset rather than their architecture. Isolating the two would be an expensive experiment.
1 comment

The ablation studies and the dataset can be found here: https://www.zyphra.com/post/building-zyda-2