I think the old report you're referencing is this one [1] from July 2025, but I can't find a new report. This [2] links to a new dataset at the bottom (which maybe shows improvements?), but it seems they chose not to write it up because of perceived flaws in their study. Is there a more relevant report I'm missing?
https://arachnemag.substack.com/p/the-metr-graph-is-hot-garb...