> However, the developers were soon to clarify that the 100x claim applies to just a single function, “not the whole of FFmpeg.”
So OP is correct. The 100x speed up is according to some misleading micro benchmark. The reason is that the transform is a huge amount of code, and as OP said, in real use it will blow out the code cache, while the amount of data you’re processing blows out the data cache. Net overall improvement might be 1%, if even that.
And it's probable that the developer is comparing code compiled at -O0 (no optimization) against hand-coded assembler, like they did the last time they claimed a 90x speed up.
So just to summarize: either a 100x or a 100% speedup (depending on which source you read)
- comparing hand-coded assembler vs. unoptimized C code.
- on a function that was poorly written in the first place.
- in code that's so rarely used that nobody could be bothered to fix it for decades.
- and even then, a tiny function that accounted for only about 2% of the CPU cost of the obsolete task that nobody cared about enough to fix.
- so basically code that fails the "profile before you optimize" rule, and should never have been optimized in the first place.
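To put rough numbers on the points above, here is a minimal sketch using Amdahl's law. It assumes the "about 2%" share and the 100x figure quoted in this thread; neither is a measurement, and the function names are made up for illustration.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only `fraction` of the
    runtime is made `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def time_saved_percent(fraction: float, local_speedup: float) -> float:
    """Percentage of total runtime saved by the local speedup."""
    return 100.0 * (1.0 - 1.0 / overall_speedup(fraction, local_speedup))


if __name__ == "__main__":
    # Assumed inputs taken from the thread, not from a profiler:
    # the function is ~2% of the runtime and gets 100x faster.
    share, factor = 0.02, 100.0
    print(f"overall speedup: {overall_speedup(share, factor):.4f}x")
    print(f"runtime saved:   {time_saved_percent(share, factor):.2f}%")
```

Under those assumptions the whole task gets roughly 1.02x faster, which is just under 2% of the runtime saved. Even an infinitely fast replacement could never save more than the 2% the function cost to begin with.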
> The 100x speed up is according to some misleading micro benchmark.
Honestly though, nobody who has any idea how anything works would have expected ffmpeg to suddenly unearth a 100x speedup for everything. That's why the devs did not clarify this right away. It's too laughable of an assumption.