| HN Mirror



	by heipei 331 days ago
	Counterpoint: I once wrote a paper on accelerating block ciphers (AES et al.) using CUDA, and while doing so I realised that most (if not all) previous academic work that had claimed incredible speedups had done so by benchmarking exclusively on zero bytes. Since these block ciphers are implemented using lookup tables, this meant perfect cache hits on every block to be encrypted. Benchmarking on random data painted a very different, and in my opinion more realistic, picture.
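To make the cache argument concrete, here is a minimal Python sketch. It is a toy model, not AES and not the benchmark from the paper. It assumes only that a table-based cipher's first-round lookups are indexed by plaintext byte XOR key byte. The names `first_round_indices`, `zero_hits` and `rand_hits` are made up for illustration. It counts how many distinct table entries get touched for all-zero input versus random input.

```python
import os

BLOCK = 16  # AES block size in bytes


def first_round_indices(plaintext: bytes, key: bytes) -> set:
    """Distinct (byte position, table index) pairs touched in the first round.

    Toy assumption: the lookup index is plaintext byte XOR key byte.
    """
    seen = set()
    for off in range(0, len(plaintext), BLOCK):
        block = plaintext[off:off + BLOCK]
        for pos, (p, k) in enumerate(zip(block, key)):
            seen.add((pos, p ^ k))
    return seen


key = os.urandom(BLOCK)
n_blocks = 4096

zeros = bytes(BLOCK * n_blocks)        # the all-zero benchmark input
rand = os.urandom(BLOCK * n_blocks)    # random benchmark input

zero_hits = len(first_round_indices(zeros, key))
rand_hits = len(first_round_indices(rand, key))

# Zero input hits exactly one entry per byte position, however many blocks
# are processed; random input spreads across (nearly) the whole table.
print("distinct entries, zero input:  ", zero_hits)
print("distinct entries, random input:", rand_hits)
```

With all-zero input every block repeats the same lookups, so the working set stays tiny and stays cached. How much that inflates throughput on a given GPU is something only a real measurement can show.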

5 comments

atiedebee 331 days ago

Why not use real-world data instead? Grab a large PDF or archive and use that as the benchmark.


bluGill 331 days ago

A common case is to compress data before encrypting it, so random data is realistic. A PDF might allow some optimization (how, I don't know) that is not representative.


jbaber 331 days ago

Or at least the canonical test vectors. All zeros is a choice.


almostgotcaught 331 days ago

There are an enormous number of papers like this. I wrote a paper accelerating a small CNN classifier on FPGA, and when I compared against the previous SOTA GPU implementation, the numbers were way off from the paper. I did a git blame on their repo and found that after the paper was published they deleted the lines that short-circuited eval if the sample was all 0 (which much of their synthetic data was ¯\_(ツ)_/¯).


michaelcampbell 329 days ago

I'm sure I'm getting this wrong, but I think I remember someone pointing out that no one could even come close to replicating Borland's Turbo Pascal "compiled lines per second" figure, until someone wrote a legitimate Pascal program that had essentially that number of lines, each containing only ";", or something along those lines.

It WAS still a great compiler, and way faster than the competition at the time.


Marazan 331 days ago

Back when Genetic Algorithms were a hot topic I discovered that a large number of papers discussing optimal parameterisation of the approach (mutation rate, cross-over, population size etc.) were written using '1-Max' as the "problem" to be solved by the GA. (1-Max being the task of making every bit of the candidate string a 1 rather than a 0.)

This literally made the GA encoding exactly the same as the solution and also very, very obviously favoured techniques that would MAKE ALL THE BITS 1!
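A minimal Python sketch of why that benchmark flatters biased operators. This is not taken from any of those papers. The operator names `flip_mutation` and `set_mutation` and the `drift` loop are hypothetical, and there is deliberately no selection step at all. A mutation that just forces bits to 1 "solves" 1-Max without ever looking at fitness.

```python
import random


def onemax(bits):
    """1-Max fitness: the number of 1 bits in the candidate."""
    return sum(bits)


def flip_mutation(bits, rng):
    """Unbiased mutation: flip one randomly chosen bit."""
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


def set_mutation(bits, rng):
    """Biased mutation: force one randomly chosen bit to 1."""
    i = rng.randrange(len(bits))
    return bits[:i] + [1] + bits[i + 1:]


def drift(mutate, n_bits=32, steps=2000, seed=0):
    """Apply the mutation repeatedly with NO selection and report final fitness."""
    rng = random.Random(seed)
    current = [0] * n_bits
    for _ in range(steps):
        current = mutate(current, rng)
    return onemax(current)


biased = drift(set_mutation)     # reaches the optimum with no search at all
unbiased = drift(flip_mutation)  # wanders with no pull toward all ones

print("biased operator, no selection:  ", biased)
print("unbiased operator, no selection:", unbiased)
```

Any parameter study scored on this problem cannot tell "searches well" apart from "tends to write 1s".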


imtringued 330 days ago

This reminds me of this paper:

https://link.springer.com/article/10.1007/s10710-021-09425-5

They do the convergence analysis on the linear system Ax = 0, which means any iteration matrix (including a zero matrix) that produces shrinking numbers will converge to the obvious solution x = 0 and the genetic program just happens to produce iteration matrices with lots of zeros.
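A small Python sketch of the same trap. It is not the paper's setup. The 2x2 matrix `A`, the starting vector and the helper names are made up, and the iteration is simply x <- Mx. With a zero right-hand side, the zero matrix "converges" in one step for any A. The same iterate fails as soon as b is nonzero.

```python
def matvec(m, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def residual_norm(a, x, b):
    """Max-norm of the residual A x - b."""
    ax = matvec(a, x)
    return max(abs(ai - bi) for ai, bi in zip(ax, b))


def iterate(m, x0, steps):
    """Run x <- M x for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = matvec(m, x)
    return x


A = [[4.0, 1.0], [2.0, 3.0]]      # arbitrary example system matrix
ZERO = [[0.0, 0.0], [0.0, 0.0]]   # "iteration matrix" that ignores A entirely
HALF = [[0.5, 0.0], [0.0, 0.5]]   # any shrinking map behaves the same way
x0 = [5.0, -7.0]

x_zero = iterate(ZERO, x0, 1)
x_half = iterate(HALF, x0, 60)

res_h = residual_norm(A, x_zero, [0.0, 0.0])     # homogeneous system: looks solved
res_half = residual_norm(A, x_half, [0.0, 0.0])  # also looks solved
res_b = residual_norm(A, x_zero, [1.0, 2.0])     # nonzero b: clearly not solved

print(res_h, res_half, res_b)
```

Neither `ZERO` nor `HALF` uses any information about `A`, which is why a test on Ax = 0 alone cannot distinguish a real solver from a map that shrinks everything.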


rdc12 331 days ago

Do you have a link to that paper?
