Hacker News
by kazinator 3276 days ago
A string match for the stdout can still be cheated.

For instance, if you ask me to write a program that computes the first 100 digits of pi, I can just have it print a string literal.

Anything that has no inputs, or that has a small input space, or that is known to be tested with only a few known input cases, can be cheated by cooking the output.

A "cheat-resistant" way to verify that something is working is to choose problems that have a large input space, and randomly probe the space.
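To make the random-probing idea concrete, here is a minimal sketch in Python. The task (integer square root), the function names, and the table of "published" test cases are all made up for illustration. The point is only that a cooked solution passes the fixed cases but fails once inputs are drawn at random from a large space and checked against a property instead of a stored answer.

```python
import math
import random

# Hypothetical task: integer square root of a non-negative integer.

def honest_isqrt(n):
    """A real implementation (delegates to the standard library)."""
    return math.isqrt(n)

# Hypothetical "published" test cases that a cheater could memorize.
COOKED = {0: 0, 1: 1, 4: 2, 9: 3, 16: 4}

def cheating_isqrt(n):
    """A cooked solution: correct only on the memorized inputs."""
    return COOKED.get(n, 0)

def passes_fixed_cases(f):
    """The weak check: compare against a small, known set of cases."""
    return all(f(n) == r for n, r in COOKED.items())

def passes_random_probe(f, trials=1000, seed=None):
    """The stronger check: sample a large input space at random and
    verify the defining property r*r <= n < (r+1)*(r+1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randrange(10**12)
        r = f(n)
        if not (r * r <= n < (r + 1) * (r + 1)):
            return False
    return True
```

Both functions pass `passes_fixed_cases`, but only `honest_isqrt` should survive `passes_random_probe`. Checking a property rather than a stored answer also means the verifier needs no reference outputs that could leak.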

Famous examples of this kind of cheating have occurred in compiler benchmarks. A compiler can recognize that the program being fed to it is a known benchmark, and produce an optimization of the benchmark as a whole. That is: "if the abstract syntax tree of 279 nodes is exactly this particular one, spit out this canned piece of code which 'translates' it."
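A toy sketch of that trick, in Python, using the standard `ast` module. Everything here is hypothetical: `toy_compile`, the benchmark source, and the canned output are invented for illustration and do not describe any real compiler. The "compiler" compares the syntax tree of its input against one known program and, on an exact match, emits a precomputed result instead of translating anything.

```python
import ast

# Hypothetical benchmark program that the "compiler" has been taught to spot.
KNOWN_BENCHMARK = "total = 0\nfor i in range(1000):\n    total += i * i\n"

# Fingerprint of its abstract syntax tree (ignores whitespace and comments).
KNOWN_FINGERPRINT = ast.dump(ast.parse(KNOWN_BENCHMARK))

# Canned "translation": the answer computed ahead of time, no loop at all.
CANNED_OUTPUT = "total = 332833500\n"

def toy_compile(source):
    """Return 'compiled' code for source (here, just Python text)."""
    if ast.dump(ast.parse(source)) == KNOWN_FINGERPRINT:
        # Exact match with the known benchmark: emit the canned code.
        return CANNED_OUTPUT
    # Anything else gets no special treatment in this sketch.
    return source
```

Changing the benchmark even slightly, say `range(1000)` to `range(1001)`, defeats the match, which is the same reason randomly varying the inputs exposes this kind of cheat.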