The Cornell experiment, however, contained a hidden wild card. The
specification required that an output data stream be formed through
a series of manipulations on numbers in the input data stream. For
example, participants had to shift each number two digits to the
left and then divide by one hundred and so on, perhaps completing
a dozen operations in total. Although the specification never said
it, the net effect of all the operations was that each output number
was necessarily equal to its input number. Some people realized
this and others did not. Of those who figured it out, the overwhelming
majority came from the quiet room.
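
To make the "wild card" concrete, here is a minimal sketch of the two steps the passage actually names: shifting a number two digits to the left, then dividing by one hundred. The remaining operations (roughly a dozen in total) are not given in the quote, so they are left out. The function names are made up for illustration, and exact fractions are used so that floating-point rounding does not get in the way.

```python
from fractions import Fraction

def shift_left_two_digits(x):
    # Shifting a decimal number two digits to the left multiplies it by 100.
    return x * 100

def divide_by_hundred(x):
    return x / 100

def naive_pipeline(x):
    # Follows the specification literally, one operation at a time.
    # Only the two steps named in the quote are modeled here.
    x = Fraction(x)
    x = shift_left_two_digits(x)
    x = divide_by_hundred(x)
    return x

def shortcut(x):
    # What the participants who spotted the pattern could write instead:
    # the operations cancel, so the output equals the input.
    return Fraction(x)

inputs = [7, 42, 1234]
assert all(naive_pipeline(n) == shortcut(n) for n in inputs)
```

Both functions return the same values, but the second one is a single line to write and maintain.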
I think the point was that the programmers working with music produced more complex solutions. Over the lifetime of a project, that complexity adds up.
Only the median was the same. My reading is that a quiet work environment "unlocked" the ability to see the shortcut for a small number of workers. Most still used the normal method and performed the same as the workers with music, hence the same median.
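
A rough sketch of why the median can hide this effect. The scores below are invented purely for illustration and are not data from the Cornell experiment. The only point is that moving a couple of values at the top of one group leaves its median where it was.

```python
from statistics import mean, median

# Hypothetical scores, NOT real experimental data.
# In the "quiet" group, two participants spot the shortcut and score much higher;
# everyone else performs the same as their counterpart in the "music" group.
quiet = [50, 55, 60, 65, 70, 95, 98]
music = [50, 55, 60, 65, 70, 72, 75]

# The middle value is untouched by what happens at the top of the list.
same_median = median(quiet) == median(music)

# The mean, by contrast, does shift when a few values move.
quiet_mean_higher = mean(quiet) > mean(music)

print("medians equal:", same_median)
print("quiet mean higher:", quiet_mean_higher)
```

With these made-up numbers both groups have a median of 65, even though only the quiet group contains the high scorers.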