I see, thanks for the context, it seems like a PITA.
But since each database system has its own flavor of SQL, the vanilla TPC benchmarks may not work out of the box, so one needs to tweak them a bit. That tweaking might be what actually keeps the clauses above from applying to the published results.
I also suspect that some of those who publish results are taking advantage of the combination of clauses (2) and (3).