5. run them long enough to stabilize thermally (potentially minutes) so you don't get bursts of performance, and/or use something like perflock to lock down your CPU speed: https://github.com/aclements/perflock (this is a general tactic; perflock is just the last tool I heard of)
6. for similar reasons, run them in a random order each time
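
A rough sketch of points 5 and 6 together, in Python: warm up for a fixed time, then shuffle the run order on every round so no benchmark always benefits (or suffers) from running first. The names `run_benchmarks`, `workload_a`, and `workload_b` are made up for illustration, and a 0.1 s warm-up is far shorter than real thermal stabilization would need.

```python
import random
import statistics
import time


def run_benchmarks(benchmarks, rounds=5, warmup_seconds=0.0, seed=None):
    """Time each callable once per round, reshuffling the order every round."""
    rng = random.Random(seed)
    names = list(benchmarks)
    samples = {name: [] for name in names}

    # Warm-up phase: run everything repeatedly and discard the timings.
    deadline = time.perf_counter() + warmup_seconds
    while time.perf_counter() < deadline:
        for name in names:
            benchmarks[name]()

    for _ in range(rounds):
        rng.shuffle(names)  # new random order for this round
        for name in names:
            start = time.perf_counter()
            benchmarks[name]()
            samples[name].append(time.perf_counter() - start)
    return samples


def workload_a():
    # Hypothetical workload: sum of squares.
    return sum(i * i for i in range(10_000))


def workload_b():
    # Hypothetical workload: sort a reversed range.
    return sorted(range(10_000, 0, -1))


if __name__ == "__main__":
    results = run_benchmarks(
        {"a": workload_a, "b": workload_b}, rounds=5, warmup_seconds=0.1
    )
    for name, times in sorted(results.items()):
        median_us = statistics.median(times) * 1e6
        print(f"{name}: median {median_us:.1f} us over {len(times)} runs")
```

Reporting the median rather than the mean is just one choice here; it makes a single slow outlier round matter less.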
Or use a framework that helps you do so, like "benchmark" for C++ (though I wonder if there is a cross-language one that does the same amount of pre-warming, uses the same counters, etc.)