So these hashes are distributed. How do we know they haven't been altered by the time of the future reveal? How does one find the "true" dataset? Trust random people on the internet?
No. Since it's public, if you care that much about it, you keep a periodic snapshot of the hashes yourself. So you're trusting yourself (and math). What's missing for this to be a "blockchain", in what should be the generally accepted definition, is Byzantine fault tolerance.
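To make the snapshot idea concrete, here is a minimal sketch, assuming the published hashes form an append-only list of hex strings. The function name `changed_entries` is made up for illustration and is not part of any real tool.

```python
def changed_entries(snapshot, current):
    """Return the indices of previously seen hashes that were altered or removed.

    Assumption: the published list is append-only, so every entry in your
    old snapshot should still appear, unchanged, at the same position.
    """
    changed = [
        i for i, (old, new) in enumerate(zip(snapshot, current))
        if old != new
    ]
    # Entries that existed in the snapshot but are missing now count as altered.
    changed.extend(range(len(current), len(snapshot)))
    return changed
```

An empty result means everything you saved earlier is still there unchanged. New entries appended after your snapshot are ignored, since you have nothing to compare them against yet.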
Right, so save some time: put the hashes in a Merkle tree and publish the root on the Bitcoin blockchain. Now the proof can be found trustlessly, and the longest (most-work) PoW chain points to the true dataset.
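For anyone who wants to see what that looks like, a rough sketch of the Merkle part follows. It assumes single SHA-256 and duplicates the last node on odd-sized levels; this is a simplification and not Bitcoin's exact serialization. The names `merkle_root`, `merkle_proof` and `verify` are illustrative only. Only the 32-byte root would need to go on chain (for example in an OP_RETURN output), and anyone holding a leaf plus its proof can check it against that root.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Pair up nodes and hash each pair; duplicate the last node if the count is odd."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Compute the root over a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Return the sibling path for leaves[index] as (hash, sibling_is_left) pairs."""
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # the other node in this pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from a leaf up to the root and compare."""
    node = sha256(leaf)
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root
```

The proof is logarithmic in the number of hashes, so checking one entry does not require downloading the whole dataset.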