If you want to use untargeted metagenomics to detect novel human viruses, you're going to be generating petabytes all by yourself: https://arxiv.org/pdf/2108.02678.pdf
I can't see any reason why you would need to save petabytes. Remember: at that scale, people think really hard about whether to pay for long-term storage and its associated costs (the value of having this system should exceed its costs). The case for this already exists in, for example, cancer and other pharma.
The storage is massively cheaper than the sequencing. At some point it could be worth going back and trying to figure out how much of the raw data you can safely discard, but at least at first there are so many other things that are more urgent.
(The paper I linked describes more or less what I'm currently working on)