DVC would definitely be a good fit, and we have a ticket on our roadmap to integrate Replicate with DVC, Tecton, etc.: https://github.com/replicate/replicate/issues/294
We also have a roadmap ticket for grouping experiments (https://github.com/replicate/replicate/issues/297), but for now we recommend using params for tags as well.
If you have ideas for the design of these features, we'd really appreciate feedback and comments on these GitHub issues!