How much test data is acceptable in these suites? Grinding through 20 GB of GIS data is something I end up doing fairly often, but a dataset that size might not fit in your repo.
If a public mirror of the dataset is available, size is not much of a concern, given that the download only happens when the user explicitly asks to run that test.
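To make the idea concrete, here is a rough sketch of what an opt-in download could look like. It is only an illustration, not how your suite works today: the `RUN_LARGE_DATA_TESTS` variable, the helper names, and the cache layout are all made up for the example, and the mirror URL and checksum would come from wherever the dataset is actually hosted.

```python
import hashlib
import os
import shutil
import urllib.request
from pathlib import Path

# Hypothetical opt-in switch; the name is an assumption for this sketch.
OPT_IN_VAR = "RUN_LARGE_DATA_TESTS"


def large_tests_enabled(environ=os.environ):
    """Return True only when the user explicitly opted in."""
    return environ.get(OPT_IN_VAR) == "1"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a multi-GB dataset never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_dataset(url, cache_dir, expected_sha256):
    """Download `url` into `cache_dir` once and verify its checksum.

    The cached file is named after its checksum, so a later run with the
    same checksum reuses it instead of downloading again.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / expected_sha256
    if target.exists() and sha256_of(target) == expected_sha256:
        return target

    # Download to a temporary name so an interrupted run leaves no
    # half-written file under the final name.
    partial = target.with_suffix(".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)

    if sha256_of(partial) != expected_sha256:
        partial.unlink()
        raise ValueError("checksum mismatch for %s" % url)
    partial.replace(target)
    return target
```

A test that needs the big dataset could then be guarded with something like `@unittest.skipUnless(large_tests_enabled(), "set RUN_LARGE_DATA_TESTS=1 to run")` and call `fetch_dataset(...)` in its setup, so a default test run never touches the network and the repo only has to carry a URL and a checksum.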