Thanks for the feedback! I'm thinking about ways to prevent loading broken websites. I'm not sure it's possible to filter for only a certain type of website, though; I think there are way too many sites for that.
I would say 75%+ of all the working sites were parked or expired pages. I would suggest removing, or redirecting away from, any sites that resolve to known registrar parking-page IPs (perhaps only when those IPs are distinct from the registrar's web-hosting cluster IPs, where actual hosting customers' websites might live). That might be a good start to at least prune a lot of the parked sites.
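Here is a rough Python sketch of that IP-based pruning, not a finished implementation. The names `PARKING_IPS`, `SHARED_HOSTING_IPS`, `is_parked`, and `prune_parked` are made up for illustration, and the addresses are placeholders from the reserved documentation ranges; a real list of registrar parking IPs would have to be collected separately. It assumes a site counts as parked only if every address it resolves to is a dedicated parking IP.

```python
import socket
from typing import Callable, Iterable, List, Set

# Placeholder addresses (RFC 5737 documentation range), NOT real parking IPs.
PARKING_IPS: Set[str] = {"192.0.2.10", "192.0.2.11"}
# Assumed: parking IPs that also serve real hosting customers, so they are
# not treated as a parking signal.
SHARED_HOSTING_IPS: Set[str] = {"192.0.2.11"}


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses a hostname resolves to (empty set on failure)."""
    try:
        infos = socket.getaddrinfo(
            hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except (socket.gaierror, UnicodeError):
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def is_parked(
    hostname: str,
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
    parking_ips: Iterable[str] = PARKING_IPS,
    shared_ips: Iterable[str] = SHARED_HOSTING_IPS,
) -> bool:
    """True if the host resolves only to dedicated (non-shared) parking IPs."""
    dedicated = set(parking_ips) - set(shared_ips)
    addresses = resolver(hostname)
    return bool(addresses) and addresses <= dedicated


def prune_parked(
    hostnames: Iterable[str],
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> List[str]:
    """Keep only the hostnames that do not look parked by the IP check."""
    return [h for h in hostnames if not is_parked(h, resolver=resolver)]
```

The resolver is passed in as a parameter so the check can be tried against a fixed table of hostnames without touching the network. Hosts that fail to resolve are not flagged as parked here; catching those would be a separate broken-site check.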