This. The thing to remember is that you always have to set beresp.ttl in vcl_fetch when you return hit-for-pass. Varnish caches the hit-for-pass decision itself as an object under that cache key, so if you do a hit-for-pass and your TTL is 1 hour, Varnish will pass every request for that cache key for the next hour without re-evaluating that decision.
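A minimal sketch of what that looks like, assuming Varnish 3 syntax (where vcl_fetch still exists). The Set-Cookie condition and the 120s value are only illustrative assumptions; use whatever condition and TTL fit your setup:

```vcl
sub vcl_fetch {
    # Illustrative condition: treat responses that set a cookie as uncacheable.
    if (beresp.http.Set-Cookie) {
        # This TTL is the lifetime of the hit-for-pass object, i.e. how long
        # Varnish keeps passing this cache key before reconsidering.
        # Keep it short so one bad response doesn't disable caching for long.
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);
}
```

Without the explicit `set beresp.ttl`, the hit-for-pass object lives for whatever TTL Varnish derived from the response headers or default_ttl, which may be much longer than you want.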
Except we didn't have that. These were 99.9% content pages, dynamically generated but nearly static. They all had roughly the same headers, which hadn't really changed, and they were not even aware of Varnish.
Response headers were the first thing we checked; only a few of them affect Varnish: