(And yes I have no idea what Github's backend does, but I can't imagine "duplicate all data for every commit" would be a feasible implementation strategy.)