This is weird. The Smithsonian page [1] says "Data hosting provided by AWS Public Dataset Program", and there's the Amazon blog post you linked, but the dataset seems to be missing from the registry Amazon publishes [2]. I guess no one has submitted a pull request yet? [3]