by wfunction 3237 days ago
Oh I see, thanks, I didn't know. But man, 300 GB per game sounds completely nuts!
2 comments

No, total. For comparison, they quote the replay files at, what was it, 5GB? It's a classic space-time tradeoff, but in deep learning right now, hard drives are far cheaper than CPUs/GPUs. Replaying the games every time you need individual datapoints would probably be at least twice as slow, while anyone can easily store 300GB these days.
I believe the 400GB is the total for the 65000 different game replays.
@wfunction: yes, TorchCraft includes a serializer that compresses the useful game state into a relatively small struct. That is then further compressed with other tricks and zstd.
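To make the idea concrete, here is a rough sketch of "pack the useful state into a small struct, then compress it". It is not TorchCraft's actual serializer: the record layout and field names are made up for illustration, and zlib from the Python standard library stands in for zstd.

```python
import struct
import zlib

# Hypothetical per-unit record: id, type, x, y, health.
# This is NOT TorchCraft's real layout, just an illustration.
UNIT = struct.Struct("<HHhhH")  # 10 bytes per unit


def pack_frame(units):
    """Serialize one frame: a unit count followed by fixed-size unit records."""
    body = b"".join(UNIT.pack(*u) for u in units)
    return struct.pack("<H", len(units)) + body


def pack_replay(frames):
    """Concatenate all frames and compress them; zlib stands in for zstd."""
    raw = b"".join(pack_frame(f) for f in frames)
    return raw, zlib.compress(raw, 9)


# Toy replay: 1000 frames of 50 units that barely change between frames.
frames = [
    [(i, 7, 100 + i, 200 + (t // 100), 40) for i in range(50)]
    for t in range(1000)
]
raw, packed = pack_replay(frames)
print(len(raw), "bytes raw,", len(packed), "bytes compressed")
```

Since consecutive frames are nearly identical, a general-purpose compressor shrinks this kind of data a lot, which is presumably part of why per-frame state can end up that small.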
Oh, but how does that work? That's ~6 MB per game, which sounds like just a list of actions rather than precomputed data per frame. Is it compressed somehow?
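A quick sanity check of that estimate, using the figures quoted in this thread (365 GB compressed, 1535 million frames, 496 million player actions, 65000 replays). It assumes "GB" means decimal gigabytes and that all four numbers describe the same dataset:

```python
total_bytes = 365e9   # "365 GB", taken as decimal gigabytes (assumption)
frames = 1535e6       # 1535 million frames
actions = 496e6       # 496 million player actions
games = 65_000        # replay count mentioned in the thread

mb_per_game = total_bytes / games / 1e6
frames_per_game = frames / games
bytes_per_frame = total_bytes / frames
frames_per_action = frames / actions

print(f"{mb_per_game:.1f} MB per game")              # 5.6 MB per game
print(f"{frames_per_game:.0f} frames per game")      # 23615 frames per game
print(f"{bytes_per_frame:.0f} bytes per frame")      # 238 bytes per frame
print(f"{frames_per_action:.1f} frames per action")  # 3.1 frames per action
```

So on average that is a couple of hundred bytes per frame after compression, which is small but still more than a bare action list would need.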
"The full dataset after compression is 365 GB, 1535 million frames, and 496 million player actions." - Yes