by zifnab06
2437 days ago
GitHub is primarily Rails/MySQL (or was the last time I paid attention to any of their blogs), so I'm guessing they're storing dates as a TIMESTAMP and not a DATETIME (4 bytes vs 8 bytes). GitHub's BigQuery public data set has 234,759,841 unique commits, and it appears there are two dates per commit (author and committer dates). So switching to DATETIME would be an extra ~1.8GB per master/shard group (4 extra bytes × 2 dates × ~235M commits). Entirely doable, but I have no idea what their scale actually is or how that translates to network throughput or anything else, really.
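For anyone who wants to check the math, here's a rough sketch of that estimate in Python. The 4- and 8-byte sizes are just the ones assumed above (actual storage depends on the MySQL version and column definition), and the constant and function names are made up for illustration.

```python
# Back-of-the-envelope estimate of the extra storage from widening date columns.
# All sizes are assumptions taken from the comment, not measured values.
COMMITS = 234_759_841      # unique commits in GitHub's BigQuery public data set
DATES_PER_COMMIT = 2       # author date and committer date
TIMESTAMP_BYTES = 4        # assumed size of a TIMESTAMP column
DATETIME_BYTES = 8         # assumed size of a DATETIME column


def extra_bytes(commits=COMMITS, dates=DATES_PER_COMMIT,
                old=TIMESTAMP_BYTES, new=DATETIME_BYTES):
    """Return the additional bytes needed when each date grows from old to new."""
    return commits * dates * (new - old)


total = extra_bytes()
print(f"{total:,} bytes = {total / 1e9:.2f} GB = {total / 2**30:.2f} GiB")
```

That's raw column data only; indexes, replication, and backups would add to it.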