I missed that he's not storing plain activations for those graphs; he's storing activations + batch norm. See my edit.