by sillysaurusx
2337 days ago
The link is talking about per-core memory. A TPUv2-8 has 300GB of system memory, which you can use for training. You can verify this using the notebooks above. (If a TPUv2-8 had only 64GB of memory, how could it fine-tune GPT-2 1.5B using Adam with a batch size of 4? That requires almost 300GB.)
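For a rough sense of where training memory goes, here is a back-of-the-envelope sketch of just the fixed per-parameter state under Adam. It assumes fp32 values throughout and a round 1.5e9 parameters (both are assumptions, and `adam_state_bytes` is a hypothetical helper, not part of any library). It does not estimate activation memory, which grows with batch size and sequence length and comes on top of this figure.

```python
# Rough estimate of the fixed training state when using Adam.
# Assumptions (not from the thread): every value is stored in fp32,
# and the model has a round 1.5e9 parameters.

BYTES_PER_FP32 = 4


def adam_state_bytes(n_params: int, bytes_per_value: int = BYTES_PER_FP32) -> dict:
    """Return the per-component memory (in bytes) for weights, gradients, and Adam moments."""
    weights = n_params * bytes_per_value
    gradients = n_params * bytes_per_value
    # Adam keeps two running moments (first and second) per parameter.
    adam_moments = 2 * n_params * bytes_per_value
    return {
        "weights": weights,
        "gradients": gradients,
        "adam_moments": adam_moments,
        "total": weights + gradients + adam_moments,
    }


estimate = adam_state_bytes(1_500_000_000)
for name, size in estimate.items():
    print(f"{name:>12}: {size / 1e9:.1f} GB")
```

Under these assumptions the fixed state comes to 24 GB; activations are not included in that number.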
Are you paying on-demand or preemptible prices? Have you tried larger pod slices to see if they have even more of this “system memory”?