by JonChesterfield
1431 days ago
Slightly. The older tech is 64 threads/lanes per warp/wavefront. Newer ones are 32 by default but 64 if desired. The bigger differences are the per-thread instruction counter on Nvidia since Volta (which I think is a terrible feature) and that forward progress guarantees are stronger on Nvidia (those are _really_ helpful but expensive).
> which I think is a terrible feature

vs.

> those are _really_ helpful but expensive

Isn't guaranteed forward progress a direct consequence of having an instruction counter per thread?
Or so I thought. How else would an SM be able to know the PC of a group of threads that wasn’t stuck?
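
To make the question concrete, here is a toy Python model of lanes in one warp contending for a spin lock. It is only a sketch under loud assumptions, not a description of how any real SM schedules: `run_warp`, the three-state program (`SPIN`, `CRITICAL`, `DONE`), the "spinning branch runs first" rule for the shared-PC case, and the round-robin scheduler for the per-lane case are all made up for illustration.

```python
SPIN, CRITICAL, DONE = range(3)  # toy per-lane program states (hypothetical)


def run_warp(num_lanes, per_lane_pc, max_steps=1000):
    """Return how many lanes got through the critical section within max_steps."""
    pc = [SPIN] * num_lanes      # where each lane is in the toy program
    lock_held = False
    finished = 0

    for step in range(max_steps):
        live = [i for i in range(num_lanes) if pc[i] != DONE]
        if not live:
            break

        if per_lane_pc:
            # Assumed fair scheduler: rotate over the distinct PCs present.
            groups = sorted({pc[i] for i in live})
            target = groups[step % len(groups)]
        else:
            # Assumed single warp PC: lanes that left the spin loop stay masked
            # off at the reconvergence point until no lane is still spinning.
            target = SPIN if any(pc[i] == SPIN for i in live) else CRITICAL

        for i in live:
            if pc[i] != target:
                continue
            if pc[i] == SPIN:
                if not lock_held:        # models a successful atomic CAS
                    lock_held = True
                    pc[i] = CRITICAL
            else:                        # CRITICAL: release the lock and finish
                lock_held = False
                finished += 1
                pc[i] = DONE

    return finished


print(run_warp(4, per_lane_pc=True))   # 4
print(run_warp(4, per_lane_pc=False))  # 0
```

In this model the shared-PC warp never gets past the first acquire: the lane holding the lock is parked while the others spin forever. With a PC per lane, the scheduler can issue the lock holder's next instruction, so every lane eventually finishes. Whether real hardware could give the same guarantee some other way is exactly the open question.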