by e-kayrakli
889 days ago
This is a good question, and I completely agree that programming them can feel... strange. We approach this as two separate problems:

1. Can Chapel utilize tensor cores under the hood? We have LinearAlgebra, BLAS, and LAPACK modules in Chapel. They have not been integrated with the GPU support so far, but we want them to be able to use libraries like cuBLAS under the hood for GPU-allocated arrays. That model can enable Chapel to exploit tensor cores efficiently and seamlessly. Of course, you can extrapolate this to ML/AI APIs and potential Chapel modules that correspond to them, for example.

2. Can Chapel enable general-purpose tensor programming? This is definitely more challenging; the main question is whether we can make tensor programming portable, so that the same code can also run on a non-GPU (and non-vector) processor. A relatively shorter-term solution is probably to provide a low-level interface that's not much different from CUDA's Warp Matrix Functions or ROCm's rocWMMA interfaces, and live with that for a while even though it is not portable.

One of the thoughts that I can't ignore is whether general-purpose tensor programming is _that_ common compared to using tensor cores under the hood through some library, as I described in (1).
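
For context on what "not much different than CUDA's Warp Matrix Functions" would mean in practice, here is a minimal sketch of that interface as it exists in plain CUDA today. It is not Chapel code and not a proposal for a Chapel API. The kernel name `wmma_tile`, the single 16x16x16 tile, and the all-ones test data are assumptions made up for illustration. It also assumes a GPU with tensor cores (compute capability 7.0 or newer) and a build along the lines of `nvcc -arch=sm_70`.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical kernel: one warp computes a single 16x16 tile C = A * B,
// with half-precision inputs and float accumulation.
__global__ void wmma_tile(const half *a, const half *b, float *c) {
    // Fragments are opaque, warp-wide register tiles.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);           // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // C = A * B + C
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    const int n = 16 * 16;
    half *a, *b;
    float *c;
    cudaMallocManaged(&a, n * sizeof(half));
    cudaMallocManaged(&b, n * sizeof(half));
    cudaMallocManaged(&c, n * sizeof(float));

    // Illustrative data: all ones, so every entry of C is a sum of 16 ones.
    for (int i = 0; i < n; ++i) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
    }

    // WMMA operations are warp-wide, so launch exactly one warp (32 threads).
    wmma_tile<<<1, 32>>>(a, b, c);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Count entries that differ from the expected value of 16.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (c[i] != 16.0f) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return mismatches == 0 ? 0 : 1;
}
```

The fixed tile shapes, the opaque fragment types, and the warp-synchronous calls are the parts that have no obvious counterpart on a CPU, which is the portability problem described in (2).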