It’s vendor-agnostic, so it uses HLSL instead of CUDA or Triton. Here are the compute shaders implementing inference of the Mistral-7B model: https://github.com/Const-me/Cgml/tree/master/Mistral/Mistral...