Hacker News new | ask | show | jobs
by mathisfun123 1204 days ago
basically all correct but

>You can then hypothetically lower (compiler terminology) it to a TensorRT MLIR dialect that then in turn runs on the Nvidia GPU.

there's no tensorrt dialect (there are nvgpu and nvvm dialects) nor would there be as tensorrt is primarily a runtime (although arguably dialects like omp and spirv basically model runtime calls).

2 comments

Good catch and good point. What I was thinking was NVVM dialect. You're right on TensorRT being mostly a runtime.
TensorFlow is also a runtime, yet we model its dataflow graph (the input to the runtime) as a dialect, same for ONNX. TensorRT isn't that different actually.