by melihelibol
6 days ago
Yes! That's exactly the spirit. The readable, single-binary, kernels-included codebase is a big part of what makes Grout fun, and the antirez parallel is accurate. There's a parallel to ThunderKittens too, on the kernel-authoring side: tile-based abstractions for writing fast kernels. The twist with cuTile Rust is the safety layer on top, carrying Rust's ownership model into the kernels: it's a safe, high-performance programming model, not just a perf DSL. The safe surface API is fairly domain-specific today (dense tensor/tile ops), and the Tile IR compiler is still maturing, but it's showing real promise for sparse and multi-GPU. Excited to see where those go. :)