by
qeternity
196 days ago
PyTorch is only part of it. There is still a huge amount of CUDA that isn’t just wrapped by PyTorch and isn’t easily portable.
svara
196 days ago
... but not in deep learning, or am I missing something important here?
qeternity
196 days ago
Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.
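For anyone unfamiliar with what "fused" means here, a rough conceptual sketch in plain Python (not CUDA, and not taken from any real codebase; the function names are made up for illustration). The unfused version makes two passes and materializes an intermediate buffer, the fused version does the same math in one pass. On a GPU the analogous saving is fewer kernel launches and fewer round trips to global memory, and that is the part that tends to get hand-written in CUDA.

```python
def unfused_bias_relu(x, bias):
    # Hypothetical helper: two separate "kernels".
    # Pass 1: add the bias and materialize an intermediate buffer.
    tmp = [xi + b for xi, b in zip(x, bias)]
    # Pass 2: re-read the intermediate buffer to apply ReLU.
    return [max(t, 0.0) for t in tmp]


def fused_bias_relu(x, bias):
    # Hypothetical helper: one "kernel".
    # Each element is read once, transformed, and written once.
    return [max(xi + b, 0.0) for xi, b in zip(x, bias)]


x = [-1.0, 0.5, 2.0]
bias = [0.25, -1.0, 0.0]

# Both versions compute the same result; only the memory traffic differs.
assert unfused_bias_relu(x, bias) == fused_bias_relu(x, bias)
```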
Scene_Cast2
196 days ago
Yep. MoE, FlashAttention, or sparse retrieval architectures for example.