Hacker News
Show HN: Pure CUDA C Inference for Qwen3 0.6B in One File, No Dependencies (github.com)
1 point by yb0000 328 days ago