Hacker News
by logotype 61 days ago
Try it out and see how fast inference can be for agentic workflows. Works on CUDA and ROCm. Feedback appreciated!