Hacker News
A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly (github.com)
27 points by monax 40 days ago