Hacker News
1.58-bit LLMs: thousands of tokens/sec, reduced computing footprint (huggingface.co)
1 point by Rabot 847 days ago