Hacker News
Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding (arxiv.org)
4 points by chrsw 17 days ago