by oezi 424 days ago
I didn't realize that the context would require so much memory. Is this the KV cache? It would seem like a big advantage if this memory requirement could be reduced.