by bluegatty
41 days ago
'A million tokens of context' can literally mean terabytes of KV cache in VRAM on very expensive Nvidia silicon, but that is on the model side. On the agent side, yes, the context window does relate to RAM, because the entire conversational history is generally kept in memory. So ballpark 1M 'words' across a bunch of strings, which is a few megabytes of text. It's not that much. Claude Code is not inefficient because 'it's not Rust'; it's probably just not very efficiently designed. Rust does not bestow magical properties that make memory usage dramatically more efficient. A bit more, but not enough to change this situation. 'Doing it in Rust' might yield amazing returns simply because the very nature of the activity is 'optimization'.
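To put rough numbers on the model-side vs. agent-side gap, here is a back-of-the-envelope sketch. The model shape (80 layers, 64 KV heads, head dimension 128, fp16) and the 4 bytes of text per token are assumptions picked for illustration, not the configuration of any specific model or of Claude Code, and both function names are hypothetical.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value):
    """Approximate KV cache size: one key and one value vector
    per token, per layer, per KV head."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * tokens


def agent_text_bytes(tokens, bytes_per_token=4):
    """Approximate size of the raw conversation text held by the agent.
    4 bytes per token is an assumed average, not a measured one."""
    return tokens * bytes_per_token


TOKENS = 1_000_000

# Assumed dense model with full multi-head attention, fp16 cache.
full_mha = kv_cache_bytes(TOKENS, layers=80, kv_heads=64, head_dim=128, bytes_per_value=2)

# Same assumed model with grouped-query attention (8 KV heads).
gqa = kv_cache_bytes(TOKENS, layers=80, kv_heads=8, head_dim=128, bytes_per_value=2)

agent = agent_text_bytes(TOKENS)

print(f"KV cache, full MHA: {full_mha / 1e12:.2f} TB")
print(f"KV cache, GQA:      {gqa / 1e9:.2f} GB")
print(f"Agent-side text:    {agent / 1e6:.2f} MB")
```

Under these assumptions the cache lands in the terabyte range without grouped-query attention and in the hundreds of gigabytes with it, while the text the agent has to hold is on the order of megabytes. Whatever the agent's memory footprint is, the strings themselves are not what is filling RAM.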
Of course, any reasonably idiomatic Rust is going to run circles around TS transpiled into JIT-compiled JS.