by SwellJoe
6 days ago
As far as I can tell, you'll have a context limit of about 64k, which is also prohibitive for serious work. (My benchmark maxes out at 90k of context when running, so I'm giving the self-hosted models 128k to leave plenty of wiggle room.) But still, it's cool that the work is happening. For some classes of problem it might be an option, and when the 192GB Strix Halo comes out, DS4 will probably become a real contender for self-hosting champ, as that leaves enough memory for a big context.
Source? The author has already demoed a 100k context, and I can't think of a reason why more wouldn't be supported. RAM is a bit tight, but that only matters with really long contexts on DeepSeek V4, and proper support for SSD streaming would address this anyway.
BTW, the official support is now merged too.