Hacker News
by NitpickLawyer 26 days ago
M2.7 is no longer open source; it's been changed to an NC license. It's an OK model, but IME, out of the big 5 Chinese models (ds, glm, kimi, minimax and qwen), DS models have generally shown better generalisation and real-world usage than all the others, even if the benchmark scores were lower. Less benchmaxxxing, basically.

DS4 also has some neat new arch improvements, giving it a lot of context at lower VRAM usage. So it will be cheaper to serve, B for B, than previous models.


M2.7 was never open source, only open weight, which fulfills a lot of the spirit of open source, but isn't really the same thing as a whole. The noncommercial license is basically impossible to enforce if you're self-hosting anyway, because it's essentially impossible to prove that any individual commit was made by Minimax M2.7 in an environment where multiple self-hosted models are being run side-by-side. Besides that, you're not obligated to abide by terms you never agreed to in the first place, and you don't need to agree to anyone's terms to download open weights from a peer or over a torrent. These weights amount to public information that freely exists and is shared in the commons; not a scarce, rivalrous good; not copyrighted works; not sensitive intellectual property.

The weights may nominally be legally copyrighted, but the rightsholder certainly doesn't seem to be making anything resembling a serious effort to actually assert or defend those rights; on the contrary, they are doing the exact opposite by maximizing the gratis distribution, including knowingly and willingly via third parties, with no copy protection whatsoever, and no reasonable expectation of non-distribution.

They are not behaving like an entity trying to protect valuable intellectual property; they are behaving like an entity trying to reap the reputational and network effect benefits of maximizing the free distribution of a public good.

Less memory usage by the KV cache doesn't mean cheaper to serve overall. You need more hardware to serve DS4L than Minimax M2.7, since the former is a model with ~54B more total params to begin with, which KV cache memory efficiency does nothing to address. Once you've acquired that hardware, the capex cost is basically fixed and opex just comes down to power draw, which will be marginally higher per token with DS4L than with M2.7 owing to the slower speeds that result from 13B active params vs 10B active params on forward passes during TG.

KV cache size is the main constraint on batching (for any given ctx length), and that's a huge deal for efficiency both locally and in the data center. DeepSeek V4's reduced KV requirement is a real game changer: it definitively unlocks batching requests together for local inference, not just at scale.
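
To make the batching point concrete, here is a rough back-of-the-envelope sketch. The helper name and every number in it (free memory, context length, KV bytes per token) are made up for illustration and are not measurements of DeepSeek V4 or any other model; it only shows that, for a fixed context length, shrinking the per-token KV footprint raises the number of sequences that fit side by side.

```python
# Hypothetical helper: how many sequences fit in the memory left over after weights.
# All numbers below are illustrative assumptions, not measured from any model.

def max_concurrent_sequences(free_mem_gib: float, kv_bytes_per_token: int, ctx_len: int) -> int:
    """Count of ctx_len-token sequences whose KV cache fits in free_mem_gib."""
    per_sequence_bytes = kv_bytes_per_token * ctx_len
    return int(free_mem_gib * 1024**3 // per_sequence_bytes)

FREE_GIB = 16      # assumed memory left for KV cache after loading weights
CTX_LEN = 32_768   # assumed context length per request

print(max_concurrent_sequences(FREE_GIB, 64 * 1024, CTX_LEN))  # 64 KiB/token -> 8
print(max_concurrent_sequences(FREE_GIB, 8 * 1024, CTX_LEN))   # 8 KiB/token  -> 64
```

Under these assumed numbers, an 8x smaller KV cache per token means 8x more requests in a batch. Whether that matters depends on having more than one request to batch in the first place.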
This may be relevant for parallelizable workloads. For reference on my perspective: I come at this as someone who is exclusively concerned with sequential, non-parallelizable, single-user, single-system workloads.
If you have multiple chats going at the same time in your LLM web interface, that's already a parallelizable workload wrt. batched inference. And this broadly describes the more sophisticated users of LLMs (who are using it for more than just casual chit-chat), especially wrt. the largest "pro" models. Parallelism is also quite applicable to agentic workloads.
As to the 2nd part of your message, it's really easy to verify yourself (on openrouter).

DSv4-flash is currently being served at 0.14/0.24 $/MTok by most of the providers (8 as of writing this) and even a bit cheaper by 2 providers.

Minimax2.7 is being served at 0.30/1.20 $/MTok by most providers (4 providers as of writing this) and double that price by 2 providers.
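
For a single number to compare, the two prices can be blended. This is only a sketch: it assumes the first figure in each pair is the input price and the second is the output price, and the 75% input share is a made-up traffic mix, not something measured.

```python
# Blend input/output prices ($/MTok) into one figure.
# Assumptions: pairs are (input, output); input_share is a hypothetical traffic mix.

def blended_price(input_price: float, output_price: float, input_share: float = 0.75) -> float:
    """Weighted average price per million tokens."""
    return input_share * input_price + (1 - input_share) * output_price

dsv4_flash = blended_price(0.14, 0.24)
minimax_27 = blended_price(0.30, 1.20)

print(round(dsv4_flash, 3))               # 0.165
print(round(minimax_27, 3))               # 0.525
print(round(minimax_27 / dsv4_flash, 1))  # 3.2
```

With that mix, the quoted Minimax2.7 price comes out at roughly 3x the quoted DSv4-flash price; a more output-heavy mix widens the gap.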

As for the first part of your message, this is actually a good illustration of the misunderstanding around LLM licensing. There are open-source models out there (Apache 2.0 and MIT), there are also source-available (i.e. open weights) models like the llamas and minimax2.7, and there's something in between with the latest kimi (MIT w/ attribution). Open source in the context of LLMs means that you get a license to run, inspect, modify and re-release a model. It was never about data or training. Reading it as being about data or training is a very common interpretation, but it's wrong IMO. I get that it's contested, so anyway. Sorry for the tangent.

Third party inference costs are a moot point for people running these models locally.

I am currently serving Minimax M2.7 to myself at ~$0.015/1M blended tokens worth of electricity on my own local hardware, where I get all of the confidentiality, integrity, and availability benefits that are lost when choosing to run open weight models on someone else's API.
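
For anyone who wants to run the same arithmetic on their own setup, a minimal sketch follows. The wattage, throughput, and electricity rate are hypothetical inputs picked so the result lands near the figure above; they are not the actual numbers behind it.

```python
# Electricity cost per 1M tokens for local inference.
# All inputs are hypothetical placeholders; substitute your own measurements.

def electricity_cost_per_mtok(power_watts: float, tokens_per_second: float, usd_per_kwh: float) -> float:
    """USD of electricity needed to process one million tokens."""
    seconds = 1_000_000 / tokens_per_second
    kwh = power_watts / 1000 * seconds / 3600
    return kwh * usd_per_kwh

# Assumed: 360 W at the wall, 1000 blended tok/s (prompt + generation), $0.15/kWh.
cost = electricity_cost_per_mtok(power_watts=360, tokens_per_second=1000, usd_per_kwh=0.15)
print(f"${cost:.3f} per 1M tokens")  # $0.015 per 1M tokens
```

This counts power draw only; hardware cost is left out, consistent with treating capex as already sunk.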

Open source means that all of the information necessary to recreate the final product is public, which in the context of LLMs would include all of the training material and build instructions (scripts to do the training). Very few models actually achieve this; the Nemotron family is the only one that comes to mind. A license to run, inspect, modify, and re-release is a good improvement on open weight models, but does not alone amount to the model actually being open source.

You are welcome to an alternative understanding of the definition of open source - as you correctly note, it's a contested term - just know that your definition is not the more widely accepted one that people think of when they hear "open source".

Your version of the term is much more aligned with the OSI, which was a federation of anti-FLOSS industry bodies created with the intent to capture, redefine, and weaken the original spirit of the FLOSS movement, which predates the OSI by almost a decade - the GPL was first released in '89, compared to the OSI's formation in '98 by members of the $10B for-profit Netscape Corporation, whose flagship product was originally proprietary and was only open sourced after commercial failure against proprietary competitors.

None of this should be construed as an implication that I'm anti-open-weight. As I mentioned earlier, I think open weight models fulfill a lot of the spirit of open source. While a world where truly open source models are the norm is obviously preferable to a world where only open weight models are the norm, a world where only open weight models are the norm is still vastly preferable to a world where proprietary models running on other people's hardware are the norm.

I just think that we should be careful to avoid watering down terminology in ways that serve proprietary commercial interests over the interests of the public and of users. Open-washing is real, and it harms the interests of users.

> Open source in the context of LLMs means that you get a license to run, inspect, modify and re-release a model. It was never about data or training.

eeeh? what?

the whole reason "open-weights" phrase got coined was because corps started sharing weights, but no way to replicate the training that created it

it was viewed the same as sharing compiled binary, but no source code - against the whole point of open-source