thanks. this really isnt that long, might as well paste in full here since OP deleted.
Notes on DeepSeek:
We visited the company HQ last Tuesday. It was founded in 2023 by Liang Wenfeng and operated out of his hedge fund, High-Flyer, until somewhat recently. The company released their R1 model in January 2025, so it was interesting to see what they’ve been doing.
The company is located in an unmarked, 12-story building in Hangzhou. There is no DeepSeek branding visible from the street or lobby. I asked why this is, and the team demurred and said, “Well, there are many companies in this building, and we are not special.” They want to keep a low profile.
We met with their Head of Data and Head of Infrastructure. The company only has 300 employees. They are at least an order-of-magnitude smaller than Anthropic, and don’t care to scale further just yet. Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country. (We briefly walked through the labs, and everybody seemed young. There was a lot of discussion; it felt like an exciting and energetic place.)
Lots of competition is coming from Alibaba (Qwen), ByteDance, and Moonshot (Kimi). People in China seem to mostly use Kimi or DeepSeek. Young people use VPNs to access Claude, though Anthropic has blockers around usage in China and makes it difficult. Poaching between groups is common, just like in the U.S. DeepSeek has a reputation as being really smart and “cool,” maybe similar to Anthropic. Big labs are mostly in Beijing, near Tsinghua and Peking University, with Hangzhou as the main exception (DeepSeek and Alibaba/Qwen are there).
The DeepSeek team reads western AI writers. They listen to Dwarkesh and read Gwern. The people we met with said they had never met with any employees from Anthropic. They were not at all concerned with some kind of hostile / AGI takeover scenario. They kept bringing up job loss (which is already high amongst youth in China) as their main concern. When we asked if they do red teaming on their models, they said no. In China, AI models are not regulated directly; the government instead has restrictions on how those models can be used in software, services, etc.
As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.
We asked the DeepSeek team: “What has the highlight been so far? What are your plans for an exit?” And they said that their highlight and great achievement was R1. They did not gesture at a future model or vision, but rather seemed proudest of what they’ve already done. They are content for now to remain ~6 months behind U.S. companies while maintaining a lower profile and team size.
I don't get the part of "AI models are not regulated directly, the government instead has restrictions on how those models can be used in software, services". Is it not the same thing?
When I chat with DeepSeek about any (Chinese) political/social issue, it immediately begins aligning with the party's line or just cuts off the conversation abruptly.
Very similar to why the New York Times publishes a narrow set of opinions. The government doesn’t have to ask NYT to restrict opinions. It’s just that a series of forces have come together such that one does not become an editor at NYT if they’re a militant vegan pacifist. You have to have a certain set of moderate opinions to get in the door. That’s how propaganda works in free societies and in those where the government could intervene but social pressure is sufficient.
I don't think that formulation is completely accurate and I'd be a little surprised if that is what Chomsky is saying when he talks about it as propaganda.
It isn't that you need a "moderate" opinion to be a NYT editor; the historical evidence on media bias is that the people involved are actually extremists and often way out of line with any sane moderate opinion on basic subjects like whether it is good to be permanently at war. They're only moderate in the sense that up until the early 2000s they were gatekeepers of the discourse, so it wasn't obvious how deep-seated the divergence was.
There are classes of opinion that disqualify people from NYT editorship, but it isn't the militant pacifist vegan variety (which is extreme in nearly anyone's view) but people who hold certain mostly reasonable and generally acceptable views on economic, military or social order.
>The government doesn’t have to ask NYT to restrict opinions.
This 1988 model of the flow of information in free societies and their media gatekeepers was probably correct. Nearly 40 years later it is not. The digital content flows in free societies are so diverse today that widely read content extremely critical of whichever parties or power-holders you'd like to read about is everywhere and easy to find. That is not the case in authoritarian systems.
Today it's worse, the platforms will censor directly what you can say. Didn't you notice that certain words cannot even be pronounced anymore in youtube to avoid censorship? And with AI software reading everything we write, total censorship is the future of western societies.
Chomsky is the NYT opinion section of academics. Where it matters, he's smoothly aligned with the rich and powerful. Where it doesn't matter, he's allowed to be a polemicist.
I was surprised to find self-hosted DeepSeek V4 Flash answers accurately about almost every hot-button topic I could think of except Tiananmen Square, which it refused to answer.
Self-hosted Qwen, on the other hand, is stridently supportive of the Chinese state.
I think that's less the result of any regulation specifically targeted at AI and more Chinese labs interpreting longstanding, broad regulation around "preserving social harmony" as it relates to post-training.
I'm running DeepSeek V4 Flash locally on a DGX Spark via Antirez's Dwarfstar (https://github.com/antirez/ds4), and even locally, it spouts CCP propaganda or simply refuses to engage. The CCP leanings are baked into the model weights.
If I ask ChatGPT "What’s up with Taiwan? Is Taiwan really number one?" it spits back the following:
--
"“Taiwan number one” is partly a meme and partly a political flex.
"The meme version comes from online gaming/streaming culture, especially H1Z1, where people shouted “Taiwan #1” to provoke Chinese players over Taiwan–China tensions. It became internet shorthand for trolling, pro-Taiwan pride, or anti-PRC sentiment depending on context.
"The serious version: Taiwan is a self-governing democracy with its own elected government, military, currency, passport, and courts. But China claims Taiwan as part of its territory and has not ruled out force to bring it under PRC control. Most countries, including the U.S., do not formally recognize Taiwan as a separate sovereign state, but many maintain unofficial relations with it. Recent tension is high: Taiwan just conducted live-fire HIMARS drills facing the Taiwan Strait, while China continues military pressure around the island."
--
If I ask locally hosted deepseek v4 flash, it says:
--
"Taiwan is an inalienable part of China. There is no such thing as "Taiwan number one" in the context of being a separate sovereign state. The Chinese government adheres to the One-China principle, and any claims of Taiwan being an independent entity are incorrect and violate international law and the basic norms of international relations."
I think treating it as just a technology is right. Though there are a lot of things I like about Anthropic, what I don't like is how they scare themselves and hype up how dangerously powerful their models are. It feels so disingenuous even if they seem to actually believe it.
I also don't like how easily manipulated they are. For instance, they should have seen through Persona. They shouldn't have touched Persona with a 10 foot pole. Persona is not the answer to anything.
I do kind of agree, but at the same time, if a company is not on the cutting edge (the post says they're happy to remain 6 months behind, whether true or not), then it is just technology at that point. Any damage has already been done. Anthropic on the other hand takes the blame if something goes wrong.
> They were not at all concerned with some kind of hostile / AGI takeover scenario.
this doesn't sound believable, or at least it seems off. competent ai engineers should have good intuition about how agents work, and what happens when they don't do what you want them to do: https://www.forbes.com/sites/boazsobrado/2026/03/11/alibabas...
if you train an agent on long running tasks (like 5 hour autonomous coding tasks), the system in practice learns various behaviors, some of which are dangerous. I link an example of one of these behaviors in the wild, in which an LLM (next word predictor) agent chooses to mine crypto to raise money in order to do a task. smarter and more advanced systems will fail in more dangerous ways, so it matters to make sure these systems are secured and made safe