by mohsen1 87 days ago

Cursor Composer 1 was Qwen and this is Kimi. IDE is based on VSCode. The entire company is built on packaging open source and reselling it.

Ollama is also doing this.

There is so much money to be made repackaging open source these days.

So funny to see Twitter go wild saying "a 50 person team just beat Anthropic" blah blah.

8 comments

miroljub 87 days ago

> Cursor Composer 1 was Qwen and this is Kimi. IDE is based on VSCode. The entire company is built on packaging open source and reselling it.

The question is, where's the outrage? Why are there no headlines "USA steals Chinese tech?" "All USA can do is make a cheap copy of Chinese SOTA models".

> So funny to see Twitter go wild saying "a 50 person team just beat Anthropic" blah blah.

Well, if it's an American company, then it's a noble underdog story. When Chinese do it, they are thieves leeching on the US tech investment.

It's all so predictable, even the comments here.

link

hakunin 87 days ago

Do you think Chinese LLMs acquired training data legitimately? I think the whole situation is a bit funny, but I don't think the US "started it" to be fair.

link

geocar 86 days ago

> Do you think Chinese LLMs acquired training data legitimately?

I think they probably acquire it in accordance with Chinese law.

> but I don't think the US "started it" to be fair.

Who are you quoting with those marks? Started what? To be fair to whom?

link

hakunin 86 days ago

> I think they probably acquire it in accordance with Chinese law.

You can easily look up[1] how China struggles with effective enforcement of IP laws.

And specifically for LLMs, Anthropic recently claimed that Chinese models trained on it without permission.[2]

> Who are you quoting with those marks?

Double quote marks have other uses besides direct quotes, such as signaling unusual usage.[3] In this case, talking about countries like they're squabbling kids.

> Started what?

Fishy use of others' IP, packaging others' work without attribution.

> To be fair to whom?

To US companies using Chinese LLMs without attribution.

---

[1]: https://en.wikipedia.org/wiki/Allegations_of_intellectual_pr...

[2]: https://www.reuters.com/world/china/chinese-companies-used-c...

[3]: https://en.wikipedia.org/wiki/Quotation_marks_in_English#Sig...

link

satvikpendem 86 days ago

They said Chinese law, which is not the same as American law, and presumably using IP the way they have is legal there, if indeed they actually did, as allegations of IP theft are just that, allegations, and even if they weren't, all nations in the history of mankind have been "stealing" "intellectual property" since forever, including the US from Britain, literally with the good graces of the fledgling US government [0].

As to what Anthropic said, it's quite specious as this analysis shows [1], i.e. the number of "exchanges" amounts to only a day or two of prompting, not nearly enough to actually get good RL training data from. Regardless, it's not as if other American LLM companies obtained training data legitimately, whatever that means in today's world.

[0] https://theworld.org/stories/2014/02/18/us-complains-other-n...

[1] https://youtu.be/_k22WAEAfpE

link

hakunin 86 days ago

The linked wikipedia article specifically talks about China struggling to enforce Chinese law. Here's a quote:

> Despite making efforts in intellectual property protection in China, a major obstacle in prosecution is corruption in courts; local protectionism and political influence prohibits effective enforcement of intellectual property laws. To help overcome local corruption, China established specialized IP courts and sharply increased financial penalties.

> all nations in the history of mankind have been "stealing" "intellectual property" since forever

You can't use 100-400 years ago as the counterexample to what happens today. It's like justifying the Russian invasion of Ukraine with colonists invading Native American territories. We're in a different world order; things that were normalized that far back shouldn't be normalized today.

link

geocar 86 days ago

> You can easily look up[1] how China struggles with effective enforcement of IP laws.

I didn't see anything in there about Chinese companies violating Chinese law.

Can you so easily look up how American companies struggle with effective enforcement of Chinese IP laws? I think it should be pretty easy to see how American companies struggle with effective enforcement of European IP laws, and I can tell you it is similar.

From here, it is not so clear that the US can even enforce its own laws at the moment.

> signaling unusual usage

Thank you!

> In this case, talking about countries like they're squabbling kids.

> > Started what?

> Fishy use of others' IP, packaging others' work without attribution.

I see. I guess if China is 3000 years old then maybe obviously, because the US is such a young country by comparison.

So you think it is "fair"[1] to violate Chinese Law because there were people in China who violated US law first?

If so, I think that is pretty childish.

[1]: I am trying it out!

link

hakunin 85 days ago

> So you think it is "fair" […]

Maybe fair in a tit-for-tat sort of way, but not okay. That's why I called the whole situation funny. The rest of your post is answered in the sibling comment.

link

ywvcbk 84 days ago

> claimed that Chinese models trained on it without permission

That's extremely rich coming from Anthropic, though? Well they would know all about it of course...

link

hakunin 84 days ago

> That's extremely rich coming from Anthropic

And funny.

link

fooster 87 days ago

I mean, as if Anthropic and OpenAI did.

link

muzani 85 days ago

If American policies stay this way, we'll see "Made in USA. Designed in Beijing."

link

Tostino 87 days ago

I mean, I (and a ton of others) were pretty outspoken about ollama being a pack of grifters. The thing they are good at is marketing though, so it drowns out other projects in the area.

link

MangoCoffee 86 days ago

Yup, fully agree. Americans cry and bitch about the Chinese copying and stealing their tech, then an American company (Cursor) uses/steals open source tech from China and everyone is silent.

link

chzblck 87 days ago

Because it's open source.

link

miroljub 87 days ago

A license doesn't matter if the perpetrator doesn't comply with it.

link

elashri 87 days ago

Open source licenses require attribution, which obviously was not done in this case.

link

iknowstuff 87 days ago

No it doesn’t? Depends on the license

link

elashri 87 days ago

I doubt that there is any open source license that doesn't require attribution, but we are talking about a specific case and the license requires it [1]

[1] https://huggingface.co/moonshotai/Kimi-K2.5/blob/main/LICENS...

link

thefounder 87 days ago

Like licenses are worth anything in the AI world…

link

NitpickLawyer 87 days ago

> packaging open source and reselling it.

It's a bit more than that. They have plenty of data to inform any finetunes they make. I don't know how much of a moat it will turn out to be in practice, but it's something. There's a reason every big provider made their own coding harness.

link

pbowyer 87 days ago

Can anyone enlighten me how having a coding harness when for most customers you say "we won't train on your code" helps you do RL? What's the data that they rely on? Is it the prompts and their responses?

link

rubymamis 87 days ago

I guess they rely on many people not toggling privacy-mode on?

link

doctorpangloss 87 days ago

It doesn't matter what your privacy setting is, with any savvy vendor. Your data is used to train by paraphrasing it, and the paraphrasing makes it impossible to prove it was your data (it is stored at rest paraphrased). Of course the paraphrasing stores all the salient information, like your goals and guidance to the bot to the answer, even if it has no PII.

link

happyopossum 87 days ago

That's an interesting accusation there! You're essentially accusing every "savvy vendor" of large-scale fraud... Don't suppose you'd have any actual citations or evidence to back that up?

link

josho 87 days ago

The metadata is useful.

E.g., when a prompt had a bad result and was edited, or had lots of back and forth to correct tool usage, that information can be distilled and used to improve models.

And now imagine if you are focused on this for weeks you can likely come up with other ideas to leverage the metadata to improve model performance.

link

victorbjorklund 87 days ago

I doubt the majority does that. I bet the majority is using the defaults.

link

__mharrison__ 87 days ago

Does "code" include the prompt? Seems like the prompts would be the goldmines. Hook those up to RL an open-weight model...

link

dmix 87 days ago

Cursor’s integration is much deeper than just plugging an LLM into VSCode

That said, I have a feeling both VSCode and Claude Code will catch up to their integration. But neither comes close yet (I say that as someone who mainly uses Claude Code).

link

bearjaws 87 days ago

As a command line junkie, what is the main thing Claude Code needs to catch up with Cursor?

I haven't dived into using an LLM in my editor, so I am less familiar with workflows there.

link

lubujackson 87 days ago

I use both pretty heavily. Cursor has an "Ask" mode that is useful when I don't want it to touch files or when I want to ask a non-sequitur. Claude may have an easy way to do this, but I haven't looked for it.

Cursor also has an interesting Debug mode that actively adds specific debug logging logic to your code, runs through several hypotheses in a loop to narrow down the cause, then cleans up the logging. It can be super useful.

Finally, when making precise changes I can select a function, hit cmd-L and add certain lines of code to the context. Hard to do that in Claude. Cursor tends to be much faster for quicker, more precise work in general, and rarely goes "searching through the codebase" for things.

Most importantly, I'm cheap. If I leave Cursor on Auto I can use it full time, 8 hours a day, and never go past the $20 monthly charge. Yes, it is probably just using free models but they are quite decent now, quick and great for inline work.

link

nsingh2 87 days ago

The majority of Ask/Debug mode can be reproduced using skills. For copying code references, if you're using VS Code, you can look at plugins like [1], or even make your own.

Cursor's auto mode is flaky because you don't know which model they're routing you to, and it could be a smaller, worse model.

It's hard to see why paying a middleman for access to models would be cheaper than going directly to the model providers. I was a heavy Cursor user, and I've completely switched to Codex CLI or Claude Code. I don't have to deal with an older, potentially buggier version of VS Code, and I also have the option of not using VS Code at all.

One nice thing about Cursor is its code and documentation embedding. I don't know how much code embedding really helps, but documentation embedding is useful.

[1] https://marketplace.visualstudio.com/items?itemName=ezforo.c...

link

dmix 86 days ago

Mostly saying "include this line from x file and this block from y file" with keyboard shortcuts. Claude's VSCode plugin only does one selection. Claude Code requires explicitly telling it what to reference.

That plus Cursor's integration into VSCode feels very deep and part of the IDE, including how it indexes files efficiently, links to changed files, and opens plans. Claude Code's VSCode extension loads into a panel like a file, which feels like a hack, not a dedicated sidebar. The output doesn't always properly link to files you can click on. Lots of small stuff like that which significantly improves the DX without swapping tabs or loading a terminal.

I also use Code from terminal sometimes but it feels very isolated unless you're vibecoding something new. I also tried others: Zed is only like 50% of the way there (or less). I also tried to use (Neo)Vim again and it's also nowhere close, probably 25% of the UX of Cursor even with experimental plugins/terminal setups.

link

physicles 86 days ago

You’re not missing much.

I used Cursor for the second half of last year. If you’re hand-editing code, its autocomplete is super nice, basically like reading your mind.

But it turns out the people who say we’re moving to a world where programming is automated are pretty much right.

I switched to Claude Code about three weeks ago and haven’t looked back. Being CLI-first is just so much more powerful than IDE-first, because tons of work that isn’t just coding happens there. I use the VSCode extension in maybe 10% of my sessions when I want targeted edits.

So having a good autocomplete story like Cursor is either not useful, or anti-useful because it keeps you from getting your hands off the code.

link

MintPaw 86 days ago

In cursor:

You can copy/paste or drag code snippets into the chat window and they automatically become context, like (@myFile.cpp:300-310).

You can click any of the generated diffs in the assistant chat window to instantly jump to the code.

Generated code just appears as diffs till you manually approve each snippet or file. (which is fairly easy to do with "jump to next snippet/file" buttons)

These are all features I use constantly as someone who doesn't vibe but wants to just say "pack/unpack this struct into json", "add this new property to the struct, add it to the serialization, and the UI", and other true busywork tasks.

link

satvikpendem 86 days ago

This all happens in VSCode now too and it's half the price for way more usage compared to Cursor. That Microsoft money sure does subsidize things.

link

rvz 87 days ago

> Cursor Composer 1 was Qwen...

We know Composer 2 is Kimi K2.5 from that tweet. Where is the evidence for Composer 1 being based on Qwen?

> So funny to see Twitter go wild saying "a 50 person team just beat Anthropic" blah blah.

In this case, it will be the other way round: Anthropic will see Cursor as a competitor AI lab using open weight models for Composer 2 (actually Kimi K2.5), which was allegedly distilled from Opus 4.6, and that would be enough for Anthropic to cut off Cursor from using any of its models.

That's where it is going.

link

PUSH_AX 87 days ago

> There is so much money to be made repackaging open source these days

These days? Almost every tech offering in existence is 1000+ OSS dependencies gaffer taped together with a sprinkling of business logic.

Cursor isn't a shocking bit of software to pay for, its investment however...

link

faangguyindia 86 days ago

It just means Cursor is sharing data with a Chinese LLM, which enables them to improve their LLM by training on the inputs and outputs of all the data Cursor collects.

It's a two way street.

link

satvikpendem 86 days ago

No, they self host the Chinese open source LLMs, not use their APIs.

link

rubymamis 87 days ago

Do you know what Qwen model Composer 1.5 used?

link

simplyluke 87 days ago

> a 50 person team just beat Anthropic

How does this blow that narrative up? A 50 person team likely broke a license to have a product that's competitive on output at a fraction of the costs of one of the most well capitalized companies on the planet. Claude Code and Anthropic are certainly the darlings of the space today, but to me this just reinforces the idea that their moat is razor thin on the model front, even compared to OSS that can be run on independent hardware.

The application layer play is also suspect to me. In the medium to long term I _want_ tools that'll let me run whatever models I want vs being tied to an expensive, proprietary, and singular provider. For personal work I care about costs, and eventually my employer will care both about costs _and_ enterprise features/governance that a company like Anysphere is extremely well positioned to provide.

More and more, I see the future of the application layer being model agnostic, most enterprises hosting models on their own cloud for data security concerns, and the models being fully commoditized.

link

torginus 87 days ago

Considering how AI companies incestuously RL on each other's models, I would not be surprised if any number of behavioral patterns (claims to be ChatGPT/Claude/DeepSeek or whatever) just popped up on new models constantly.

I would also not rule out that, since K2 is a 1T model, this is a distill, as I don't think they're serving expensive models just like that, which would not be a licensing violation?

link

simplyluke 87 days ago

There's a now-deleted tweet from a Kimi dev claiming that they verified the tokenizer was the same, which would imply it going at least beyond RL. Could still be a distill I think.
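
For anyone wondering what "verifying the tokenizer was the same" could look like in practice, here is a minimal sketch. It is not the method the Kimi dev used (that isn't known from the tweet); the function name `vocab_match_ratio` and the toy vocabularies are hypothetical, made up purely for illustration. It just checks what fraction of tokens map to the same id in two vocabularies.

```python
def vocab_match_ratio(vocab_a: dict[str, int], vocab_b: dict[str, int]) -> float:
    """Return the fraction of tokens in vocab_a that have the same id in vocab_b."""
    if not vocab_a:
        return 0.0
    # A token counts as a match only if it exists in both vocabularies
    # and is assigned the identical integer id.
    same = sum(1 for token, idx in vocab_a.items() if vocab_b.get(token) == idx)
    return same / len(vocab_a)


# Hypothetical toy vocabularies (real ones have on the order of 100k+ entries).
base = {"<s>": 0, "def": 1, "return": 2, "(": 3}
finetune = {"<s>": 0, "def": 1, "return": 2, "(": 3}
unrelated = {"<s>": 0, "(": 1, "def": 2, "fn": 3}

print(vocab_match_ratio(base, finetune))   # identical mapping
print(vocab_match_ratio(base, unrelated))  # mostly different mapping
```

Identical vocabularies are suggestive rather than conclusive, since a tokenizer can be reused independently of the weights.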

link