Not everyone is working with state secrets or user personal data (or even more closely guarded, company secrets) on a daily basis. Most of what I hack on is either FOSS already or will be, so there's not much to keep secret here.
Obviously, if you do deal with any sort of secrets, then using local LLMs over OpenAI, Anthropic, DeepSeek or whoever is preferred, and in the case of users' personal data, probably a requirement.
Either this, or you work on software that, even if copied, won't get you far, since the business relies on network effects or pure networking.
Getting the source code of Facebook or Instagram doesn't mean you could compete with them.
I work for a company that has built relationships with event organizers over the past 10 years. The code I maintain could be written from scratch in maybe 2-3 months, even though it was built over the past 10 years. But besides the code you have frontend / DB / hardware / logistics etc.
I actually agree with you, for the most part. The code I work with actually does contain some valuable algorithms, but I’m pretty sure the effort of integrating them into a larger system is pointless without the data. It’s almost like stealing the Half-Life 2 source code without any assets.
Still, “Getting the source code of Facebook or Instagram doesn't mean you could compete with them.” I think that for giants like that, access to their source code could expose some very interesting loopholes for manipulating the ranking algorithms, or even security vulnerabilities.
True, I hadn't thought of that. However, very few actual projects / companies are in a situation where the Chinese government would be interested in spending resources to hack their platform. For the ones that are afraid of that, there's always self-hosting, of course.
I used to work with HVAC companies, and I noticed that many of their customers mistakenly believed they were purchasing air conditioners. They didn’t consider these devices, which they connected to the internet, as computers. Despite being systems that required user names, passwords, updates, monitoring, and other maintenance, the prevailing attitude among these customers was, “This is an appliance, and why would anyone care about my air conditioner?”
All this to say, not even subject matter experts necessarily appreciate the risk involved in their work.
You’re not a novice, there are a lot of us who know exactly what we are doing and see this as a huge downside. We are just being told to go faster, faster, faster lest we miss out on… something?
There are laws on the books in China that say every company operating in China must aid and abet the Chinese government in espionage against the rest of the world. Given those facts, I find it deeply troubling to be using anything coming out of China, especially a program that runs in the context of a Linux terminal on a machine that might have something important on it. I'd argue it's a back door waiting to happen, if not sooner, then obviously later.
As a European I have to admit I am these days more worried about the US than China. See yesterday's article about the US government forcing Microsoft to give them lists of Dutch government officials. Utter madness. At least the Chinese mainly care about the money and power levers, while the US cares about strange worlds of revenge and manipulation, trying to change or influence your government. E.g. which of the two countries has put crippling personal sanctions on staff of the International Criminal Court?
Honestly I'd love to love the US again, but basically after Obama things have just gone down and down and no soul will trust the US again in the next generation or two.
The situation you reference is related to a specific investigation by the US Congress requesting documents from a specific company (Microsoft) about potentially illegal censorship actions by EU officials. The difference is that the laws in China are broadly defined to include handing over anyone's intellectual property to the government, with no oversight, for the purposes of espionage.
The former relates to a specific investigation about potential criminal activity, the latter relates to broad illegal activity committed by the government itself unrelated to any specific case.
The US has no laws on the books forcing companies to wantonly give intellectual property and other espionage-level material back to the government. If it did, no one would use cloud providers.
To avoid this, you can run your own hosted machine in a colocation facility, because in the US, people do have reduced rights when their data is controlled by a third party versus being controlled by themselves. It's the same as if the data were in your house: they would need a search warrant to obtain it. But when it's at an Azure or AWS datacenter not controlled by you, your privacy rights are reduced.
I think many are trying to move away from US providers, actually. FISA Section 702 and the liberties the current administration has taken with international law are not helping. The trust problem is real.
Not sure I’d trust China with anything onshore. But offshore, it does seem they play by the rules, because it pragmatically serves the stability of the people. China has not started wars in the past 50 years or so. By that logic one may assume they’d not abuse the arguably broad powers over Chinese firms abroad to risk one now.
In a world where rules are increasingly less important, how states use power matters more to me than how they claim to be monitored.
Besides the language barrier it’s actually also just simpler to do business with the Chinese. There are issues like censorship but they are known & can be routed around. It’s best to just ignore the US and move your business elsewhere.
So a government forcing a private corporation to do something, which is a big enough deal that it's on the worldwide news, is scarier to you than an implicit mandate that China forces on its companies?
The four biggest (obvious) backdoor countries in the world, in no particular order: the United States, Israel, Russia, China. Honorable mentions: North Korea, Ukraine…
I'm forbidden from working on the company code with DS, but if I have a private something that looks pretty much like one of the thousands of repositories out there, it doesn't matter that much.
Yeah, but it's miles better than giving Anthropic and OpenAI your data. At least DeepSeek is releasing open-weight models and a lot of open-source libraries.
If you're concerned about espionage, then the only solution is to host the models yourself, which, again, only open-weight models like DeepSeek enable you to do.