by nomdep 104 days ago
In this emerging reality, the whole spectrum of open-source licenses effectively collapses toward just two practical choices: release under something permissive like MIT (no real restrictions), or keep your software fully proprietary and closed.

These are fascinating, if somewhat scary, times.


The latter will become MIT sooner or later with Ghidra plus LLM-assisted reverse engineering.

https://reorchestrate.com/posts/your-binary-is-no-longer-saf...

Even SaaSS isn't safe from that type of process:

https://news.ycombinator.com/item?id=47259485

I see you submitted that as a link; it deserves a lot more than the 4 upvotes it currently has. What a fascinating article. It gives me much hope that dead old games are not in fact dead. If there is still a binary somewhere and current trends continue, then they can probably be resurrected cheaply and by relatively unskilled people.
If you have access to a working prototype of a piece of software, you can use it for differential testing. So you get unlimited tests for free.
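For anyone unfamiliar with differential testing, here is a minimal sketch of the idea in Python. Everything in it is a stand-in I made up for illustration: `reference_sort` plays the role of the existing program (the oracle), `candidate_sort` plays the reimplementation, and `random_int_list` is a hypothetical input generator. A real setup would call the original binary instead of a Python function.

```python
import random


def reference_sort(xs):
    """Stand-in for the existing program, used as the oracle."""
    return sorted(xs)


def candidate_sort(xs):
    """Stand-in for the reimplementation under test (insertion sort)."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left until the element before position i is not greater than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def random_int_list(rng):
    """Hypothetical input generator: a short list of small integers."""
    return [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]


def differential_test(reference, candidate, make_input, trials=1000, seed=0):
    """Feed the same random inputs to both implementations.

    Returns the first input on which the outputs differ, or None if
    no difference was found within the given number of trials.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        data = make_input(rng)
        # Pass copies so neither implementation can mutate the shared input.
        if reference(list(data)) != candidate(list(data)):
            return data
    return None


if __name__ == "__main__":
    mismatch = differential_test(reference_sort, candidate_sort, random_int_list)
    print("no difference found" if mismatch is None else f"differs on {mismatch}")
```

The tests are only "free" in the sense that the oracle supplies the expected outputs; you still have to generate inputs that reach the interesting behavior.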
We will need ... software patents!
No, lawyers will want software patents as that's the only group that would benefit from them, apart from large litigation-happy companies that want to squash any competition.
Not sure I can follow your reasoning. Wouldn't the developer of the software who got a patent for an invention embodied in the software she developed benefit as well?
Not if the developer is employed at the time as contracts will usually mean that the company owns the patents, even if the developer was working on their own time.

The bigger issue is patent abuse - file or buy a few poorly specified patents and then use them along with litigation to shut down competitors. This generally leads to bolstering the bigger companies at the expense of smaller companies due to the costs of litigation.

Basically, software patents can turn developing software into a minefield. It can end up that only people with access to legal departments will be able to sell software.

If you listen to the people who believe real AI is right around the corner then any software can be recreated from a detailed enough specification b/c whatever special sauce is hidden in the black box can be inferred from its outward behavior. Real AI is more brilliant than whatever algorithm you could ever think of so if the real AI can interact w/ your software then it can recreate a much better version of it w/o looking at the source code b/c it has access to whatever knowledge you had while writing the code & then some.

I don't think real AI is around the corner but plenty of people believe it is & they also think they only need a few more data centers to make the fiction into a reality.

>Real AI is more brilliant than whatever algorithm you could ever think of

So with "Real AI" you actually mean artificial superintelligence.

I wrote what I meant & meant what I wrote. You can take up your argument w/ the people who think they're working on AI by adding more data centers & more matrix multiplications to function graphs if you want to argue about marketing terms.
I was just thinking that calling artificial superintelligence "Real AI" was funny.
Corporate marketing is very effective. I don't have as many dollars to spend on convincing people that AI is when they give me as much data as possible & the more data they give me the more "super" it gets.
They’re looking for AI that’s so good it’s unreal
What you describe is essentially what happened: the AI result, working from specs and tests, was more performant than the original. The real AI you describe just rewrote chardet without looking at the source, only better.
How do you know it didn’t look at the source?
It was instructed to look at the source...
It was instructed NOT to look at the source, with the one exception that it was told to look at this single file full of charset definitions: https://github.com/chardet/chardet/blob/f0676c0d6a4263827924...
Is there any visibility or accountability to record exactly what it did and did not look at? I doubt it. So we're left with a kind of Rorschach test: some people think LLMs follow rules like law-abiding citizens, and some people distrust commercial LLMs because they understand that commercial LLMs were never designed for visibility and accountability.
There should exist a .jsonl file somewhere with exactly that information in it - might be worth Dan preserving that, it should be in a ~/.claude/projects folder.
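If someone wants to check whether such logs exist before they get cleaned up, a rough sketch like the following would list them. The `~/.claude/projects` location is taken from the comment above and I have not verified it; `summarize_jsonl_logs` is a hypothetical helper name, and it assumes nothing about the records beyond one JSON value per line.

```python
import json
from pathlib import Path


def summarize_jsonl_logs(root):
    """Map each .jsonl file under root to its number of parseable records."""
    summary = {}
    for path in sorted(Path(root).expanduser().rglob("*.jsonl")):
        count = 0
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue  # ignore blank lines
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip lines that are not valid JSON
                count += 1
        summary[str(path)] = count
    return summary


if __name__ == "__main__":
    # Assumed location, per the comment above; adjust if the logs live elsewhere.
    for name, records in summarize_jsonl_logs("~/.claude/projects").items():
        print(f"{records:6d}  {name}")
```

This only tells you that logs exist and how large they are. Working out which files were actually read would mean inspecting the individual records, whose schema I am not going to guess at here.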
Real AI will never be invented, because as AI systems become more capable we'll figure out humans weren't intelligent in the first place, therefore intelligence never existed.
Don't worry, just 10 more data centers & a few more gigawatts will get you there even if the people building the data centers & powerplants are unintelligent & mindless drones. But in any event, I have no interest in religious arguments & beliefs so your time will be better spent convincing people who are looking for another religion to fill whatever void was left by secular education since such people are much more amenable to religious indoctrination & will very likely find many of your arguments much more persuasive & convincing.
I mean, it sounds kinda like you're the one making religious arguments. My response is one mocking how poorly egotistical people deal with the AI effect.

Evolution built man, whose intelligence is based on components that do not have intelligence themselves; it is an emergent property of the system. It is therefore scientific to think we could build machines on similar principles that exhibit intelligence as an emergent property of the system. No woo woo needed.

>It is therefore scientific to think we could build machines on similar principles that exhibit intelligence as an emergent property of the system.

Sure, but this ain't it.

Actually, I think LLMs are a step in the wrong direction if we really want to reach true AI. So it actually delays it, instead of bringing us close to true AI.

But LLMs are a very good scam that is not entirely snake oil. That is the best kind of scam.

>Actually, I think LLMs are a step in the wrong direction if we really want to reach true AI.

Any particular reason beyond feelings why this is the case?

We already know expert systems failed us when reaching towards generalized systems. LLMs have allowed us to further explore the AI space and give us insights on intelligence. Even more so we've had an explosion in hardware capabilities because of LLMs that will allow us to test other mechanisms faster than ever before.

Me & a few friends are constructing a long ladder to get to the moon. Our mission is based on sound scientific & engineering principles we have observed on the surface of the planet which allows regular people to scale heights they could not by jumping or climbing. We only need a few trillions of dollars & a sufficiently large wall to support it while we climb up to the moon.

There are lots of other analogies but the moon ladder is simple enough to be understood even by children when explaining how nothing can emerge from inert building blocks like transistors that is not reducible to their constituent parts.

As I said previously, your time will be much better spent convincing people who are looking for another religion b/c they will be much more susceptible to your beliefs in emergent properties of transistors & data centers of sufficient scale & magnitude.

>friends are constructing a long ladder to get to the moon

Congratulations, you're working on a space elevator. A few trillion dollars would certainly get us out of the atmosphere, and the amount of advances in carbon nanotube and foam metal would rocket us ahead decades in material sciences. Couple this with massive banks of capacitors and you could probably generate enough electricity for a country by the charge differential from the top to the bottom.

Oh, I get it, you were trying to be clever by saying something ignorant because it makes you feel special as a human rather than make realistic statements for the progress currently being made in the sciences.

> b/c whatever special sauce is hidden in the black box can be inferred from its outward behavior.

This is not always true, for an extreme example see Indistinguishability obfuscation.

> or keep your software fully proprietary and closed.

I guess it depends on your intention, but eventually I'm not sure it'll even be possible to keep it "fully proprietary and closed" in the hopes of no one being able to replicate it, which seems to be the main motivation for many to go that road.

If you're shipping something, making something available, others will be able to use it (duh) and therefore replicate it. The barrier to replicating things like this, either together with LLMs or by letting the LLM straight up do it itself with the right harness, seems to be getting lower real quick; there's a massive difference in just a few years already.

I completely agree.

Right now you can point claude at any program and ask it to analyse it, write an architecture document describing all the functionality. Then clear memory and get it to code against that architecture document.

You can't do that as easily with closed source software. Except, if you can read assembly, every program is open source. I suspect we're not far away from LLMs being able to just disassemble any program and do the same thing.

Is there a driver in windows that isn't in linux? No problem. Just ask claude to reverse engineer it, write out a document describing exactly how the driver issues commands to the device and what constraints and invariants it needs to hold. Then make a linux driver that works the same way.

Have an old video game you wanna play on your modern computer? No problem. Just get claude to disassemble the whole thing. Then function by function, rewrite it in C. Then port that C code to modern APIs.

It'll be chaos. But I'm quite excited about the possibilities.

> You can't do that as easily with closed source software. Except, if you can read assembly, every program is open source. I suspect we're not far away from LLMs being able to just disassemble any program and do the same thing.

I have successfully created a partial implementation of p4 by pointing it at the captured network stream and some strace output. It's amazing how good those things are.

You don't even need to go down to assembly - most commercial software is trivial to disassemble by calling a few EXEs. In theory this is largely forbidden by licenses, but good luck enforcing them now.
I suspect there’s a middle ground that involves either keeping tests more proprietary or a copyright license that bars using the work for AI reimplementation, or both.

I think it’s entirely reasonable to release a test suite under a license that bars using it for AI reimplementation purposes. If someone wants to reimplement your work with a more permissive license, they can certainly do so, but maybe they should put the legwork in to write their own test suite.

Or GPL. Which I’m increasingly thinking is the only license. It requires sharing.

And if anything can be reimplemented and there’s no value in the source any more, just the spec or tests, there’s no public-interest reason for any restriction other than completely free, in the GPL sense.

>Or GPL. Which I’m increasingly thinking is the only license. It requires sharing.

It doesn't if Dan Blanchard spends some tokens on it and then licenses the output as MIT.

Who are you talking about? I can't find reference to this person.
He is the maintainer of chardet. The main topic of the article is the whole LGPL to MIT rewrite and relicense done by this person.

https://github.com/chardet/chardet/releases/tag/7.0.0

I think the “I maintained this thing for 12 years” weighs a lot heavier than the “and then I even went through the trouble of reimplementing it” before changing it to a license that is more open. Seriously…
There were two other posts about this today on the HN front page:

https://news.ycombinator.com/item?id=47257803

https://news.ycombinator.com/item?id=47259177

I highly recommend reading the post in question first before commenting.
I'm sorry, I don't understand this. I read it in full. If you're referring to the author dismissing GPL, my point is the converse: I think they have missed something and the GPL is the best license, for the reasons I noted.
> Or GPL. Which I’m increasingly thinking is the only license. It requires sharing.

LLM companies and increasingly courts view LLM training as fair use, so copyright licensing does not enter the picture.

I don't think it changes much about licensing in particular. People are going on about how since the AI was trained on this code, that makes it a derivative work. But it must be borne in mind that AI training doesn't usually lead to memorizing the training data, but rather learning the general patterns of it. In the case of source code, it learns how to write systems and algorithms in general, not a particular function. If you then describe an interface to it, it is applying general principles to implement that interface. Its ability to succeed in this depends primarily on the complexity of the task. If you give it the interfaces of a closed-source project and an open-source project of similar complexity, it will have roughly equal difficulty implementing them.

Even prior to this, relatively simple projects licensed under share-alike licenses were in danger of being cloned under either proprietary or more permissive licenses. This project in particular was spared, basically because the LGPL is permissive enough that it was always easier to just comply with the license terms. A full-on GPLed project like GCC isn't in danger of an AI being able to clone it anytime soon. Never mind that it was already cloned under a more permissive license by human coders.