Hacker News
by 8organicbits 770 days ago
We're seeing an uptick in open source projects getting relicensed to non-open licenses. Sometimes a project is successfully forked and the userbase shifts; other times not.

One theory of mine is that we can measure the risk that a project will be relicensed by looking at things like diversity of contributors, trademark ownership, contributor agreements, and license terms. Low risk projects include the Linux kernel (GPL, DCO) [1]. High risk projects include Kubernetes (Apache, CLA) [2].
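To make the idea concrete, here is a rough sketch of how those four signals could be combined into a score. This is not how Relicensing Monitor actually computes anything; the field names, weights, the 0.8 threshold, and both example projects are made up for illustration.

```python
from dataclasses import dataclass

# Assumption: a small, non-exhaustive set of SPDX-style copyleft identifiers.
COPYLEFT = {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "LGPL-2.1", "LGPL-3.0", "MPL-2.0"}


@dataclass
class Project:
    name: str
    license_id: str                 # SPDX-style license identifier
    has_cla: bool                   # a CLA grants one entity the right to relicense
    top_contributor_share: float    # fraction of commits from the largest copyright holder (0..1)
    trademark_held_by_vendor: bool  # trademark owned by a single company


def relicensing_risk(p: Project) -> int:
    """Return a hypothetical risk score; higher means easier to relicense."""
    score = 0
    if p.license_id not in COPYLEFT:
        score += 1  # permissive terms do not force derivatives to stay open
    if p.has_cla:
        score += 2  # one party can relicense without asking contributors
    if p.top_contributor_share > 0.8:
        score += 2  # few outside copyright holders would need to consent
    if p.trademark_held_by_vendor:
        score += 1  # a fork would have to rename itself
    return score


# Two invented projects, not real measurements.
community = Project("community-tool", "GPL-2.0", False, 0.10, False)
vendor = Project("vendor-tool", "Apache-2.0", True, 0.90, True)

print(community.name, relicensing_risk(community))
print(vendor.name, relicensing_risk(vendor))
```

A plain additive score like this ignores who holds the CLA and what their incentives are, so treat it as a starting point for discussion only.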

If this trend continues, developers will need to get a better understanding of how relicensing works and may decide to avoid contributing to projects with elevated risk.

[1] https://alexsci.com/relicensing-monitor/projects/linux/

[2] https://alexsci.com/relicensing-monitor/projects/kubernetes/

7 comments

I'm not sure how Kubernetes is high risk, given the CLA is to CNCF. Similarly, CLAs to the Apache Foundation, the FSF or similar are probably pretty safe (in that they have a long-term interest in being good custodians of the IP), and could be safer than projects that lack a CLA but have no (or only a few) outside contributors.

To me, the obvious questions are who owns the IP, and what are their incentives to maintain the current licensing.

This is a good critique. Measuring intent of an organization may be difficult to do methodically and impartially, so it's not currently covered. Personally I was surprised to see Redis change license after Redis Labs promised not to change the license. I think that promise was made with good intent but overwhelming financial pressure that emerged later on swayed them.
I'm pretty sure most instances of relicensing have had a previous claim that it wouldn't happen, so I wouldn't assign too much weight to that (if anything, it should be a red flag to look into what the IP situation is).

I think there are a bunch of questions you can ask:

* Why is the software open source (if licensing/contractual requirements make it so, that's more likely to keep the status quo vs. corporate claims of "we <heart> open source")?

* Who owns the copyright/IP (and what's their reputation)?

* What would happen if the license changes (is there an ecosystem that relies on it being open source, or is it a black box)?

* Who cares what the license is (e.g. BerkeleyDB was relicensed, which got old versions frozen in linux distributions, so no-one upgraded to newer versions, and replacements were written)?

I think you need to rework your algorithm. Kubernetes is in no way a high risk project; its IP is owned by the CNCF/Linux Foundation.
Dual licensing also makes it IMHO less likely that a project "continues as proprietary". Example: Qt.

I think "contributor agreements" are the biggest red flag. Though I like them for potentially upgrading a license (say from GPLv2 to v3), not that this always is a good thing.

It's also worth mentioning the specific agreement between KDE and Qt (https://kde.org/community/whatiskde/kdefreeqtfoundation/ and https://www.qt.io/faq/3.2.-why-do-you-have-an-agreement-with...), which shifts the incentives as well.

You don't need contributor license agreements for upgrading to future versions of the GPL. You can just license the code under 'GPL version N or later': https://www.gnu.org/licenses/license-compatibility.html

Regarding Kubernetes and the Apache license, Apache license 2.0 has to be one of the most business friendly licenses around? It's widely used and understood, no requirement to open source changes, automatic patent license for any patents the software uses included. If the corporate lawyer says no to that, what do they say yes to?

> Apache license 2.0 has to be one of the most business friendly licenses around?

Yes, in my experience it is.

Permissive licenses like Apache, MIT, and BSD are easiest for the corporate lawyers to approve but also easy for the project owner to relicense. Relicensing Monitor isn't measuring how easy it is for companies to use the software; risk is solely measuring how easy it is to relicense the software.

Copyleft licenses are lower risk than permissive licenses in this specific context as they are viral. A CLA or a very small number of contributors can negate that, as happened with Emby [1].

SourceGraph is probably the best example here (I need to add them still). They switched off Apache 2 and prompted this [2] helpful blog post.

[1] https://alexsci.com/relicensing-monitor/projects/emby/

[2] https://drewdevault.com/2023/07/04/Dont-sign-a-CLA-2.html

I wonder how accurate this assessment is since the Linux Foundation is a non-profit.
The good news is that projects that prevent forking from happening usually don't have huge OSS communities of contributors because of their attitude towards outside contributors. You need an outside community to be able to step up and take over for a fork to happen.

Mostly, copyright ownership transfer is not a thing in OSS communities because it strongly discourages third parties from contributing. Copyright transfers are only needed with some licenses (GPL-style licenses that insist everything else is licensed the same way) and cannot prevent a retroactive fork even if you have them. Other licenses allow distributing mixed-license code, and you can just create a commercial source distribution for those because the license explicitly allows that. Either way, anyone with the pre-license-change version of the code can fork. That's why Elastic, which used the Apache license and had copyright transfers, got forked.

The more widely used an OSS project is, the more likely it is that somebody will fork it if it is relicensed, because wide use usually means lots of external contributors and plenty of interest from wealthy companies that depend on it. That means the skills and money needed to fund the fork are available. Copyright transfers don't stop this from happening. Unless you specifically want to fire most of your user base, relicensing just doesn't make any sense from a business point of view.

A failure to fork basically indicates the project didn't have a strong developer community and big companies simply didn't care about the project.

I consult some clients on Elasticsearch and Opensearch. Most of my recent clients now default to Opensearch. Because it's the OSS option. They are clearly spending money to get support (from me and others) but Elastic isn't getting any. As far as I can see, Opensearch now represents the vast majority of new users and is becoming a significant source of money for hosting, training, and consulting. But Elastic is getting none of that.

My guess is that the industry will learn from the repeated relicensing, forking, and subsequent community splits that have been happening: Elastic, Redis, OpenTofu, CentOS, etc. The pattern is the same every time: 1) the project gets relicensed, 2) a few weeks later a consortium of companies pools resources and forks, 3) most users stick with open source and the company cuts itself off from those users.

Long term, I would not be surprised to see some of those companies offering support for their OSS forks (in addition to their commercial offerings) or even reverting the license change. This would make a lot of sense for e.g. Elastic as there's a lot of duplicated effort between them and Amazon. And Amazon gets a lot for free from outside contributors.

I’m not sure why you’re being downvoted; this is happening a lot right now, and it’s a real risk when contributing to or using an open source project.

I also had the same thought to create some sort of risk metric that could be applied to projects, but I do think your initial metric is lacking some criteria. Foundations like the CNCF and ASF have to be among the lowest risk, and CLAs can be more or less harmful depending on their specific content. I think a big red flag has to be if they’ve taken any VC or PE funding.

However I think the principle of taking this risk more seriously is good and important.