by abharya 2022 days ago
We have to start somewhere. We understand that this is not complete, so we welcome your ideas for discovering such projects. Please suggest any metrics or ways to find them.
4 comments

I strongly recommend you use the packages maintained in Linux distributions as a means for discovery. They're well organized, well maintained, and easily accessible programmatically - you can even parse the package dependencies, and you have full access to the original source code.
good advice. "apt-get install apt-rdepends" and it becomes possible to work out the reverse-dependencies of packages (with "apt-rdepends -r <package>").

by counting the numbers it becomes pretty blindingly obvious what the critical dependencies are. as mentioned in another post above, bash and glibc (libc6 in debian) are obviously high on the list... yet the GNU Project receives an unbelievably low amount of funding despite their critical importance.
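The "counting the numbers" idea can be sketched roughly as follows. This is only a minimal illustration: the `DEPENDS` map is hand-written and hypothetical (real data would be parsed from distribution package metadata), and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical dependency map (package -> direct dependencies).
# In practice this would be parsed from distribution package metadata.
DEPENDS = {
    "bash": ["libc6"],
    "coreutils": ["libc6"],
    "openssh-server": ["libc6", "libpam0g"],
    "sudo": ["libc6", "libpam0g"],
    "git": ["libc6", "bash"],
}


def transitive_deps(package, depends):
    """Return every package reachable from `package` via dependency edges."""
    seen = set()
    stack = list(depends.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends.get(dep, []))
    return seen


def reverse_dependency_counts(depends):
    """Count how many packages depend on each package, directly or indirectly."""
    counts = Counter()
    for package in depends:
        for dep in transitive_deps(package, depends):
            counts[dep] += 1
    return counts


counts = reverse_dependency_counts(DEPENDS)
# Print packages from most to least depended-upon.
for name, n in counts.most_common():
    print(name, n)
```

Even on toy data like this, the packages with little visible activity (the C library, PAM) end up at the top of the list, which is the point being made above.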

likewise, this particular bug in binutils ld, which centres around the incredibly short-sighted "4GB should be enough for anyone" removal of Dr Stallman's memory-resident algorithms in the late 90s, is having some very serious consequences:

https://sourceware.org/bugzilla/show_bug.cgi?id=22831

yet because there's no money, not even from redhat, nobody's looking at it.

likewise: PAM no longer has a proper maintainer, and hasn't had for... a decade?

these are projects that people are relying on yet completely forgetting they're a critical part of the infrastructure!

why? because, just as rhencke said above: they're not on github, they've not got "unnecessary changes" which are counted as "activity to be glorified and worshipped".

abharya: i heard on slashdot the intent to start from github, to exclusively focus on github. this will turn out to be a serious mistake.

There is no exclusive focus, we are just starting somewhere where we can see the various metrics. The plan is to expand to non-github projects and other places (like custom issue trackers), but this is not as straightforward as it sounds. Ideas welcome! https://github.com/ossf/criticality_score/issues/29
I would like to see a measure of criticality that takes the following into account:

* Critical projects may have very little activity/maintenance. For example, Bash 4.0 to Bash 5.0 was only 123 commits over 8 years. But Bash is absolutely a critical project (ask any org about how much work they had to do when affected by https://en.wikipedia.org/wiki/Shellshock_(software_bug) ).

* A measure of criticality should understand _as many of the various forms of dependence on software_ as it can. Dependencies can take many forms, such as:

a package manager resolving a dependency

a user purchasing a mobile phone with software pre-installed

a user visiting a website (react/jquery/etc)

etc.

* Criticality should understand if, how, and when dependencies are updated. For example, fixing a bug in Chrome and distributing that fix to 80% of users in 1 week is feasible. Fixing a bug in Bash and distributing that fix to 80% of users in 1 week is not so feasible.
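One way to picture the three points above is a score in which dependents and update lag count for more than commit activity. The sketch below is purely illustrative: the `Project` fields, the weights, the caps, and the two example projects are all assumptions invented for this example, not anything taken from criticality_score.

```python
import math
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    commits_per_year: float   # activity signal
    dependents: int           # things relying on it, across all channels
    days_to_reach_80pct: int  # how long a fix takes to reach 80% of users


# Hypothetical weights and caps, chosen only for illustration.
WEIGHTS = {"activity": 1.0, "dependents": 4.0, "update_lag": 2.0}
CAPS = {"activity": 5000, "dependents": 1_000_000, "update_lag": 365}


def scaled(value, cap):
    """Log-scale a value into [0, 1], saturating at `cap`."""
    return math.log1p(min(value, cap)) / math.log1p(cap)


def criticality(p):
    """Weighted average of the scaled signals; slow update rollout raises the score."""
    signals = {
        "activity": scaled(p.commits_per_year, CAPS["activity"]),
        "dependents": scaled(p.dependents, CAPS["dependents"]),
        "update_lag": scaled(p.days_to_reach_80pct, CAPS["update_lag"]),
    }
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[k] * signals[k] for k in WEIGHTS) / total_weight


# Made-up figures: a quiet, widely deployed tool vs. a busy, fast-updating app.
quiet = Project("quiet-shell", commits_per_year=15, dependents=500_000, days_to_reach_80pct=180)
busy = Project("busy-app", commits_per_year=3000, dependents=2_000, days_to_reach_80pct=7)

for p in (quiet, busy):
    print(p.name, round(criticality(p), 3))
```

With these assumed weights the quiet project scores higher than the busy one, even though it has a small fraction of the commit activity. Different weights would of course give a different ordering, which is exactly why the choice of signals matters.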

people. as a computer scientist you're probably thinking, "this can be solved by analysing a source code forge" or, "this can be solved by running an algorithm". it can't (or, more to the point: it can tell you quantities, but not quality or value).

i mentioned in another post: github "glorifies" the person and the changes that they make. "look at mee! look at mee! i'm making a commit! i'm wiping my backside now! aren't i great!" which gets you precisely zip in terms of actual strategic value.

changes measure change.

people will tell you - if you let them - by providing you with the information needed to make a qualitative assessment.

so.

provide a type of wiki/website that allows qualitative assessments to be made, on a per-library / per-project basis. then put the metrics (the "criticality value") onto that.

pre-seed that wiki/website with stuff from github if you feel so inclined but DO NOT limit the wiki/website to exclusively github. i repeat again: doing so would be a disastrous mistake.

one of the things that's important is not to include google's lawyers' bias against the GPL license. i always wondered where the bias against the GPL came from, within google, and learned of the existence of the in-house legal team. it turns out that they have been advising google employees for some considerable time, "avoid the GPL, avoid the GPL".

unfortunately, because it is legal advice, those google employees (right the way up to management) do not have the backbone to say, "err no actually, GPL-licensed code is the critically strategically important leveller that forces aberrant companies to collaborate rather than sponge off of underfunded projects".