Hacker News
by kondbg 2010 days ago
Devil's advocate: why should I choose this yet-to-exist distribution over something already existing, such as Oracle Linux?

The most common argument (Oracle is evil and litigious. Therefore, using Oracle Linux will result in me being sued) honestly seems like FUD.

All RHEL downstream distributions rebuild the same SRPMs that RHEL provides. A quick comparison of some common packages (kernel, httpd, openssl, etc.) between CentOS 8.3 (https://vault.centos.org/8.3.2011/BaseOS/Source/SPackages/) and Oracle Linux 8.3 (https://yum.oracle.com/repo/OracleLinux/OL8/baseos/latest/x8...) shows that they are indeed byte-identical (with the exception of certain spec files that include debranding patches).
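A comparison like this can be scripted. What follows is only a minimal sketch, assuming the sources from both distributions have already been downloaded and unpacked into two local directories. The function names and the directory names in the usage note are hypothetical, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            # Reads the whole file into memory; fine for a sketch.
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_digests(left, right):
    """Sort file names into identical, different, and one-sided groups."""
    shared = left.keys() & right.keys()
    return {
        "identical": sorted(n for n in shared if left[n] == right[n]),
        "different": sorted(n for n in shared if left[n] != right[n]),
        "only_left": sorted(left.keys() - right.keys()),
        "only_right": sorted(right.keys() - left.keys()),
    }
```

Usage would look something like `compare_digests(digest_tree("centos-src/httpd"), digest_tree("ol-src/httpd"))`, where both paths are placeholders. Spec files carrying debranding patches would be expected to land in the `different` group.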

What is the value of having a separate RHEL derivative? It isn't as if the "community" can propose/submit any changes, since any change would mean the downstream distribution is no longer a "bug for bug" compatible RHEL derivative. If I actually wanted to participate in the larger RHEL-derivative community, I would need to submit my changes to the CentOS Stream project.

9 comments

> Devil's advocate: why should I choose this yet-to-exist

Devil's response: nobody cares if you do. A lot of people know why they want it; the answer will in many cases be that it will fill the same niche and not be controlled by a shitty company. (If you think calling Oracle shitty is FUD, unprofessional or similar, that's fine: see 'Devil's response', above.)

It will stand or fall on its own, as a result of many different people's choices. For now, it is enough that something is growing in the niche from which CentOS was uprooted.

> Devil's advocate: why should I choose this yet-to-exist distribution over something already existing, such as Oracle Linux?

Because there's a whole ecosystem (HPC and Scientific computing to be exact) which depends on CentOS (not RHEL, not Oracle, not Ubuntu, not Debian) primarily. A CentOS compatible distribution is not some FOSS pride thing.

IBM and RH really landed a sucker punch in this regard.

When you say that they depend on CentOS, are they using something CentOS-specific? CentOS is supposed to be compatible with RHEL (minus the logos/trademarks) and shouldn't have additional fixes or features. ("bug for bug, feature for feature" <= CentOS wording :)). No?

CentOS doesn't have to have a specific feature to be preferred over RH. Being free in both beer and speech is important enough. People (incl. us) install 1000+ server clusters with CentOS. The absence of a licensing fee allows us to buy more servers. The absence of a licensing fee allows "small researchers" to have a verified platform to work with. If you don't have a verified platform, you cannot trust your results.

CentOS carries a legacy from Scientific Linux (which was RH compatible too) and has a lot of software packages developed for/on it. They might ship as a regular .tar.gz or RPM, but they're validated and certified on CentOS. This is enough. Some middleware used in collaborative projects (intentionally or unintentionally) looks for a CentOS signature. Otherwise installations fail spectacularly (or annoyingly, it depends).
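To illustrate why a signature check can break on a rebuild, here is a hedged sketch of two ways an installer might inspect `/etc/os-release`. How any particular middleware actually does this is an assumption; the function names and the sample data are made up for illustration.

```python
import shlex


def parse_os_release(text):
    """Parse KEY=value lines (values may be quoted) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips surrounding quotes
        fields[key] = parts[0] if parts else ""
    return fields


def strict_centos_check(fields):
    """Accept only a system that identifies itself exactly as CentOS."""
    return fields.get("ID") == "centos"


def tolerant_centos_check(fields):
    """Also accept systems that declare CentOS in ID_LIKE."""
    ids = {fields.get("ID", "")} | set(fields.get("ID_LIKE", "").split())
    return "centos" in ids


# Illustrative os-release content for a hypothetical rebuild.
SAMPLE_REBUILD = 'ID="examplelinux"\nID_LIKE="rhel centos fedora"\nVERSION_ID="8.3"\n'
```

With the sample above, the strict check rejects the rebuild while the tolerant one accepts it, which is the kind of difference that decides whether an installation fails.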

I have to run my own application on every platform with a relatively simple test suite which checks results against ground truth values accurate to 32 significant digits. If these tests fail for any reason, then I can't trust my application's results for a particular problem. My code runs fast and it's relatively simple (since it's young). Some software packages' tests can run for days. It's not feasible to re-validate software every time it's compiled against a different set of libraries, etc. CentOS provides this foundation for free.
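A check of this kind might look like the sketch below. It is only an illustration under assumptions: the tolerance, the function names and the use of pi as a reference value are mine, not taken from the test suite described above. The reference is stored as a 32-significant-digit string, and results are compared by relative error.

```python
import math
from decimal import Decimal, localcontext


def relative_error(computed, reference):
    """Relative error of a float against a nonzero decimal-string reference."""
    with localcontext() as ctx:
        ctx.prec = 50  # enough working digits for 32-digit references
        ref = Decimal(reference)
        return abs(Decimal(computed) - ref) / abs(ref)


def validate(results, references, rel_tol="1e-15"):
    """Return the names of results whose relative error exceeds rel_tol."""
    tol = Decimal(rel_tol)
    return [
        name
        for name, ref in references.items()
        if relative_error(results[name], ref) > tol
    ]


# Hypothetical ground truth: pi to 32 significant digits.
REFERENCES = {"pi": "3.1415926535897932384626433832795"}

failures = validate({"pi": math.pi}, REFERENCES)
print("failed checks:", failures)
```

A non-empty list means the build on that platform can't be trusted for the affected problems, whatever the cause (compiler, libm, or anything else underneath).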

Thanks for your explanation.

I think I understand your point of view a little better. CentOS became so important for the HPC community that most software is now validated against it. So even if RHEL itself were to become free (as in beer), people wouldn't switch to it (or would at least be reluctant to).

Exactly.

All my personal systems are Debian; however, when I install something research related, it's always CentOS. There's no question. I even manage a couple of research servers at my former university. They're CentOS as well.

Moreover, service (web, git, documentation, etc.) servers are CentOS too, to keep systems uniform even if there's no requirement. So it powers the whole ecosystem, not just the compute foundation. That's a big iceberg.

In 2020, why aren't you packaging your apps as containers? Yeah, it sucks that IBM killed CentOS, but depending on some single distro's version of libm or libc or whatever is not their fault, it's yours. Doing your job properly in this case means shipping your deps with your application, and the easiest way to do that these days is with containers.

Christ; it's either the 90s or kindergarten..

Assuming, based on GP, that this is in an HPC environment, there is often a delineation between the people writing HPC software and the people maintaining the clusters and the software installed on them. Telling a brand-new graduate student with zero software development experience to just throw everything into a container results in running code that is not optimized for the hardware it's running on, which in turn negatively impacts the other users competing for compute time on HPC clusters.

There is a movement to incorporate technologies like Singularity into the HPC workflow, but for established projects it often looks like a lot of bikeshedding for negative results compared to just running the code on bare metal.

Because a cluster doesn't work like a normal computer.

Your users don't see the nodes. They submit jobs and wait for their turn in the cluster. A sophisticated resource planner / job scheduler tries to empty the queue while optimizing job placement so that system utilization is as high as possible.

Also, users' jobs run under their own user accounts. You need to isolate them. Giving them access to Docker or any root-level container engine completely removes the UNIX user security and isolation model; it's running in Windows 95 mode. This also compromises system security, since everyone is practically root at that point. Singularity is user-mode and its usage is increasing, but then comes the next point.

Performance and hardware access are critical in HPC. GPUs and special HBAs like InfiniBand require direct access from processes to run at their maximum performance, or to work at all. GPU access is much more important than containerizing workloads. Docker GPU support is here because NVIDIA wanted to containerize AI workloads on DGX/HGX systems. These technologies are maturing on HPC now.

On the performance front, consider the following: if the main loop of your computation loses a second due to these abstractions, and that loop runs thousands of times per core on many nodes, the lost productivity is eye-watering. My simple application computes 1.7 million integrations per second per core. So, for working on long problems, increasing this number is critical.
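To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it (one second of overhead, 1000 iterations, 32 cores per node, 100 nodes) is an assumption chosen for illustration, not a measurement from any real cluster.

```python
def lost_core_hours(overhead_s, iterations, cores_per_node, nodes):
    """Core-hours lost when every loop iteration on every core pays overhead_s."""
    core_seconds = overhead_s * iterations * cores_per_node * nodes
    return core_seconds / 3600


# Assumed values: 1 s overhead, 1000 iterations, 32 cores/node, 100 nodes.
hours = lost_core_hours(1.0, 1000, 32, 100)
print(f"{hours:.2f} core-hours lost")  # 888.89 core-hours lost
```

That is for a single job. Multiply by the number of jobs in the queue and the cost of even a small per-iteration penalty becomes obvious.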

Last but not least, some of the applications run on these systems have been in development for 20 years now. So, these applications are not some simple code bases which are extremely tidy and neat. You can't know/guess how these applications behave before running them inside a container. As I've said, you need to be able to trust what you have too. So, we scientists and HPC administrators tend to walk slowly but surely.

Doing my job properly on the HPC side means my cluster works with utmost efficiency and bulletproof user isolation so people can trust the validity of their results and integrity of their privacy. Doing my job properly on the development side means that my code builds with minimum effort and with maximum performance on systems I support. HPC software is not a single service which works like a normal container workload. We need to evolve our software to run with minimum problems with containers and containers should evolve to accommodate our workloads, workflows and meet our other needs.

Cutting-edge technology doesn't solve every problem with the same elegance. Also, we're not a set of lazy academics or sysadmins just because our systems work more traditionally.

I find it interesting that the argument that "X is FUD" is supposed to carry weight.

It's a bit like if I'm at a party, and I briskly walk up to five people and hit each of them in the face, and then it's your turn and you move away, and I say "what? the idea that I would hit you is FUD".

It's not FUD. It's a pattern of behaviour.

Avoiding overly litigious companies - where other as-good or better choices exist - is not overly cautious, it's just good sense. Where other as-good choices do not exist, it seems perfectly reasonable (depending on your risk profile) to work with others to create the better choice.

Of course, I say all this as someone who has worked in massive multinational corporations and now works in small startups. I'm now likely never going to use Rocky Linux for exactly the reason you've hinted at - in effect, it is not a use case either of us cares about. But for those people who do need this, I'm very happy that someone has championed the cause.

I haven't seen "using Oracle Linux will result in me being sued".

What I've seen is "Oracle is evil", "don't trust Oracle", and something like "my prior history around Oracle has left such a lasting bad taste that I throw up a little in my mouth every time I touch something with Oracle in it, so I'd rather do almost anything but use something from Oracle, since using it on the daily would inevitably lead to permanent esophagus damage."

I mean... Oracle buying up MySQL was enough for MariaDB to be created and move to being the default. (well, and some of what Oracle did right afterwards).

In an earlier thread, some Oracle guy (not on the Oracle Linux team) mentioned that Oracle 8 actually builds from CentOS 8, rather than RHEL 8. I was a bit skeptical, since OL 8 usually releases much earlier than CentOS 8, but couldn't verify things either way. Someone else mentioned that RH actually only releases RHEL 8 sources through CentOS 8 sources. Again, I don't know how to verify this, but if true, these claims raise a lot of new questions about Oracle Linux 8 given the recent CentOS 8 announcement.

RHEL sources can be retrieved in four ways:

1. On an entitled system, enable the source repos and download the packages.

2. In your account online, you can download the SRPMs for individual packages.

3. In your account online you can download a minor version release iso of the SRPMs.

4. You can use https://git.centos.org to clone the actual RPM patches/spec files, and use the get_source.sh script from the centos-git-common repo to pull the package source tarballs from dist-git (useful for projects like the kernel that don’t use actual upstream as their source).

With CentOS Stream (particularly C9S, which will be launching mid 2021) and the switch over to GitLab, which will happen in the future, everything will be out in the open in git form.

RHEL is open source and always will be.

Oracle Linux might change the rules in the future. It kind of just happened with CentOS :)

Well, the same thing might happen with Rocky Linux as well :)

“Anything can happen” is not an argument. It’s about quantified risk.

Yes, and the 2 parties have very different motivations.

Yes, but then somebody would make Rocky-2

It's unlikely they would sabotage their only competitive advantage though, whereas Oracle has lots of reasons to maintain an enterprise linux distribution besides just succeeding CentOS.

> Devil's advocate: why should I choose this yet-to-exist distribution over something already existing, such as Oracle Linux?

Because you want what CentOS was and this is basically going to be what CentOS was. Different name, different people, but same principle.

Theoretically Oracle Linux and CentOS are identical apart from the branding, except that CentOS has been abandoned and Oracle is just getting started with OL.

But Oracle are the company we probably trust the least with matters like this.

But Oracle Linux has a different principle really, no? I never think of Oracle and think "Making the paid software free".

It just seems weird to willingly associate with a company that is trying to outlaw the very practice that gave birth to GNU/Linux in the first place.

Why not use SUSE then? Oracle is the last option that I would ever trust. They always have something up their sleeve.