by pcr0 3480 days ago
But Nvidia's proprietary driver is a download right? Why is AMD trying to merge theirs into the kernel?
5 comments

Nvidia's proprietary driver breaks upon every new kernel release, which is why they have a shim. Furthermore, Nvidia can't ship their driver in the official Linux kernel due to copyright issues, and they're forced to handle all the maintenance burden of their driver (whereas AMD reaps the benefits of Intel's GPU driver bugfixing, and vice versa, thus lowering both Intel's and AMD's driver costs on Linux).

Besides, Nvidia's been having trouble with their Tegra GPUs on Android, and as a result has been forced to pitch in a bit on Nouveau (the reverse-engineered open-source Nvidia driver). They're still having trouble with their driver situation on mobile, as a result of their unwillingness to play ball with the kernel.

Actually, that last sentence above - I'm really not too confident on that, I've heard various hearsay but the only source I concretely remember is the "other drivers" section of http://richg42.blogspot.com.au/2014/05/the-truth-on-opengl-d...

> Furthermore, Nvidia can't ship their driver in the official Linux kernel

Nvidia has every ability to ship it, they just refuse to open it.

probably rightfully so, if this is the sort of welcome they'd get.
Opening their driver's code is different from merging it into the kernel source.
There is a huge difference between opening code and merging it into the Linux kernel.
> Nvidia's proprietary driver breaks upon every new kernel release, which is why they have a shim.

ELI5: why does each Linux kernel release break driver code? It can't be THAT hard to just have a stable interface and leave it for long periods of time, e.g. only bumping it on major version bumps in the Kernel?

Because in practice, APIs inside Linux do, in fact, change quite a bit -- and by itself maybe that wouldn't matter so much, but the nvidia driver has an insane amount of surface area on top of it. It's a massive driver. You can imagine then, that breaking it is actually easier than you might think.

There is no rule that kernel interfaces can only change on major bumps. In reality, they change quite frequently as new APIs and drivers are merged in, which requires generalization, refactoring, etc. across API boundaries to keep things sane. Kernel developers specifically reject the notion of a "stable ABI" like this because they feel it would tie their hands, and lead them to design APIs and workarounds for things which would otherwise be fundamentally simple if you "just" break some function and its call sites. By this logic, APIs in Linux tend to grow and die organically, as they are needed.

Why wait 5 years for a "major version bump" to delete an API call when you could just do it today and fix the callers, since they're all right there in the kernel tree? It's far easier and more straightforward to do this than to work around "stable" systems for very long periods of time, which is likely to accumulate cruft.

They also do not care about out-of-tree code: when an API changes, their obligation is to refactor the code using that API inside the kernel, and nothing else. That means the person making the change also has to fix all the other in-tree drivers, too, even if they don't necessarily maintain them. Out-of-tree users will have to adapt on their own.
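
To make "adapt on their own" concrete, here is a toy Python model of the shim arrangement mentioned upthread. It is only a sketch under stated assumptions, not kernel code and not Nvidia's actual code: `kernel_api`, `map_buffer`, `map_buffer_on_node`, `Shim` and `blob_init` are all hypothetical names. The closed "blob" only ever calls the shim, and the small shim is the one piece that has to track the kernel's changing internal calls.

```python
# Toy model of a compatibility shim. Every name here is hypothetical;
# this does not reproduce any real kernel or vendor interface.

def kernel_api(release):
    """Stand-in for the kernel's internal API as of a given release."""
    if release < 2:
        def map_buffer(size):
            return {"size": size, "node": 0}
        return {"map_buffer": map_buffer}

    # From release 2 on, the old call is gone and the new one takes a node.
    def map_buffer_on_node(node, size):
        return {"size": size, "node": node}
    return {"map_buffer_on_node": map_buffer_on_node}


class Shim:
    """Stable interface the blob calls; only this class tracks kernel changes."""

    def __init__(self, release):
        self._release = release
        self._api = kernel_api(release)

    def map(self, size):
        # Translate the blob's fixed call into whatever this release offers.
        if self._release < 2:
            return self._api["map_buffer"](size)
        return self._api["map_buffer_on_node"](0, size)


def blob_init(shim):
    """Stand-in for the closed driver core: it never touches the kernel API."""
    return shim.map(4096)["size"]


if __name__ == "__main__":
    for release in (1, 2):
        print("release", release, "mapped", blob_init(Shim(release)))
```

The blob is unchanged across both releases; the cost is that someone outside the kernel tree has to notice each change and update `Shim.map` by hand.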

This also explains why they do not want a HAL. When a Linux driver interface changes, the person changing it is responsible for changing everything else and fixing other drivers. That means if AMD wants a large change, it may have to go and touch the Intel driver and refactor it to match the new API. If Intel wants something new, they may have to touch the AMD driver in turn. This, in effect, helps reduce the burden and share responsibilities among the affected people.
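
The "whoever changes the interface fixes every in-tree caller" rule can be sketched with another toy Python model. Again, this is an illustration under assumptions, not real kernel code: `release_v1`, `release_v2`, `register_buffer`, `out_of_tree_driver` and `try_load` are made-up names. Each release bundles the API together with its in-tree driver, so they always agree; the out-of-tree driver was written once against v1 and nobody updated it.

```python
# Toy model: an API change lands together with fixes to every in-tree caller.
# All names are hypothetical.

def release_v1():
    def register_buffer(size):
        return {"size": size, "flags": 0}

    def in_tree_driver():
        return register_buffer(4096)

    return register_buffer, in_tree_driver


def release_v2():
    # The API gained a required `flags` argument; the same commit
    # also updated the in-tree driver's call site.
    def register_buffer(size, flags):
        return {"size": size, "flags": flags}

    def in_tree_driver():
        return register_buffer(4096, flags=0)

    return register_buffer, in_tree_driver


def out_of_tree_driver(register_buffer):
    # Written against v1 and shipped separately from the tree.
    return register_buffer(4096)


def try_load(driver, *args):
    """Return "ok" if the driver's call matches the API, else "broken"."""
    try:
        driver(*args)
        return "ok"
    except TypeError:
        return "broken"


if __name__ == "__main__":
    for name, release in (("v1", release_v1), ("v2", release_v2)):
        api, in_tree = release()
        print(name,
              "in-tree:", try_load(in_tree),
              "out-of-tree:", try_load(out_of_tree_driver, api))
```

In this model the in-tree driver keeps working across the change because its call site was fixed as part of it, while the out-of-tree driver fails against v2 until its own maintainer updates it.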

They don't want a HAL because a HAL is a massive impediment to exactly that workflow. If Intel wants to improve a DRM/DRI interface in the kernel for their GPUs, they could normally do so and touch all the other drivers. Out with the old, in with the new. But now, they'd have to also wade through like 50,000 lines of AMD abstraction code that no other system, no other driver, uses. It effectively makes life worse for every graphics subsystem maintainer when this happens, except for AMD I guess, since they can pawn off some of the work. But if AMD plays by the rules -- Intel fixing the AMDGPU driver when they make a change shouldn't be that unusual, or any more difficult compared to any other graphics driver. And likewise -- AMD making a change and having to fix Intel's driver? That's just par for the course.

Obviously Linux isn't perfect here: they do accept, and have accepted, questionable things in the past, and have rejected seemingly reasonable API changes out of fear for stability (while simultaneously not wanting a stable ABI -- which is fair). But the logic is basically something like the above, as to why this is all happening.

AMD's recent strategy has been to try to confine the proprietary stuff to userspace, and to implement an open-source kernel driver that can be used by either the proprietary userspace driver or open-source userspace stack.
It's already part of the kernel; this was a re-architecture of the display portion of the driver.
The amdgpu driver is open source, not proprietary.
Open source != free software.
The difference between open source and free software is mostly in the political camp the term comes from. Read the OSI definition of open source if you don't believe me:

> https://opensource.org/osd-annotated

The reasons why "free software" people don't like the word "open source" are indeed political:

> https://www.gnu.org/philosophy/open-source-misses-the-point....

For software for which the source code is available, but does not give the four freedoms:

> https://www.gnu.org/philosophy/free-sw.html#content

it is common to use the word "shared source" (originally devised by Microsoft):

> https://en.wikipedia.org/wiki/Shared_source

The kernel driver for the proprietary AMDGPU-PRO stack is licensed exactly the same way as the modules in the mainline kernel. Most of them are dual-licensed under MIT and GPL so BSD and other projects can use them.
But it is free software. It resides in the kernel tree: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-st...
https://www.phoronix.com/scan.php?page=article&item=amd_cata...

Note that both the amd and the nvidia kernel modules always have been FOSS because of the GPL license. It's just that nvidia provides it by its own ways, not through the official linux branch, and thus doesn't have to respect linux rules nor to document the driver.

> Note that both the amd and the nvidia kernel modules always have been FOSS because of the GPL license.

The only open-source part of their modules was the shim, while 99% of the driver is contained in a blob. That's true for both Nvidia and ATI/AMD fglrx.