| HN Mirror



	by vvanders 3480 days ago
	My naive assumption is that code reuse across platforms is a good thing, I'd love to understand why this isn't the case here or what the concrete arguments are against it.

8 comments

AnthonyMouse 3480 days ago

> My naive assumption is that code reuse across platforms is a good thing, I'd love to understand why this isn't the case here or what the concrete arguments are against it.

A driver is inherently platform-specific. It's glue that ties the hardware to the operating system. The only "correct" way to have one driver work on multiple operating systems is for the operating systems to all use the same driver model.

The ugly way is to create your own hardware abstraction layer and then write a translation layer between that and each operating system. It's ugly because the result is complicated and hideous.

But it's especially silly because Linux accepts suitable contributed code, so you could instead use the native Linux model as your "intermediary layer" and fix Linux if it isn't suitable in some way. And then translate that to what the closed operating system you can't modify uses.

The result is that the Linux people are happier and you have one less translation layer to maintain.
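As a rough illustration of that layout, here is a toy model in Python (not real driver code; every class, method, and function name is hypothetical and stands in for no actual kernel API). The hardware-specific logic is written once against the open OS's native model, and a single shim adapts it to the closed OS:

```python
from typing import Protocol


class NativeKernelApi(Protocol):
    """Hypothetical stand-in for the open OS's native driver model."""

    def map_registers(self, base: int) -> str: ...


class OpenKernel:
    """Toy 'open' kernel: the driver talks to this model directly."""

    def map_registers(self, base: int) -> str:
        return f"open-kernel mapping at {base:#x}"


class ClosedKernel:
    """Toy 'closed' kernel with a different interface we cannot modify."""

    def map_io_window(self, base: int, length: int) -> str:
        return f"closed-kernel mapping at {base:#x} (+{length})"


class ClosedKernelShim:
    """The only translation layer: native-style calls -> closed-kernel calls."""

    def __init__(self, kernel: ClosedKernel, window: int = 0x1000) -> None:
        self._kernel = kernel
        self._window = window

    def map_registers(self, base: int) -> str:
        return self._kernel.map_io_window(base, self._window)


def probe(api: NativeKernelApi, base: int = 0xF000) -> str:
    """Hardware-specific logic, written once against the native model."""
    return api.map_registers(base)


def layers_to_maintain(num_os: int, own_hal: bool) -> int:
    """Count translation layers: one per OS with a private HAL,
    one fewer when the open OS's own model is the intermediary."""
    return num_os if own_hal else num_os - 1
```

Here `probe(OpenKernel())` needs no translation at all, while `probe(ClosedKernelShim(ClosedKernel()))` goes through the one shim. With two operating systems, `layers_to_maintain` gives 2 for the private-HAL design and 1 for this one.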

comex 3480 days ago

That might run into license issues. If you want to avoid licensing the other versions of your driver under GPLv2, you'd have to carefully avoid copying any code from the main kernel into your translation layer (rewriting any helper functions you end up using), and even then there's the idea of API copyright to contend with.

One might ask whether it is desirable to avoid the GPL, and there are a lot of arguments on both sides there, but it's certainly easy to run into issues when you have a GPL licensed module designed to be linked into a proprietary program (kernel).

AnthonyMouse 3479 days ago

> If you want to avoid licensing the other versions of your driver under GPLv2, you'd have to carefully avoid copying any code from the main kernel into your translation layer (rewriting any helper functions you end up using), and even then there's the idea of API copyright to contend with.

Isn't the point supposed to be to not have other versions of your driver, so you can use the same one on every platform?

comex 3479 days ago

By "other versions" I mean the codebase used for a given non-Linux platform, which would (hypothetically) include most of the Linux driver's code plus a translation layer from Linux APIs to that platform's.

AnthonyMouse 3479 days ago

The translation layer isn't where the interesting bits are. The parts hardware companies want to keep secret are the hardware-specific parts, not the OS-specific parts. It might even help them to open source the translation layers because then others could potentially use them and shoulder some of the maintenance cost.

I can't speak to the legal status of GPL drivers for Windows, but several seem to exist already (e.g. Windows ext4 driver), and if they were actually worried about it they could always get explicit permission from the copyright holders of the relevant code. Either they say yes and you're fine, or they say no and you know which pieces of code to replace.

EpicEng 3479 days ago

> But it's especially silly because Linux accepts suitable contributed code, so you could instead use the native Linux model as your "intermediary layer" and fix Linux if it isn't suitable in some way. And then translate that to what the closed operating system you can't modify uses.

But Linux represents a tiny portion of the gaming community, so that approach would make no sense at all for a GPU vendor. C'mon.

aseipp 3479 days ago

Then they aren't going to get their driver upstream. End of story. Kernel developers have already done this once (Dave hinted at Exynos drivers in the past in his other posts) and it was a large amount of work to un-screw the pooch once all this crap came along.

I know that Linux people really really just want the kernel to take one for the team so they can have GPUs because that's just the goal, and clearly the goal is good and the means don't matter at all and everything else is irrelevant. 100,000 lines of crap code, 200k? 500k? Who cares, it's all in the name of GPUs clearly. It's obviously worth it no matter what.

But the kernel developers do not see it that way, and for good reason -- because once it's in tree, they are all on the hook for it and they all have to deal with the swamp, the added complexity, the maintenance, the un-fucking of this entire HAL, etc etc.

Having worked on a large open source project, I can assure you, it sucks when you have to say "This isn't acceptable and we aren't merging it", even when it's a feature the users want, and one someone worked on for a long time. It is also, almost always, the right thing to do in the long run (and several of those features did come back, in acceptable ways, in our case).

AnthonyMouse 3479 days ago

> But Linux represents a tiny portion of the gaming community, so that approach would make no sense at all for a GPU vendor. C'mon.

The growth market for GPUs is GPGPU and servers. And Linux represents a large portion of the programming and server communities.

More to the point, as soon as you support Linux at all then it doesn't matter who has more share, it's still less work to do the above than have to maintain another translation layer.

freeone3000 3479 days ago

But AMD doesn't. GPGPU is already supported on NVIDIA drivers with their opaque blob. AMD has a more-transparent blob. People who want this to work already have a solution. This kernel change is probably important to some people, but those who simply want to run a GPGPU cluster on Linux already have workable solutions.

AnthonyMouse 3479 days ago

The GPGPU market is the polar opposite of the gaming market.

Game developers might like to see clean driver source but they don't get to choose what kind of GPU their customers have already bought. And 99% of gamers are not going to choose their GPU based on Linux drivers. So nobody has any leverage and vendors have no incentive to change.

Meanwhile thousands of universities and institutions are each going to be looking for 25,000 GPUs and they can choose what brand they buy based on what makes their internal developers happy. Hosts like Amazon and Google are each going to be buying millions of GPUs, and having better and more transparent drivers so they can more easily e.g. improve power consumption by a small percentage, can save them a million dollars/year in electricity.

Someone like Google could come to each vendor and say "first to have mainline kernel drivers gets all our business" at any point. Or the same result in the other order; once there are clean drivers third parties are more likely to make power consumption and performance improvements that give AMD the edge when the major customers crunch the numbers.

There is a significant competitive advantage in it for AMD to get this right.

toxik 3479 days ago

Very good point. There's definitely a growing market for high-bandwidth GPGPU solutions, and neural networks are probably just the start.

nindalf 3480 days ago

I agree with you almost entirely, except the part about fixing Linux. If the abstraction that Linux provides isn't suitable for some reason, it probably isn't straightforward to change it because of compatibility with existing code.

caf 3480 days ago

That's not so much a concern within the kernel boundary, which is the case that applies here. If you have a compelling reason to redesign an internal API, you "just" have to fix up all the code across the tree that consumes it. Changes are regularly made to the internal VFS interfaces, for example.

wtallis 3480 days ago

It's also often the case that kernel-driver interfaces are extended without breaking compatibility. In those cases, you want to ensure that the extensions are suitable for more than one driver to consume.

posterboy 3480 days ago

Or future changes in the Linux target need to be translated to all other target wrappers.

0xcde4c3db 3480 days ago

The problem isn't that sharing code across platforms is bad, it's that not sharing code within Linux is bad. Airlie is basically saying that if the kernel API and subsystems are somehow inadequate, AMD should improve them directly instead of covering them up with a bunch more code.

wolfgke 3480 days ago

> Airlie is basically saying that if the kernel API and subsystems are somehow inadequate, AMD should improve them directly instead of covering them up with a bunch more code.

And you really believe that the maintainers will accept a giant patch that changes the API and subsystem completely (though into something better) and risks causing lots of regressions in existing drivers? And you believe that AMD is supposed to fix all the regressions that this change causes in other vendors' drivers?

brongondwana 3480 days ago

Of course not, the maintainers will accept a well thought out series of patches that each make one small logical change towards the better interface.

And yes - who else is supposed to fix all the regressions caused by changes that AMD wants? Volunteers who would rather work on something else? If you want a change, you get to support the regressions - and if AMD's work gets merged, then anyone ELSE who wants to make a change in that page needs to support AMD's regressions.

Hence wanting to make sure that the changes from AMD are manageable and flexible enough to allow further changes.

wolfgke 3480 days ago

> If you want a change, you get to support the regressions

And what about a change to a stable internal kernel API, which the kernel developers refuse?

pas 3479 days ago

No, they just want a stable future, around which they can plan the API, but so far no one has delivered on that tiny requirement.

prodigal_erik 3479 days ago

Linux got where it is by evolving how the kernel and drivers interact whenever needed, without waiting to coordinate with outsiders and their closed work.

the_why_of_y 3479 days ago

The Linux kernel does not have stable internal kernel APIs.

https://www.kernel.org/doc/Documentation/stable_api_nonsense...

wolfgke 3479 days ago

This is exactly my point.

toast0 3480 days ago

(without looking at details) The problem is that Windows and Linux expose hardware and drivers in different ways. You can shim things up to make the code work, but ending up with a driver that doesn't look like a Linux driver, doesn't work like a Linux driver, and can't easily be maintained by people working on the Linux graphics drivers is going to be a problem.

If the driver doesn't really belong in the Linux kernel source for those reasons, it's better to keep it outside the kernel tree.

Qwertious 3480 days ago

AIUI, the problem is

code re-use between drivers of different vendors but the same kernel/OS,

VS

code re-use between drivers of the same vendor but different kernels/OSes.

At the end of the day, both sides are arguing for code re-use, of sorts.

justincormack 3480 days ago

The open source developers don't care about invisible code reuse in a closed source driver. HALs across open source codebases do exist too (e.g. for ZFS), but Linux in particular does not like them.

cpeterso 3480 days ago

AMD should move their HAL code into their Windows driver, making it a superset of the Linux driver. AMD would get to reduce driver code duplication and Linux kernel developers wouldn't need to merge AMD's ugly Linux/Windows HAL.

wolfgke 3480 days ago

> AMD should move their HAL code into their Windows driver, making it a superset of the Linux driver.

This might theoretically make sense if the Linux subsystem was very stable over many years. In practice, the Windows interfaces are the ones that have been a lot more stable over the years, and changes to them are communicated well in advance so that hardware vendors can start updating their drivers early.

Kadin 3479 days ago

Regardless of one's thoughts on AMD, I think this is broadly true. Microsoft may do a lot of things poorly, but one thing they are good at (arguably, the only thing they're good at, hell maybe the key to their success, really) is maintaining compatibility and not breaking stuff.

mixedCase 3479 days ago

This is explained here: https://www.kernel.org/doc/Documentation/stable_api_nonsense...

Linux maintains compatibility by having its developers fix the drivers themselves when they break an interface. Microsoft cannot (actually, can, and does) break its interfaces, since it doesn't control the drivers.

This allows Linux to keep improving without breaking things in production; while Microsoft has to either maintain huge backward compatibility abstractions for changes, go YOLO and break stuff (often unknowingly) or abstain from improving their OS.

tremon 3480 days ago

It is a good thing. For the developers of that piece of code (AMD in this case).

However, it is introducing a second API for a very specific subset of hardware into a kernel that is being developed by not just AMD people. Dave Airlie is rightly saying that the second API, and hence two different code structures, make the whole DRI infrastructure harder to maintain for everyone else.

And Dave's responsibility is to everyone else, not to AMD.

din-9 3480 days ago

It is a good thing for the driver writer as they have less difference between their targets.

It is a bad thing for the targets as they implement both the driver functionality and the abstractions required to make the same code work cross platform. The response linked describes the cost of those abstractions to the target (Linux kernel in this case).

tlow 3480 days ago

I believe this link is the "concrete argument" against a unified abstraction layer in this particular instance.