by blibble 1218 days ago
obviously, but it allows delegation of trust onto other systems (like the DNS)

example: the package named "aws" on pypi was created by some random guy and has been abandoned for years

if pypi/pip supported namespacing that would be info.randomdude.aws instead

and amazon's packages would be under com.amazon
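
To make the naming scheme above concrete, here is a minimal Python sketch of reverse-DNS qualification. pip and PyPI have no such feature; the helper name `qualified_name` and its normalization rules are assumptions made purely for illustration.

```python
def qualified_name(domain: str, package: str) -> str:
    """Build a reverse-DNS package name, e.g. amazon.com + aws -> com.amazon.aws.

    Hypothetical helper: neither pip nor PyPI implements this scheme.
    """
    # Normalize case, drop a trailing root dot, and split into DNS labels.
    labels = [part for part in domain.strip().lower().rstrip(".").split(".") if part]
    if len(labels) < 2:
        raise ValueError(f"expected a domain with at least two labels, got {domain!r}")
    # Reverse the labels so the most general part (the TLD) comes first.
    return ".".join(reversed(labels)) + "." + package.strip().lower()


print(qualified_name("randomdude.info", "aws"))  # info.randomdude.aws
print(qualified_name("amazon.com", "aws"))       # com.amazon.aws
```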

not being able to namespace internal packages is another security issue that is substantially improved with proper namespacing
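
One way namespaces could help with internal packages is by letting a resolver pin a whole prefix to a private index, so a lookalike upload on the public index is never considered. The sketch below assumes a hypothetical routing table; the `com.example.internal` prefix and the internal URL are invented, and this is not how pip resolves packages today.

```python
# Hypothetical routing table: namespace prefix -> index URL.
# The internal prefix and URL are invented for illustration.
INDEX_BY_NAMESPACE = {
    "com.example.internal": "https://pkgs.internal.example.com/simple/",
    "": "https://pypi.org/simple/",  # default for everything else
}


def index_for(name: str) -> str:
    """Return the single index allowed to serve `name` (longest matching prefix wins)."""
    best = ""
    for prefix in INDEX_BY_NAMESPACE:
        # Match on whole labels only, so "com.example.internalx" does not match.
        matches = name == prefix or name.startswith(prefix + ".")
        if prefix and matches and len(prefix) > len(best):
            best = prefix
    return INDEX_BY_NAMESPACE[best]


print(index_for("com.example.internal.billing"))
print(index_for("requests"))
```

Because each name maps to exactly one index, there is no fallback from the private index to the public one for internal names.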

to be blunt: not supporting it at this point is reckless and irresponsible

(I note you're part of pypa!)

DNS isn't a particularly secure root of trust; Java is somewhat unique among package ecosystems for picking it as its trust anchor.

It also just kicks the can down the road: Amazon is the easy case with `com.amazon`, but it isn't clear a priori whether you should trust `net.coolguy.importantpackage` or `net.cooldude.importantpackage`. These kinds of trust relationships require external communication of a kind that package indices are not equipped to supply, and should not attempt to solve haphazardly.

> (I note you're part of pypa!)

I am a member of PyPA, but I don't represent anyone's opinions but my own. It's a very loose collection of projects, and it would be incorrect to read a general opinion from mine.

I will note even namespaces for package management that don’t use DNS are a big step up over none.

For example in PHP/composer/packagist and node/npm they just have a vendor name that can be reserved.

It makes it very easy to distinguish “this package is from the (trusted vendor name here)” and prevents issues with namesquatting.

> Amazon is the easy case with `com.amazon`, but it isn't clear a priori whether you should trust `net.coolguy.importantpackage` or `net.cooldude.importantpackage`

this is a classic example of not letting perfect be the enemy of good

there is no perfect solution, there never will be

piggybacking off of DNS works extremely well for Java and Go (and the tooling is a pleasure to work with)

meanwhile Python continues to be a complete disaster

I agree there is no perfect solution. But I want a good solution, and I disagree that DNS is a good one.
I look forward to another 20 years of no progress!
Your cynicism isn't warranted: we've made significant improvements to PyPI over the last 4 years[1][2], and I'm currently working on additional features that will make secure publishing to PyPI easier[3]. We're also working on a codesigning implementation for PyPI, based on Sigstore[4].

Security needs to be evidence- and outcome-driven, first and foremost. That takes a while, but improved outcomes make it worth it.

[1]: https://pyfound.blogspot.com/2019/06/pypi-now-supports-two-f...

[2]: https://pythoninsider.blogspot.com/2019/07/pypi-now-supports...

[3]: https://github.com/pypi/warehouse/issues/12465

[4]: https://www.sigstore.dev/

> That takes a while, but improved outcomes make it worth it.

meanwhile the integrity of the supply chain continues to be compromised

> Your cynicism isn't warranted

it is: the python packaging situation is worse today than it was when I started writing Python in 2005

the legions of meetings, grandiose titles, conferences and mountains of unreadable proposals have produced tooling that is objectively worse than what Maven offered close to two decades ago

I like the way golang handled this. Imports are the URL to the resource. No central distribution mechanism at all. In the past few years they implemented an optional caching layer, so a dependency going offline doesn't necessarily mean that it is unavailable anymore.
Who's to say mr randomdude won't claim com.amazon first?
Let's Encrypt solved this by doing a proof of control over the domain name, and in an automated way.

PyPI could do this. Or, they could require that someone demonstrate proof of ownership for a namespace by signing it with a certificate tied to the domain name (so you couldn't claim the com.bigco namespace without having the certs, which you can't get without owning that domain). There could even be signature requirements/proof for each package and/or version uploaded.
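
A rough sketch of what an automated proof-of-control check might look like, similar in spirit to ACME's DNS-01 challenge: the index hands out a random token and the claimant publishes a value derived from it in a TXT record. Everything here is an assumption for illustration: the `_pypi-namespace-challenge` label, the function names, and the account-binding scheme are invented, and the DNS lookup is injected as a plain callable so the example runs without network access.

```python
import hashlib
import hmac
import secrets
from typing import Callable, Iterable

# Hypothetical record label; a real index would choose and document its own.
CHALLENGE_LABEL = "_pypi-namespace-challenge"


def new_challenge() -> str:
    """Random token the index asks the claimant to publish in DNS."""
    return secrets.token_urlsafe(32)


def expected_txt_value(token: str, account: str) -> str:
    """Bind the token to the claiming account so another account cannot reuse the record."""
    return hashlib.sha256(f"{account}:{token}".encode()).hexdigest()


def domain_is_controlled(
    domain: str,
    token: str,
    account: str,
    lookup_txt: Callable[[str], Iterable[str]],
) -> bool:
    """True if the expected value appears in the challenge TXT record for `domain`."""
    wanted = expected_txt_value(token, account).encode()
    records = lookup_txt(f"{CHALLENGE_LABEL}.{domain}")
    # Constant-time comparison against every TXT value returned.
    return any(hmac.compare_digest(wanted, record.encode()) for record in records)


# Demo against an in-memory "zone" standing in for real DNS.
token = new_challenge()
fake_zone = {
    f"{CHALLENGE_LABEL}.bigco.com": [expected_txt_value(token, "bigco-release-bot")],
}


def lookup(name: str) -> list:
    return fake_zone.get(name, [])


print(domain_is_controlled("bigco.com", token, "bigco-release-bot", lookup))  # True
print(domain_is_controlled("bigco.com", token, "randomdude", lookup))         # False
```

In a real deployment `lookup_txt` would be backed by an actual resolver, and the check would need to be repeated periodically, since domains change hands.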

I would need to spend money to purchase a domain and some kind of server before I can publish a python module? That doesn't seem right. And I presume I would need to keep paying for it as long as I want my modules available and verified. Attaching required monetary purchases to an open source ecosystem is not a good idea.
Supporting namespacing does not preclude keeping the old system too. Nor does it preclude a public repo namespace like org.pypi or whatever that lets people upload packages to the current repo using the system they currently have. Might help sort out some of the other packaging problems too - LWN had this the other day: https://lwn.net/SubscriberLink/923238/d48af5401c04db7d/ . Maybe it would help with the integrator notion (org.conda or whatever).

Depending on how something like this is implemented, maybe com.github could set it up to pull straight from the project repo.

Just because there are ways it could go poorly doesn't mean it will go poorly.

Well, in theory you could have a namespace schema that differentiates between user-submitted and organization-submitted packages such that randomdude's would appear as 'public.randomdude.aws' and organization-owned namespaces verified by a DNS record would appear as 'com.amazon.aws'
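
The split between user-submitted and organization-submitted names could be sketched roughly like this. The `Publisher` type, the `verified_domain` field, and the reserved `public` prefix are all assumptions for illustration; a real scheme would also have to guarantee that the reserved prefix never collides with an actual top-level domain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Publisher:
    username: str
    # Set only after the index has verified control of the domain (e.g. via a DNS record).
    verified_domain: Optional[str] = None


def namespace_for(publisher: Publisher) -> str:
    """Verified organizations get a reverse-DNS namespace; everyone else gets public.<user>."""
    if publisher.verified_domain:
        labels = publisher.verified_domain.lower().rstrip(".").split(".")
        return ".".join(reversed(labels))
    return f"public.{publisher.username.lower()}"


def full_name(publisher: Publisher, package: str) -> str:
    return f"{namespace_for(publisher)}.{package.lower()}"


print(full_name(Publisher("randomdude"), "aws"))            # public.randomdude.aws
print(full_name(Publisher("amazon", "amazon.com"), "aws"))  # com.amazon.aws
```
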
You could in principle do proof-of-ownership checks like Google does for things like Webmaster Tools, so you’d need to control a domain to have the corresponding namespace.
It's much easier to correct the ownership of a single namespace than N packages in the global namespace