by weinzierl 3696 days ago
> well, as long as the crates.io site still exists, but if that goes away so does the Cargo index

That makes me wonder: Is it easy or possible to replace crates.io with a self hosted repository?

The background to the question is that I know of a company where access to the standard public Maven repo is forbidden. They use a commercial repository provider, but I don't know whether it is hosted on premises.


It's not easy, but it's possible. Everything is open source, and Cargo can easily be pointed at whatever host you want.

The feature that I'd like to see but haven't found the time to implement is delegation: "look at this index first, but if it's not here, go look at this other one". Right now, if you spin up your own crates.io and point Cargo at it, it won't have any packages... which works for some people, but not others.
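
To make the delegation idea concrete, here is a minimal sketch of "first index that knows the crate wins". It is not how Cargo works today; the in-memory `Index` type, the `resolve` function, and the crate names are all hypothetical stand-ins, and a real index would live in a git repository or behind HTTP.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a registry index: crate name -> published versions.
# This is NOT Cargo's real index format, just enough to show the lookup order.
Index = Dict[str, List[str]]


def resolve(name: str, indexes: List[Tuple[str, Index]]) -> Optional[Tuple[str, List[str]]]:
    """Return (index_label, versions) from the first index that lists `name`.

    Indexes are consulted in order, so an earlier (e.g. private) index
    shadows a later (e.g. public) one. Returns None if no index has the crate.
    """
    for label, index in indexes:
        versions = index.get(name)
        if versions:
            return label, versions
    return None


# Made-up example data: the private index shadows the public one.
private_index: Index = {"acme-internal": ["0.1.0", "0.2.0"]}
public_index: Index = {"serde": ["1.0.0"], "acme-internal": ["9.9.9"]}
chain = [("private", private_index), ("public", public_index)]
```

With this ordering, `resolve("acme-internal", chain)` is answered by the private index and `resolve("serde", chain)` falls through to the public one. Dropping the second entry from `chain` gives the "no fallback" behaviour.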

> "look at this index first, but if it's not here, go look at this other one"

I agree that this would be very helpful for a lot of people, but it is kind of the opposite of what I was asking about.

As far as I understand it the commercial repos try to solve two main concerns:

1. license compliance

2. security

Better performance and reliability are just additional benefits.

I only know the details for a certain Fortune 500 company. They don't want the builds to fetch packages from a site they don't control, and they certainly don't want "if it's not here, go look at this other one". The idea is more control over where the packages come from, not more flexibility.

I think if Cargo doesn't provide a way to support alternative (possibly commercial) repos, it would be an obstacle to the adoption of Rust in the corporate world.

If you just want to ignore the broader OSS ecosystem, and run your own version of Crates.io behind a firewall, that's 100% supported today. The only real issue is that how to do so isn't particularly well documented.

Sorry for pestering again but I think this is kind of important and I haven't made myself entirely clear yet (English is not my first language).

In Maven the repo URL is configurable in settings.xml. This URL can be different for different departments or even different projects.

From what I see in the cargo source the crates.io URL is hard coded. So the DNS is the only level of redirection we have. Using varying IP addresses for crates.io for different departments or even projects wouldn't fly, at least not in the world I live in.

>ignore the broader OSS ecosystem

It's not about that either, because the commercial repos contain very much the same OSS packages as the standard repo but don't present all of them to everyone all the time. Take for example a car company: GPL3 for in-house projects that are used only by employees is discouraged but somewhat tolerated. GPL3 for projects that run in the car is a big no-no. You want to be certain that no dev ever introduces GPL3 source into anything that is in the car. You want your build to fail if any of the packages changes its license to GPL3. You want your build to fail if any of your packages has a known vulnerability.
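
As a rough illustration of the kind of gate I mean, here is a minimal sketch of a check that fails a build on a denied license or a known-vulnerable version. Everything in it is an assumption for the sake of the example: the `Package` record, the `check_policy` function, the license strings, and the idea that the repo hands you this metadata in such a simple form.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Package:
    # Hypothetical minimal metadata a repository might expose per package.
    name: str
    version: str
    license: str


class PolicyViolation(Exception):
    """Raised when at least one dependency breaks the project's policy."""


def check_policy(
    packages: Iterable[Package],
    denied_licenses: Set[str],
    known_vulnerable: Set[Tuple[str, str]],
) -> None:
    """Collect every violation and raise once; return None if all is clean."""
    problems = []
    for pkg in packages:
        if pkg.license in denied_licenses:
            problems.append(f"{pkg.name} {pkg.version}: license {pkg.license} is not allowed")
        if (pkg.name, pkg.version) in known_vulnerable:
            problems.append(f"{pkg.name} {pkg.version}: known vulnerability")
    if problems:
        raise PolicyViolation("; ".join(problems))


# Made-up dependency list for an "in the car" project.
deps = [
    Package("foo", "1.0.0", "MIT"),
    Package("bar", "2.1.0", "GPL-3.0"),
]
```

The point is that the denied set differs per project: the in-car build would pass `{"GPL-3.0"}` while an in-house tool might pass an empty set, which is why a single global repo URL is not enough.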

I know Cargo is not maven, but I believe this is a feature which is crucial for industry adoption. I think I will just add a feature request for this on GitHub.

  > Sorry for pestering again
No worries! This thread is a bit old but I'll try to pay attention to it.

  > From what I see in the cargo source the crates.io URL is hard coded.
It's not: http://doc.crates.io/config.html#configuration-keys

TL;DR:

  [registry]
  index = "URL_GOES_HERE"
and you're good.

  > It's not about that either because the commercial repos contain very
  > much the same OSS packages as the standard repo but don't present all 
  > of them to everyone all the time.
Ahh yeah. What I mean is, you'd have to set up the packages in that registry yourself. Which sounds like what they'd want to do, so seems fine.

What would be useful is a sort of "caching proxy" that could have various knobs to handle situations like:

- crates.io is down

- crates.io says this cached package doesn't exist.

- etc...

This is already possible today with existing caching proxies. This is a great way to make your CI/builds more reliable and quicker.
I meant a caching proxy that functions as a mini crates.io when the actual crates.io is down. Depending on the crates.io protocol, just caching HTTP requests might not be enough; a middleman that knows the protocol and can work offline would also open up other knobs (e.g. a configurable blacklist of packages).
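
Roughly what I have in mind, as a minimal sketch rather than anything that speaks the real crates.io protocol: a registry front end that caches what it has served, keeps serving it when upstream is down or claims the package is gone, and refuses blacklisted names. The `CachingRegistry` class, the `fetch_upstream` callable, and both exception types are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


class UpstreamDown(Exception):
    """The upstream registry could not be reached."""


class Blocked(Exception):
    """The requested package is on the local blacklist."""


class CachingRegistry:
    def __init__(
        self,
        fetch_upstream: Callable[[str], Optional[bytes]],
        blacklist: Iterable[str] = (),
    ) -> None:
        # fetch_upstream(name) returns the package bytes, returns None if
        # upstream says the package does not exist, or raises UpstreamDown.
        self._fetch = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self._blacklist = set(blacklist)

    def get(self, name: str) -> Optional[bytes]:
        # Knob 1: blacklisted packages are never served, cached or not.
        if name in self._blacklist:
            raise Blocked(name)
        try:
            data = self._fetch(name)
        except UpstreamDown:
            # Knob 2: upstream is down, so serve from cache if we can.
            if name in self._cache:
                return self._cache[name]
            raise
        if data is None:
            # Knob 3: upstream says the package is gone; keep serving the
            # cached copy if one exists, otherwise report "not found".
            return self._cache.get(name)
        self._cache[name] = data
        return data
```

Each commented branch is one of the knobs; a real proxy would presumably make the second and third configurable rather than hard-wiring "always serve the cached copy".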
What would be really nice is a system like the one Debian, Ubuntu, etc. use, which allows for official (and unofficial) mirrors.