Conversation
From the links on the respective websites.
force-pushed from 42aec4b to 8e6556d
“fetchurl: remove unused mirrors” should be trivial after the relevant PRs are merged. “fetchurl: remove most secondary mirrors” should be simple to review, since it only drops mirrors and clearly leaves canonical ones, unless you’re concerned that some of them may be out of date in a way that happens to preserve an old package. “fetchurl: remove unencrypted SageMath mirrors” similarly. “fetchurl: use working mirror for OSDN” should also not be too bad, since you can see that the canonical mirror is dead anyway (and there are barely any users). So it’s only the commits to use canonical mirrors that I can see causing an issue. I observed that the directory structure was clearly identical as far as I could see between the sites getting replaced, and that they’re what the relevant upstreams point to on their sites. In some cases there are even redirects. If there’s a specific change you’re worried about I can make suggestions on how to test it.
Some thoughts on why we should probably remove this mechanism entirely in the future in #409010 (comment), by the way.
My concern (and it may be that it's a silly one) is that some mirrors deliver different contents, and that I, the merger, need to be able to certify that this isn't the case, for each derivation which uses the mirror mechanism. Since these are FODs, I need to use …

I've been tracking the "remove the mirrors" discourse and agree with your take.
How does this apply to cases where we’re only removing mirrors? |
My reading of the code in |
They could be serving different files depending on Host:. I thought maybe pr might be for pre‐release. However, it seems like they both eventually redirect to prdownload.sourceforge.io. prdownloads.sourceforge.net takes one less redirect hop, but downloads.sourceforge.net feels more canonical, so I’ve gone with that.
In the browser, prdownload.sourceforge.io redirects to https://sourceforge.net/software/download-managers/, which is kinda weird IMO
You’re missing an s, is why :)
Uhh you said this just then:
However, it seems like they both eventually redirect to prdownload.sourceforge.io
So I assumed prdownload.sourceforge.io without the s also is some mirror... Sorry for the confusion.
This avoids synchronization issues and should cause limited load on the upstream servers due to cache.nixos.org FOD caching.
The Debian and OpenBSD mirrors are backed by the Fastly CDN; it looks like Canonical operate the Ubuntu one directly, but we don’t fetch many packages from it anyway.
Sadly their canonical upstream mirror doesn’t support HTTPS, but it looks like we fetch Sage itself from GitHub, and there aren’t many uses of these mirrors across the tree. They could probably be cleaned up so that this could be dropped.
Unencrypted, outdated, and broken mirrors are a general nuisance. Most canonical upstreams are behind CDNs these days, and even when they’re not, I think the amount of load caused by Nixpkgs downloads should be negligible. The source FODs are cached on cache.nixos.org, so most users never end up downloading from upstream mirrors at all. When they do, it’s often in a corporate environment avoiding the cache, where FTP mirrors can cause firewall issues. Many sources don’t use the mirrors mechanism at all, so such users will probably want to cache FODs via the usual Nix store mechanisms or set up a caching HTTPS proxy.

There’s just no excuse for unencrypted mirrors in general in 2025, especially when they’re only fallbacks. (Yes, a passive observer can probably deduce information about what files are being downloaded by the amount of data transmitted anyway, but HTTPS still avoids our first‐download TOFU process being MITM’d, and helps with dodgy middleboxes rewriting things.)

I am not sure whether the mirrors mechanism makes sense to keep at all, but I leave deciding that to future work.
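For readers unfamiliar with the mechanism under discussion: fetchurl expands a mirror://site/path pseudo-URL into a list of concrete candidate URLs using per-site mirror lists (they live in pkgs/build-support/fetchurl/mirrors.nix in Nixpkgs). A minimal Python model of that expansion, using an illustrative two-entry table rather than the real lists:

```python
# Illustrative subset of a mirror table; the real lists are kept in
# pkgs/build-support/fetchurl/mirrors.nix, and the entries here are
# examples, not the actual contents of that file.
MIRRORS = {
    "sourceforge": [
        "https://downloads.sourceforge.net/",
        "https://netcologne.dl.sourceforge.net/sourceforge/",
    ],
}

def expand_mirror_url(url, mirrors=MIRRORS):
    """Expand a mirror://site/path pseudo-URL into the concrete
    candidate URLs, in the order a fetcher would try them."""
    scheme, _, rest = url.partition("://")
    if scheme != "mirror":
        return [url]  # plain URLs pass through unchanged
    site, _, path = rest.partition("/")
    return [base + path for base in mirrors[site]]
```

Removing a secondary mirror then just shortens the candidate list a package’s fetch can fall back to; it doesn’t change which URL is tried first.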
force-pushed from 8e6556d to 29e3907
I guess I don’t really understand. We have multiple mirrors in the first place based on the idea that connections to one of the earlier ones might fail, so I don’t think “previously wasn’t used” is strictly correct. But this worry only applies to cases where we’re deleting ones other than the first mirror, right? In that case, are you worried about accidental broken links, or supply chain attacks? Everything pins a …
I guess it's not achievable as of now with the way …

Edit: maybe something simpler like …
Here's what I am (very slightly) worried about. Before this patch, assuming that the first mirror in the list returned the FOD-hash-appropriate contents, a … Let's use … as our example. So all I really know is whether https://dlcdn.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz works -- not whether ftp://ftp.funet.fi/pub/mirrors/apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz worked, let alone any of the others. After this patch, I'm checking to see whether https://downloads.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz works. The testing I want to perform is this: …
I'm willing to relax this and do a bunch of spot checks, then merge. But I want to be able to do the full thing.
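For what it's worth, a spot check like that is easy to script. Here's a hedged sketch (the helper names are made up, and the `opener` parameter is an assumption of mine so the comparison logic can be exercised without network access):

```python
import hashlib
import urllib.request

def sha256_of_url(url, opener=urllib.request.urlopen):
    """Fetch `url` and return the sha256 of its contents.
    `opener` is injectable so the logic can be tested offline."""
    with opener(url) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

def mirrors_agree(old_url, new_url, opener=urllib.request.urlopen):
    """Spot check for a mirror swap: True iff both URLs currently
    serve byte-identical contents."""
    return sha256_of_url(old_url, opener) == sha256_of_url(new_url, opener)
```

Something like `mirrors_agree("https://dlcdn.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz", "https://downloads.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz")` would perform the check described above, though it only tells you what each mirror serves at the moment you run it.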
(I am talking here only about dropping mirrors, not about the other changes to mirrors in some of the commits.) Semantically, I would not quite say that is true. You are checking the first one if it works. The only reason we have the backups is that it might not work. The mirror list only makes sense if we assume these properties:

- The first link may not always work (because of transient conditions on either side of the connection, or because a link went 404).
- Any links that do work will return the same hash.
Otherwise, if a link fails to work because of transient conditions, then you can fall back to a mirror that has the wrong hash. And therefore my response is: this problem already exists; it just might not be easy to observe. Note that the vast majority of these FODs will already be cached. Some of them will likely have already died upstream without anyone noticing yet. So we do already have that kind of problem before this PR. Anyway, let’s consider a few cases: …
So the only thing I really worry about here is (1), because (2) makes a pre‐existing problem strictly more exposed, (3) makes no sense to me, and (4) is good. But I also don’t think (1) is any worse than the general Nixpkgs link rot problem (that FOD caching protects somewhat against in practice), and I also think it’s less likely. That’s why I am only worried about changing mirrors, not removing them (as long as the ones being removed are non‐canonical and canonical ones remain). We already do not know for sure which mirror will be selected (that’s the whole point of the mechanism), so every problem removing mirrors creates already exists stochastically; the only model that makes sense for maintaining lists of mirrors implies that removing them is safe except for the (1) case; the FOD cache buffers against any problem in practice; and any potential changes seem to me either desirable or both uncommon and pedestrian in nature.
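To make the “you are only ever checking the first mirror that connects” point concrete, here is a small model of the semantics described above (a sketch of the behaviour under discussion, not the actual fetchurl builder): mirrors fall through only on connection failure, and the fixed-output hash is checked once on whatever was downloaded, so a reachable-but-stale mirror fails the build rather than falling through to the next one.

```python
import hashlib

def fetch_first_working(urls, expected_sha256, fetch):
    """Try mirrors in order, take the first successful download,
    and only then check the pinned hash. `fetch` is injected so
    the model can be exercised without network access."""
    for url in urls:
        try:
            data = fetch(url)
        except OSError:
            continue  # unreachable mirror / transient failure: try the next
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            # reachable but stale: the build fails here instead of
            # falling back to later mirrors
            raise RuntimeError(f"hash mismatch from {url}")
        return data
    raise RuntimeError("all mirrors unreachable")
```

Under this model, deleting a secondary mirror can only change the outcome in the run where an earlier mirror happened to be down, which is exactly the stochastic pre-existing exposure described above.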
Sure, I don’t oppose doing this, but I think it’ll be somewhat of a pain. There are 1124 files in … Perhaps you could edit …
Semantically, I would not quite say that is true. You are checking the first one if it works. The only reason we have the backups is that it might not work. The mirror list only makes sense if we assume these properties:
- The first link may not always work (because of transient conditions on either side of the connection, or because a link went 404).
- Any links that do work will return the same hash.
also:

- The user is running nix-build on a host from a country that a particular mirror does not want to work with, or the user has all of their traffic routed through Tor. In this case, default mirrors frequently refuse to work, and thus alternatives are very useful. In fact, I know for certain that the removal of the https://download.savannah.nongnu.org/releases/ line will break over-Tor builds for packages hosted on Savannah.
- When /etc/ssl/certs/ca-certificates.crt is outdated, like when updating a really old NixOS system to a new release, the HTTPS mirrors might not work. Thus, removal of unencrypted mirrors will break such updates (and tarballs.nixos.org won't help!).
Therefore, I approve of the changes that remove dead URLs and fix and add more canonical mirrors. But I disapprove of other removals.
Whether the canonical mirror or the dispatcher, when available, should be the first element in the list is unclear to me. I assume that the dispatchers going first has a technical reason and I would be uncomfortable with changing it without knowing the effects of such a change.
Well, sure. But countries have blocked GitHub in the past, as well, and Tor is a different case, since it is of course one of the primary ways of obtaining less filtered internet access. However, I just successfully downloaded https://download-mirror.savannah.gnu.org/releases/freetype/freetype-2.13.3.tar.xz over Tor, so if there was any issue it seems to have been resolved? (And of course there’s no guarantee that the redirector wouldn’t point you to a Tor‐blocking mirror anyway.)
That’s just unreasonable. There are many critical packages in Nixpkgs that have sources specified solely with HTTPS URLs. Getting an up‐to‐date certificates list is one of the first steps in reviving such an ancient system. I cannot imagine any situation where working around it by randomly having HTTP and FTP URLs for some small semi‐random subset of packages solves this problem for anyone, since they will inevitably have to deal with HTTPS anyway.
Thus, I don't see how the above is an argument for removing mirrors or why removal of working mirrors could be a good thing.
Well, a Tor exit node is a bit of a lottery, isn't it? Maybe I simply get unlucky, but on some of my build machines builds fail all the time because …
Making updates to outdated systems easier does not seem unreasonable to me. In fact, the fact that … Also, let's not forget …
Also, philosophically, I disagree with #409010 (comment) and #409010 (comment). The fact that CDNs are common in 2025 is not an argument for making Nixpkgs less resilient to censorship and networking issues. IMHO, with CDNs and GitHub centralization, we should be adding more mirrors and implementing the ability to specify alternative non-GitHub … The fact that several companies can single-handedly prevent you from updating your machines does not seem that nice of a thing to me.
I think that, if your system is that horribly out of date (I don't know exactly how old it has to be, but I assume over 5 years or so for it to really matter a lot), you should've updated way earlier, or just "re-bootstrap" it:
Also, how can building such a basic security-critical package be done without working certs, like, at all? If you want to fall back to HTTP instead, you are ultimately trusting that you are lucky and you didn't get MITM'd (there is no way to know if you have bogus certs).

Edit: I agree with you on the Tor stuff, though. No reason to make nixpkgs explicitly support "less of it"...
@dtomvan Re updates: HTTP does not hurt there; after all, the hashes get checked anyway. Last time I needed to do it, I did … Which is simpler, IMHO. But it would have been nicer if I could simply …
I agree with removing a lot of the insecure/unmaintained/FTP mirrors, but I heavily disagree with removing mirror sites that offer more global coverage outside of the Western world, let alone removing the mirror mechanism entirely.
Sure, mirror sites may be outdated and even questionable in terms of security and provenance, but the reality is that for a lot of us, it's to either use these mirror sites, or stare at a 300 kB/s connection for hours on end. Heck, when I'm back in China, I'm lucky to get a 200 kB/s connection to c.n.o, which makes Nix essentially unusable - in theory, I should be served by a nearby CDN, but in reality that means it's somewhere in the western US and the connection has to go around half of the planet with dozens of chokepoints along the way. The only way for me to use Nix with a reasonable connection speed is to use mirror sites provided by the likes of the University of Science and Technology of China or Tsinghua University, and it's precisely these sites that are on the chopping block of this PR.
I appreciate the idealism, but pragmatically this is a no for me. Sure, you can say that you need to connect to github.com in order to use 99% of the packages on Nixpkgs anyway, but that to me is in itself a big problem - sometimes the government itself blocks access to GitHub, and sometimes GitHub even blocks access to Chinese users themselves, like what happened on April 13th this year, causing mass panic on social media as we are once again reminded how crucial GitHub is for software development, and how unreliable it can be. Almost no one can do our damn jobs when GitHub is offline. Sure, they may claim it to be an outage and a mistake, but these incidents really serve to remind us that we really ought not to put all our eggs in the same basket.
We need to preserve Nixpkgs's ability to use alternate sources for our software, because this is what happens when you rely on a monopoly. The response to being entrapped in a monopolistic system cannot be to allow the monopoly to grow even larger. I'm sure many developers from the terribly-named "third world" like the Indian subcontinent, Southeast Asia, Africa and more would share a similar sentiment.
EDIT(XII. '25): Some more catastrophic outages and corporate homogenizations and hegemonizations later, there's no way that our current dependence on a single-source-of-truth model would ever work for a colossus like Nixpkgs itself, let alone the various pieces of software within it. I still stand by my original comment — yes, dead and insecure mirrors should be culled, but we need to also find and help maintain other mirrors to replace them.
"https://netcologne.dl.sourceforge.net/sourceforge/"
"https://versaweb.dl.sourceforge.net/sourceforge/"
"https://freefr.dl.sourceforge.net/sourceforge/"
"https://osdn.dl.sourceforge.net/sourceforge/"

Depends on #409063 and #409108.
cc @alyssais who had opinions on Hydra FOD caching so may have opinions on this (though I think even if Hydra didn’t cache FODs this change would be fine, as it only applies to a relatively small subset of sources anyway, and the binary packages that most users download would still be cached)
cc @jherland