Building on Ubuntu 21.10 failing to install postfix#2468

Merged
NorseGaud merged 3 commits into master from issues/2467
Mar 18, 2022
Conversation

@NorseGaud
Member

@NorseGaud NorseGaud commented Mar 6, 2022

Description

Fixes #2467

This is a workaround until postfix gets this bug fixed.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (non-breaking change that does improve existing functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@github-actions
Contributor

github-actions Bot commented Mar 6, 2022

Documentation preview for this PR is ready! 🎉

Built with commit: 8d16056

@NorseGaud
Member Author

Original issue: #2023

@NorseGaud NorseGaud force-pushed the issues/2467 branch 5 times, most recently from 0253ee1 to 9f2c38e Compare March 7, 2022 00:24
Member

@polarathene polarathene left a comment


Please update the issue reference to a relevant comment and use a reserved TLD (.invalid) instead of an unregistered one.

Comment thread Dockerfile Outdated
Comment thread Dockerfile Outdated
Comment thread Dockerfile
@NorseGaud NorseGaud force-pushed the issues/2467 branch 3 times, most recently from 429914b to d3ec842 Compare March 7, 2022 00:55
Comment thread Dockerfile Outdated
@NorseGaud NorseGaud requested a review from a team March 7, 2022 01:00
@NorseGaud
Member Author

To the team: Do we really want to have a workaround like this? I prefer ubuntu server, but I could switch to Debian I guess...

@polarathene
Member

polarathene commented Mar 7, 2022

FWIW, it appears there's another option available now.

If using BuildKit with an ENV to provide a temporary hostname to the image builder.. however comments near the end also mention it being less straight-forward to use than docker build and a possible cache issue (I've not tried this approach myself): moby/buildkit#2373

@polarathene
Member

I prefer ubuntu server, but I could switch to Debian I guess...

It's a VM right? The docker host side change is fairly simple to add and for a throwaway VM not really an issue.

I don't think the workaround this PR uses is going to cause any issues, so long as something else isn't trying to use hostname differently. Regardless, since it's only for the build phase, the hostname is a temporary hexadecimal ID, so shouldn't make any real difference at container run-time?

Debian will presumably run into the same problem by its next release, or they fix their postfix package. I wouldn't mind us using a newer version of Postfix instead of the package either, if someone wants to add that (good if we adopt a multi-stage build?).

@NorseGaud
Member Author

FWIW, it appears there's another option available now.

If using BuildKit with an ENV to provide a temporary hostname to the image builder.. however comments near the end also mention it being less straight-forward to use than docker build and a possible cache issue (I've not tried this approach myself): moby/buildkit#2373

Yep, I tried it. It doesn't work. Postfix still uses localhost for myhostname (the /etc/hosts has 127.0.0.1. localhost dms.invalid). :(

@NorseGaud
Member Author

NorseGaud commented Mar 7, 2022

and for a throwaway VM not really an issue.

gasp, what did you call my special VM!? ;) I plan on keeping this VM around for any future changes I make but that's a fair point.

@casperklein
Member

gasp, what did you call my special VM!? ;) I plan on keeping this VM around for any future changes

You mentioned in another issue that you mount DMS into your VM (which introduces permission problems). You could place DMS directly inside your VM. If you are using VS Code, take a look at remote-ssh. With that, you can easily work with the files inside your VM.

@NorseGaud NorseGaud requested a review from a team March 15, 2022 14:18
@NorseGaud NorseGaud marked this pull request as ready for review March 15, 2022 14:18
@casperklein
Member

To the team: Do we really want to have a workaround like this? I prefer ubuntu server, but I could switch to Debian I guess...

For me it's a bit like the macOS-specific hacks. We have a supported base, in this case Debian. If it also works on other OSes, fine, but I am unsure whether we should take care of it (by introducing workarounds).

Comment thread Dockerfile
apt-get -qq dist-upgrade && \
echo "applying workaround for ubuntu/postfix bug described in https://github.com/docker-mailserver/docker-mailserver/issues/2023#issuecomment-855326403" && \
mv /bin/hostname{,.bak} && \
echo "echo docker-mailserver.invalid" > /bin/hostname && \
Member

@casperklein casperklein Mar 15, 2022


Have you tested this? I don't think this will work without a shebang line.

Suggested change
echo "echo docker-mailserver.invalid" > /bin/hostname && \
echo -e "#!/bin/bash\necho docker-mailserver.invalid" > /bin/hostname && \
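
The shim under review can be tried outside the image build. The sketch below is a minimal illustration, not the merged Dockerfile: it works in a scratch directory instead of /bin, uses a fake `hostname` script as a stand-in for the real binary, and the `chmod` and restore steps are assumptions that do not appear in the excerpt above.

```shell
#!/bin/sh
# Sketch of the hostname shim, run against a scratch directory rather than /bin.
set -eu

SCRATCH="$(mktemp -d)"

# Stand-in for the real /bin/hostname (hypothetical output).
printf '#!/bin/sh\necho real-host\n' > "$SCRATCH/hostname"
chmod +x "$SCRATCH/hostname"

# Swap in the shim, as the Dockerfile does with `mv /bin/hostname{,.bak}`.
# Brace expansion is a bash feature, so the move is spelled out for plain sh.
mv "$SCRATCH/hostname" "$SCRATCH/hostname.bak"
printf '#!/bin/sh\necho docker-mailserver.invalid\n' > "$SCRATCH/hostname"
chmod +x "$SCRATCH/hostname"   # assumption: the shim has to be made executable

SHIM_OUTPUT="$("$SCRATCH/hostname")"
echo "while shimmed: $SHIM_OUTPUT"

# Assumed restore step once the package install has finished.
mv "$SCRATCH/hostname.bak" "$SCRATCH/hostname"
RESTORED_OUTPUT="$("$SCRATCH/hostname")"
echo "after restore: $RESTORED_OUTPUT"

rm -rf "$SCRATCH"
```

Using `printf` here sidesteps the question of whether `echo -e` interprets `\n` in the shell that runs the build step.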

@polarathene
Member

For me it's a bit like the macOS-specific hacks. We have a supported base, in this case Debian. If it also works on other OSes, fine, but I am unsure whether we should take care of it (by introducing workarounds).

I'm happy to just go with my original proposal of documenting it. I imagine most aren't regularly building the Docker image, and if they need to they could make the host side workaround I've detailed, or build the image within a Debian guest VM (Ubuntu 20.04 should work too AFAIK since that's what our CI is using, but newer versions of Ubuntu won't).

The next Debian release isn't due until sometime in 2023 apparently, so it may be a while before it affects Debian and they fix the package patches on their end. The workaround in this PR isn't much to worry about either though AFAIK. We have other options too but those require more effort on our end (such as building Postfix ourselves).

@polarathene
Member

Just chiming in here as I'm building on Ubuntu 21.10 too, without problems.

You would only have problems if you did not have that search line, as my solution in the original issue detailed for fixing on the host instead of within the Dockerfile.

@georglauterbach
Member

georglauterbach commented Mar 16, 2022

Just chiming in here as I'm building on Ubuntu 21.10 too, without problems.

You would only have problems if you did not have that search line, as my solution in the original issue detailed for fixing on the host instead of within the Dockerfile.

I see. I think there are actually many contributors to this project who use Ubuntu. Therefore, I'd vote for merging this (although it is hacky). We should also document it and add a note about reverting this PR once there are upstream changes that fix this...

@casperklein
Member

Host issues should be fixed on the host IMO and we should keep our code base free of workarounds. Therefore I vote against this addition. But feel free to outvote me - it's fine 😄

@polarathene
Member

polarathene commented Mar 16, 2022

Host issues should be fixed on the host IMO

It's actually an issue with the Debian Postfix package post-install step making assumptions. If we compiled Postfix ourselves instead of using the package, or patched the package before installing it, we wouldn't have the issue either.

The only reason I am in more support of a workaround in the Dockerfile is because that's more of a viable solution than requiring affected users to modify their DNS resolution on host. It's probably unlikely when working with Docker, but in some environments the user may have the ability to run docker-mailserver image, but not modify the Docker host as much, like our CI for example.

@casperklein
Member

Convinced 👍

PS: So if I understand correctly, we would have the same problem when our CI upgrades to a newer Ubuntu version, right?

@polarathene
Member

polarathene commented Mar 16, 2022

when our CI upgrades to a newer Ubuntu version, right?

Possibly. It would depend on whether GitHub CI instances have the local DNS resolver (/etc/resolv.conf) configured with a search field like @georglauterbach showed above. If that is missing, then the image build will fail, as the package post-install step fails to resolve a DNS query IIRC (one that's not actually of any real importance to us).

@NorseGaud
Member Author

@georglauterbach do I need to change the base branch since this is slotted for 11.0? I'm moving over the next week so my replies will be delayed. Feel free to merge it once approved!

@casperklein
Member

Can anyone explain, why adding search . is a good idea? I couldn't find anything about this. I thought to find something, if this is common best-practice.

@georglauterbach
Member

@NorseGaud feel free to just merge this into `master` once approved :)

@georglauterbach
Member

@casperklein does this article help? I guess search . just "keeps" the address and makes it an "FQDN" instead of appending more...

@casperklein
Member

I saw this too, but I found nothing about adding search . being a best practice.

@polarathene
Member

I couldn't find anything about this. I thought to find something, if this is common best-practice.

A DNS search domain is also different. When you query a name that is only a single label — a domain without any dots — a search domain gets appended to your query.

For example, because I’m currently connected to my Red Hat VPN, I have a search domain configured for redhat.com. This means that if I make a query to a domain that is only a single label, redhat.com will be appended to the query.

For example, I can query bugzilla and this will be treated as a query for bugzilla.redhat.com. This probably won’t work in your web browser, because web browsers like to convert single-label domains into web searches, but it does work at the DNS level.

Source

Search domain means the domain that will be automatically appended when you only use the hostname for a particular host or computer. This is basically used in a local network.

Lets say you have a domain name like xyz.com (it may be available globally or may be local only) and you have 100 computers in the LAN. Now you want this domain name to be automatically appended when you look for any computer by just hostname of the computer.

So, what search domain is doing in our case is that it is automatically appending a domain name to make it a FQDN when we are just using the hostname to look up a computer.

Source

A single DNS label is not valid for DNS query, only mDNS.
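
To make the quoted explanations concrete, here is a deliberately simplified model of search-domain expansion. It handles single-label names only and ignores `ndots`, timeouts, and every other resolver option, so treat it as an illustration rather than a description of how glibc or systemd-resolved actually behave. The function name `expand_name` is made up for this sketch.

```shell
#!/bin/sh
# Simplified model of search-domain expansion for single-label names only.
set -eu

# expand_name NAME SEARCH_DOMAIN...
# Prints one fully qualified candidate name per search domain.
expand_name() {
    name="$1"
    shift
    for domain in "$@"; do
        if [ "$domain" = "." ]; then
            # `search .` appends only the root, so the label stays as it is.
            printf '%s.\n' "$name"
        else
            printf '%s.%s.\n' "$name" "$domain"
        fi
    done
}

WITH_DOMAIN="$(expand_name bugzilla redhat.com)"
WITH_ROOT="$(expand_name bugzilla .)"
echo "search redhat.com -> $WITH_DOMAIN"
echo "search .          -> $WITH_ROOT"
```

Under this model, `search .` leaves a single-label name without any real domain appended.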


Oh you were specifically asking about search ... 😅

My linked issue about it references the associated systemd update changelog, which describes why they add search . by default under some conditions (so the problem had more to do with that specific line than with the omission of search, my bad). In an earlier message on that issue thread, in a collapsed details section, I also covered what was going on with the Postfix post-install step, which results in an invalid FQDN value being used to configure Postfix.

search . is not best practice per se AFAIK, but systemd will add it to adjust the behaviour of nss-dns which otherwise implicitly uses the hostname as a search domain apparently?

@georglauterbach
Member

Now I see the error on one of my devices as well, pretty annoying...

@casperklein
Member

casperklein commented Mar 18, 2022

I thought systemd would just add the search option to /etc/resolv.conf on boot if it does not exist. Therefore a container does have the same /etc/resolv.conf as the host.

But even if you use a custom /etc/resolv.conf without a search option on the host (not generated by systemd on boot), systemd still adds search . to the container's /etc/resolv.conf 🤦‍♂️

root@ubuntu2110:/# grep search /etc/resolv.conf && \
                                  echo --- && \
                                  docker run -it debian:11-slim grep search /etc/resolv.conf
#search foo.bar
---
search .
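
The same comparison can be scripted so that commented-out lines are not mistaken for active ones. This is a minimal sketch: the helper name `search_domains`, the scratch files, and the `192.0.2.53` nameserver (a documentation-only address) are all made up, and the two files merely mimic the host and container output shown above.

```shell
#!/bin/sh
# Report the domains of active `search` lines in a resolv.conf-style file.
set -eu

# search_domains FILE: prints each domain of every uncommented `search` line.
search_domains() {
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

SCRATCH="$(mktemp -d)"
# Mimics the host file above: the search line is commented out.
printf '#search foo.bar\nnameserver 192.0.2.53\n' > "$SCRATCH/host.conf"
# Mimics the container file above: `search .` is active.
printf 'search .\nnameserver 192.0.2.53\n' > "$SCRATCH/container.conf"

HOST_SEARCH="$(search_domains "$SCRATCH/host.conf")"
CONTAINER_SEARCH="$(search_domains "$SCRATCH/container.conf")"
echo "host search domains:      '${HOST_SEARCH}'"
echo "container search domains: '${CONTAINER_SEARCH}'"

rm -rf "$SCRATCH"
```

Note that a plain `grep search` also matches the commented `#search foo.bar` line, which is why the host output above still prints something.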

systemd: Just mapping /etc/resolv.conf 1:1 to the container would be too easy.. Let's fuck it up, despite what the user actually wants 🖕

@NorseGaud NorseGaud merged commit 1f174ce into master Mar 18, 2022
@NorseGaud NorseGaud deleted the issues/2467 branch March 18, 2022 17:07
@casperklein
Member

It seems, if the shebang line in a script is missing, there is a fallback to the system's default shell. Not a clean solution IMO, but seems to work.
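
The fallback can be reproduced in isolation. This is a small sketch in a scratch directory, with a made-up file standing in for the shim. As far as I know, the fallback is performed by the calling shell (and by `execlp`/`execvp`-style library calls) after the kernel refuses the file, so a program that calls `execve` directly would not get it.

```shell
#!/bin/sh
# Demonstrates running an executable script that has no shebang line.
set -eu

SCRATCH="$(mktemp -d)"
printf 'echo docker-mailserver.invalid\n' > "$SCRATCH/hostname"   # no shebang line
chmod +x "$SCRATCH/hostname"

# The kernel rejects the file as a binary; the calling shell then interprets it itself.
NO_SHEBANG_OUTPUT="$("$SCRATCH/hostname")"
echo "without shebang: $NO_SHEBANG_OUTPUT"

rm -rf "$SCRATCH"
```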

@polarathene
Member

Therefore a container does have the same /etc/resolv.conf as the host.

If you want that, you can do so with host mode networking. systemd should not be doing anything to the container's config from the host; that's probably Docker meddling with it.

@casperklein
Member

casperklein commented Mar 20, 2022

Sorry for asking again. But you said that the addition of search . to the container's /etc/resolv.conf is a systemd (new version) thing? But then you said it may be a Docker thing? I used the same latest Docker version on all test systems.

Let me summarize it:

| OS | Docker | systemd | Host resolv.conf | Container resolv.conf |
| --- | --- | --- | --- | --- |
| Ubuntu LTS | latest | old | No search . is added | search . is not present |
| Debian 11 | latest | old | No search . is added | search . is not present |
| Ubuntu 2110 | latest | new | search . is added | search . is present |
| Ubuntu 2110 (custom resolv.conf without search ., not auto generated) | latest | new | No search . is added | search . is present |

I'm just trying to finally understand who the bad guy is that does weird things. From the table above, it looks like it's systemd?

@polarathene
Member

polarathene commented Mar 20, 2022

But then you said, it may be a docker thing?

I said systemd only modifies the host system config; it will not interfere with the containers directly. The container may "import" the host config, especially with host mode networking; otherwise the Docker daemon is doing its own modification separately (like it does when setting the hostname to the container name/ID).

The newer version of systemd as per the changelog adds search . if needed to prevent implicit nss-dns fallback behaviour. This should only be happening on the host though, so if it only happens in the container, it's probably something going on with Docker.


With your custom change on host, did you just modify the file, or configure it so systemd was re-generating resolv.conf without it? Perhaps without making systemd aware of it, Docker is sourcing the resolv.conf to use in the container differently?

If you use host networking, perhaps it will use the modified one on your host. I just recall systemd adding a comment about resolv.conf being generated by systemd's own config, and that my workaround, which changed the search domain (at /etc/systemd/resolved.conf), required restarting the relevant systemd service, systemd-resolved.

If you apply a change to the search domain like that, it probably carries over into the guest container. Perhaps there's also a way to prevent injecting the search . line without providing an alternative search domain; I just didn't bother, since test is reserved and won't go to a public DNS resolver anyway.
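
For reference, the host-side change described here could look roughly like the following. This is a hedged sketch, not the exact workaround from the linked issue: it writes to a scratch directory standing in for a systemd-resolved drop-in directory, the file name `search-domain.conf` is hypothetical, and `Domains=test` is inferred from the mention of the reserved `test` domain above. Nothing is installed and no service is restarted.

```shell
#!/bin/sh
# Sketch: generate a systemd-resolved drop-in that sets a search domain.
set -eu

# Scratch stand-in for /etc/systemd/resolved.conf.d (assumed drop-in location).
DROPIN_DIR="$(mktemp -d)"
DROPIN_FILE="$DROPIN_DIR/search-domain.conf"   # hypothetical file name

cat > "$DROPIN_FILE" <<'EOF'
[Resolve]
Domains=test
EOF

DOMAINS_LINE="$(grep '^Domains=' "$DROPIN_FILE")"
echo "wrote: $DOMAINS_LINE"

# On a real host the setting would go into the resolved configuration,
# followed by a restart of the service mentioned above.
echo "next step (not run here): systemctl restart systemd-resolved"

rm -rf "$DROPIN_DIR"
```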

@casperklein
Member

I didn't use host networking, just the default network mode docker provides.

did you just modify the file, or configure it so systemd was re-generating [..]

I've already killed the test VM, so I cannot take a detailed look. But /etc/resolv.conf was symlinked to systemd's autogenerated file (which was in some systemd directory). I've deleted the symlink and created my own resolv.conf. I verified with dig, that the host uses resolv.conf created by me.

Maybe Docker doesn't care about /etc/resolv.conf and uses systemd's autogenerated file directly?

Enough time invested 😆 Thanks for your help.

@polarathene polarathene mentioned this pull request Aug 31, 2022
7 tasks
@polarathene
Member

FWIW, it appears there's another option available now.
If using BuildKit with an ENV to provide a temporary hostname to the image builder.. however comments near the end also mention it being less straight-forward to use than docker build and a possible cache issue (I've not tried this approach myself): moby/buildkit#2373

Yep, I tried it. It doesn't work. Postfix still uses localhost for myhostname (the /etc/hosts has 127.0.0.1. localhost dms.invalid). :(

Recently looked into this and verified it works:

  • BUILDKIT_SANDBOX_HOSTNAME is a --build-arg, not ENV for the command 😅
  • # syntax=docker/dockerfile:1 must be added to the Dockerfile to enable the support. The feature is part of the 1.4 Dockerfile syntax directive that requires BuildKit 0.10.0 (March 2022) or newer.
  • Usage: docker buildx build --build-arg BUILDKIT_SANDBOX_HOSTNAME=docker.invalid --tag test-image .
    • The hostname command will now output docker.invalid instead of buildkitsandbox (default for BuildKit, or without BuildKit a hexadecimal ID)
    • The /etc/hostname content is unaffected, though (BuildKit defaults to localhost; without BuildKit it is a different hexadecimal ID).
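
The points above can be put together into a small reproduction. This sketch only generates a throwaway Dockerfile and prints the build command from the list above without running it. The `debian:11-slim` base image and the `RUN hostname` step are assumptions chosen for illustration, not part of the docker-mailserver Dockerfile.

```shell
#!/bin/sh
# Sketch: prepare a minimal build context for trying BUILDKIT_SANDBOX_HOSTNAME.
set -eu

WORKDIR="$(mktemp -d)"

# The syntax directive has to be the first line of the Dockerfile.
cat > "$WORKDIR/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM debian:11-slim
RUN hostname
EOF

FIRST_LINE="$(head -n 1 "$WORKDIR/Dockerfile")"
BUILD_CMD="docker buildx build --build-arg BUILDKIT_SANDBOX_HOSTNAME=docker.invalid --tag test-image $WORKDIR"

echo "Dockerfile starts with: $FIRST_LINE"
echo "to build, run: $BUILD_CMD"
```

The scratch directory is left in place on purpose so that the printed command can be run by hand.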


Development

Successfully merging this pull request may close these issues.

[BUG] Building on Ubuntu 21.10 failing to install postfix

4 participants