Skip to content

Improved logic to handle HOSTNAME and DOMAINNAME + Improving handling of Letsencrypt changes + build docs WARNINGs should throw exit code#2245

Closed
NorseGaud wants to merge 56 commits intodocker-mailserver:masterfrom
NorseGaud:hostname-domain-and-ssl-issues
Closed

Improved logic to handle HOSTNAME and DOMAINNAME + Improving handling of Letsencrypt changes + build docs WARNINGs should throw exit code#2245
NorseGaud wants to merge 56 commits intodocker-mailserver:masterfrom
NorseGaud:hostname-domain-and-ssl-issues

Conversation

@NorseGaud
Copy link
Copy Markdown
Member

@NorseGaud NorseGaud commented Oct 9, 2021

Description

Original PR that may have broken this: https://github.com/docker-mailserver/docker-mailserver/pull/2175/files

In #2239, the user has:

hostname: mail
domainname: somedomain.com

environment:
- SSL_TYPE=letsencrypt
- SSL_DOMAIN=mail.somedomain.com

And the certificate was triggering changedetector: [[ INF ]] Cert found in /etc/letsencrypt/acme.json for mail.somedomain.com

Yet, they had two different directories under /etc/letsencrypt:

root@mail:/etc/letsencrypt# ls -al live/
total 16
drwxr-xr-x 4 root root 4096 Oct  6 10:20 .
drwxr-xr-x 3 root root 4096 Oct  5 20:15 ..
drwxr-xr-x 2 root root 4096 Oct  6 10:20 mail
drwxr-xr-x 2 root root 4096 Oct  5 20:15 mail.somedomain.com

They indicated that they found a solution by:

I copied the files from mail to mail.somedomain.com and restarted postfix/dovecot via supervisorctl restart dovecot / postfix. The renewed certificate is in use now

This means that mail is getting the renewal (_extract_certs_from_acme), but mail.somedomain.com is being watched.

The user has explicitly set SSL_DOMAIN to mail.somedomain.com and yet the extraction seems to be happening into a folder named mail. Under the setup-stack.sh I see:

      if [[ -f /etc/letsencrypt/acme.json ]]
      then
        if ! _extract_certs_from_acme "${SSL_DOMAIN}"
        then
          if ! _extract_certs_from_acme "${HOSTNAME}"
          then
            _extract_certs_from_acme "${DMS_DOMAIN}"
          fi
        fi
      fi

      # first determine the letsencrypt domain by checking both the full hostname or just the domainname if a SAN is used in the cert
      if [[ -e /etc/letsencrypt/live/${HOSTNAME}/fullchain.pem ]]
      then
        LETSENCRYPT_DOMAIN=${HOSTNAME}
      elif [[ -e /etc/letsencrypt/live/${DMS_DOMAIN}/fullchain.pem ]]
      then
        LETSENCRYPT_DOMAIN=${DMS_DOMAIN}
      else
        _notify 'err' "Cannot access '/etc/letsencrypt/live/${HOSTNAME}/fullchain.pem' or '/etc/letsencrypt/live/${DMS_DOMAIN}/fullchain.pem'"
        return 1
      fi

There are a few problems here:

  1. Extraction of their acme is happening into mail and not mail.somedomain.com
  2. The if statements after that don't set LETSENCRYPT_DOMAIN to SSL_DOMAIN

For 1, we see [[ INF ]] Cert found in /etc/letsencrypt/acme.json for mail.somedomain.com, so that's not the issue. HOWEVER, the code shows that ${1} is being used instead of ${HOSTNAME} so we aren't seeing accurately what is being used:

  if [[ -n "${KEY}${CERT}" ]]
  then
    mkdir -p "/etc/letsencrypt/live/${HOSTNAME}/"

    echo "${KEY}" | base64 -d >/etc/letsencrypt/live/"${HOSTNAME}"/key.pem || exit 1
    echo "${CERT}" | base64 -d >/etc/letsencrypt/live/"${HOSTNAME}"/fullchain.pem || exit 1
    _notify 'inf' "Cert found in /etc/letsencrypt/acme.json for ${1}"

To fix this, we shouldn't be using HOSTNAME and instead use ${1}.

For 2, I have added the required code


It also looked as if we were testing acme with a "changed" cert that was for example.com and not mail.my-domain.com. I'm unsure how this worked... I've expanded the tests to test the regular situation of mai.my-domain.com and then also the wildcard acme for example.com.


Seems as if the docs generation script wasn't properly exiting on WARNING of broken links. I fixed that too.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (non-breaking change that does improve existing functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@NorseGaud NorseGaud changed the title Hostname domain and ssl issues DOMAINNAME is not set properly + Improving handling of Letsencrypt changes Oct 9, 2021
Comment thread target/scripts/postfix-wrapper.sh Outdated
@polarathene
Copy link
Copy Markdown
Member

polarathene commented Oct 9, 2021

UPDATE: Misunderstanding on my part, despite my recall ,DOMAINNAME doesn't show up as a system ENV 😬


Maybe I don't understand this, but the old DOMAINNAME we're creating here is wrong

Just in-case anyone isn't aware... HOSTNAME and DOMAINNAME are system ENV, not custom ENV that we define. Docker additionally supports both as options for the CLI or docker-compose config, but has a spotty history of how they were handled (it's changed over time and at one point IIRC CLI and docker-compose behaved differently).

Our tests also I think only set hostname while docker-compose configs would set both, plus we have a separate ENV introduced to further mess around with that, in addition to any other related logic in our startup scripts. There's also different ways for software to check the hostname and domainname, several CLI utilities that output different FQDN splits to HOSTNAME and DOMAINNAME, and a kernel interface. The outputs also can differ across different distro base images (eg Alpine vs Debian), and other configurations with docker (user-ns causes issues IIRC).

So I'm not surprised if reliance on these values may be wonky 😅

Copy link
Copy Markdown
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this being another PR about issues with HOSTNAME/DOMAINNAME highlights the concern to properly review the usage is correct.

I'm not particularly fond of band-aiding such issues regularly, especially if they're not well understood and the fix is accepted on the merits of "well it resolves problem X for me".

When contributing fixes such as these, a related test that would fail prior and pass with the fix would at least be a bit helpful in mapping out the problems experienced.

Our test suite does not presently encounter the issue that has been reported for which this PR is intended to fix. In this case there appears to be a few variants of inputs to test against, I don't think DOMAINNAME looks like the appropriate fix.

That issue is due to HOSTNAME being a single label, not a FQDN, preventing it from matching the correct letsencrypt folder to monitor for changes?

Comment thread target/scripts/helper-functions.sh Outdated
Comment thread target/scripts/helper-functions.sh Outdated
Comment thread target/scripts/helper-functions.sh
Comment thread target/scripts/helper-functions.sh
@NorseGaud NorseGaud marked this pull request as draft October 10, 2021 00:33
@NorseGaud

This comment has been minimized.

@NorseGaud
Copy link
Copy Markdown
Member Author

Anyone know why we set hostname in the docker-compose.yml? The help output for docker run says:

      --domainname string              Container NIS domain name

Do our users even use NIS?

@polarathene
Copy link
Copy Markdown
Member

Do our users even use NIS?

It's legacy AFAIK, intended for pre-LDAP. Would have to do a bit of time traveling via git blame to identify when this was introduced.

I assume the original project author went with it, he made quite a few assumptions without properly understanding the choices I think, other than they seemed to work and that was good enough at the time.

I'm inclined to think that we don't need it for proper functionality, since AFAIK all or the majority of our tests only use hostname option in Docker CLI args, as originally hostname was the only one supported, and the CLI did some pre-processing or something to split it, again my recall on the history here is hazy.

It was probably misinterpreted and confused with a regular domain name. That said, Postfix or similar may implicitly leverage it as a default (pretty sure it does with HOSTNAME ENV or other location). I think if our tests would pass without setting domainname then it's probably not needed.

Docker should be taking the hostname and splitting for /etc/hosts AFAIK, if not it may be adding domainname in there to get a FQDN and assign hostname (or it's short hostname) as an alias. NSS and w/e else then resolves FQDN and host name/alias from that.

Comment thread docs/content/config/security/ssl.md Outdated
Comment thread target/bin/open-dkim Outdated
Comment thread target/scripts/check-for-changes.sh Outdated
Comment thread target/scripts/helper-functions.sh Outdated
@NorseGaud NorseGaud changed the title DOMAINNAME is not set properly + Improving handling of Letsencrypt changes Improved logic to handle HOSTNAME and DOMAINNAME + Improving handling of Letsencrypt changes Oct 11, 2021
…domain.sld.tld or only sld.tld

- look for all certs that have changed
- support only hostname (no domainname/subdomain)
- note about domain only for mailserver
- test fixes/improvements for new hostname logic
Comment thread docker-compose.yml Outdated
image: docker.io/mailserver/docker-mailserver:latest
container_name: mailserver
hostname: mail
hostname: mail # If you don't use a subdomain for your mail server (example: mail.example.com), you can set only `hostname` to the domain
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hostname: mail # If you don't use a subdomain for your mail server (example: mail.example.com), you can set only `hostname` to the domain
# If the FQDN for your mail-server is only two labels (eg: example.com), you can assign this entirely to `hostname` and remove `domainname`.
hostname: mail

Apart from the comment and example being a tad confusing, it should sit above the property when it's that long.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's incomplete

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, didn't mean to resolve this.

How do we feel about eliminating domainname entirely at this point @polarathene @casperklein ? I've confirmed that both subdomains and regular domains work in hostname without domainname.

domainname is not well documented and a little confusing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, what exactly you mean be "removing" domainname (ps: haven't read the whole PR discussion)?

The docker(compose) options for domainname do only affect /etc/hosts in the container afaik.

Of course without domainname, similar can be achieved by using a FQDN as hostname. What do you exactly try to achieve? What improvements are you expecting?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like all we require is setting hostname and not domainname. I'd like to pull it out but it's a breaking change so I'll leave it alone for now.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we feel about eliminating domainname entirely at this point

👍

I've confirmed that both subdomains and regular domains work in hostname without domainname.

I identified some cases where that's not the case I think, although I did not try them against docker-mailserver, and they're likely niche if at all in use (and some non-sensical ones).

I'd prefer to get the test suite refactored to run it against the variants though.

domainname is not well documented and a little confusing.

Yeah, IIRC it was made available for some need, but was generally advised against. Probably why the docs are lacking. I think it was adopted by this project either by misunderstanding or an attempted workaround.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docker(compose) options for domainname do only affect /etc/hosts in the container afaik.

I've tested docker-compose config and it matches expectation of /etc/hosts with the Docker CLI results I observed. I didn't do a full parity comparison, but assume they're in sync these days.

Of course without domainname, similar can be achieved by using a FQDN as hostname. What do you exactly try to achieve? What improvements are you expecting?

Supporting domainname is legacy. These two config options used to behave differently with CLI and docker-compose a while back. There are past issues and PRs for the project related to it with workarounds due to confusion IIRC.

Depending on the config, and other contexts (eg network host mode), inconsistencies can be introduced.

Proposal is to at some point move any reference/usage of $HOSTNAME env (and not overwrite it) to a DMS scoped one, look at any usage of hostname command etc. We have some workaround ENV like OVERRIDE_HOSTNAME that may no longer be relevant, plus various *DOMAIN ENV.

Docs in a follow up PR will remove the usage of domainname and only advise hostname if this is the only relevant option to set. I am not sure what's required to support mail.example.com that uses mail headers example.com, or other related changes (eg if FQDN is more than 3 labels), but those use cases should be covered (I think we have tests that do).

Comment thread target/scripts/helper-functions.sh Outdated
Comment thread target/scripts/helper-functions.sh Outdated
Comment thread target/scripts/helper-functions.sh Outdated
then
DOMAINNAME="${HOSTNAME}"
#shellcheck disable=SC2034
DOMAINNAME="$(hostname -d)"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DOMAINNAME="$(hostname -d)"
DMS_DOMAIN="$(hostname -d)"

Has hostname -d been checked with:

  • hostname: mail.example.com (no domainname)
  • hostname: example.com (no domainname)
  • hostname: mail and domainname: example.com

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Looks good.

hostname: deliver
domainname: xkr.email
root@deliver:/# hostname -f
deliver.xkr.email
root@deliver:/# hostname -d
xkr.email
root@deliver:/# hostname
deliver
root@deliver:/# echo $HOSTNAME
deliver

hostname: xkr.email
root@xkr:/# hostname -f
xkr.email
root@xkr:/# hostname -d
email # We override this and use HOSTNAME internally in the scripts
root@xkr:/# hostname
xkr.email
root@xkr:/# echo $HOSTNAME
xkr.email

hostname: deliver.xkr.email
root@deliver:/# hostname -f
deliver.xkr.email
root@deliver:/# hostname -d
xkr.email
root@deliver:/# hostname
deliver.xkr.email
root@deliver:/# echo $HOSTNAME
deliver.xkr.email

Copy link
Copy Markdown
Member

@polarathene polarathene Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Looks good.

Was this the output from docker-mailserver after scripts are run? Or on a direct base image without docker-mailserver (at least it's scripts) involved? (EDIT: I've collected results from Debian and Alpine base images and have shared them in a follow up comment)

# We override this and use HOSTNAME internally in the scripts

This is kind of an issue for me as it sounds like we're messing with it when we shouldn't need to?

Is the difference in $HOSTNAME from hostname -f desirable? (it's the same in two examples, but in the 1st it is different), same with the commented -d result for xhr.email in the 2nd example.


UPDATE: Consistent outputs can be done with:

  • domainname command will output the actual domainname set for the container. Don't use hostname -d to retrieve it. Unless you want the FQDN minus the 1st label. If domainname was not set, the returned value for the domainname command is (none).
  • hostname likewise matches the actual hostname set for the container. hostname -f will return the composite value of hostname + domainname if a match is in /etc/hosts, but it must have an alias that matches the queried hostname (value of /proc/sys/kernel/hostname). For all 3 examples hostname -f will output the expected result (unless using --net=host).

Perhaps something like?:

Suggested change
DOMAINNAME="$(hostname -d)"
if [[ ! $(domainname) == '(none)' ]];
then
DMS_DOMAIN="$(domainname)"
fi

Comment thread target/scripts/helper-functions.sh Outdated
Comment thread target/scripts/postsrsd-wrapper.sh Outdated
else
NEW_DOMAIN_NAME="${DOMAINNAME}"
fi
NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"${DOMAINNAME}"}"
Copy link
Copy Markdown
Member

@polarathene polarathene Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated

If this is relying on the system (in this case within the container instance) ENV DOMAINNAME value, it should be equivalent to hostname -d, domainname -d, dnsdomain -d right? (UPDATE: DOMAINNAME ENV isn't a system provided one, my bad)

Suggested change
NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"${DOMAINNAME}"}"
local NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"${DMS_DOMAIN}"}"

or

Suggested change
NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"${DOMAINNAME}"}"
local NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"$(hostname -d)"}"

Technically, I don't think we want the output of domainname -d, and if hostname -d is only correct under some configurations, then dnsdomain -d may be more reliable. I'd need to check.

UPDATE: All three commands -d option output the same result, no difference. domainname command default output however matches the actual domainname value configured for the container.

Suggested change
NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"${DOMAINNAME}"}"
local NEW_DOMAIN_NAME="${SRS_DOMAINNAME:-"$(dnsdomain -d)"}"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in a situation where hostname in the docker-compose/run is set as a normal domain like example.com. hostname -d would be com :(

Copy link
Copy Markdown
Member

@polarathene polarathene Oct 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: At the time of writing this comment I had some misunderstandings. Those have since been cleared up with findings described here.


hostname -d would be com :(

That would be expected and working correctly imo. Can you explain why that's bad?

Think about it with 4 labels or 5 perhaps, and what you'd expect. Or what about example.co.uk and mail.example.co.uk, some may even have mail.internal.company.com but want company.com in the email header? 🤷‍♂️

Perhaps this just highlights the issue you're working on here? Should you be relying on HOSTNAME/DOMAINNAME?

When is a distinction of separation important and would it be better handled via setting only hostname, reading HOSTNAME(or hostname -s and hostname -f) and having a separate ENV that we use internally (unrelated to DOMAINNAME or hostname -d and similar) that allows control to explicitly deviate from the default behaviour?

I know that some software like Postfix can implicitly use the detected hostname (and possibly the domainname) values it queries from the system, as can others, but if they also support explicit config, then that's a non-issue.

Some routing is handled via /etc/hosts based on these two values, but it would only impact services within the container AFAIK, since we're using containers for sandboxing, other containers on the same network I believe also can query the containers IP via that, but nothing on the docker daemon host itself should.


I know that the tests cover different variations, but I'm not familiar of what the different use-cases are that they're all covering. Lacks examples 😅

I assume setting hostname is still of value.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK our use of OVERRIDE_HOSTNAME, HOSTNAME, and DOMAINNAME (now DMS_HOSTNAME_DOMAIN) is simply to create uniformity and predictability depending on what the user ends up configuring.

You'd notice that HOSTNAME takes priority over DOMAINNAME in postfix's config and several other important places. Changing it gives me a lot of anxiety.

Having hostname -d be com or in my case email (xkr.email) is not correct and doesn't work without some changes ( I'd rather not be responsible for ;) ).

We can't use hostname it's the just subdomain part of subdomain.sld.tld.

It looks like the /etc/hosts entry is being set properly and I've added tests for it too.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK our use of OVERRIDE_HOSTNAME, HOSTNAME, and DOMAINNAME (now DMS_HOSTNAME_DOMAIN) is simply to create uniformity and predictability depending on what the user ends up configuring.

Seems a bit much and these names are a bit overloaded? The word HOSTNAME and DOMAINNAME is getting awfully confusing looking into this and discussing it since the values vary based on context.

I'd like to try address that in future. The less I have to think about things the better 😅

This is the current usage right?:

  • Public (user provided): OVERRIDE_HOSTNAME.
  • Private (internal): DOMAINNAME, DMS_HOSTNAME_DOMAIN.
  • System (container default ENV already set): HOSTNAME (partial internal override).

Changing it gives me a lot of anxiety.

That's fine. No need to change it any time soon if there's no immediate issues.

I'd like to get the test suite into a better shape for more easily testing this sort of change. I'll create a PR for it when I have the time.

Having hostname -d be com or in my case email (xkr.email) is not correct and doesn't work without some changes ( I'd rather not be responsible for ;) ).

My point was that it's the correct behaviour for those commands.

  • Use only hostname config option with the two label FQDN (example.com or xkr.email) instead.
  • Use hostname or domainname commands if you want to match the equivalent config, or source directly from /proc/sys/kernel/{hostname,domainname}.
  • hostname -d or domainname -d will both output the same result. It is the 1st value from a matched /etc/hosts entry after the IP. The full value is what hostname -f/domainname -f output, while the -d variant truncates the 1st label.
  • For the short hostname, Debian will provide the first label via hostname -s. Alpine linux and others however may do the same or act similar to -d and -f in sourcing from /etc/hosts, depending on packages installed. Remember that -s is always the first label only, regardless if you set a multi-label hostname and still provide a domainname.
  • In the even that there is no valid match in /etc/hosts, the queried hostname may perform a DNS query to public resolvers, which may not be desireable (rarely likely to be an issue, but may hide the underlying issue).

We can't use hostname it's the just subdomain part of subdomain.sld.tld.

The command? As mentioned above, it's whatever was assigned to the container's hostname via config options.

The hostname is meant to identify the container itself on a network. That network doesn't need to be publicly accessible AFAIK.

It looks like the /etc/hosts entry is being set properly and I've added tests for it too.

It's valid if not:

  • Duplicating labels between hostname and domainname.
  • Having both config options set but with a multi-label hostname value.
  • Using host network mode, unless an entry has been added for the containers hostname into /etc/hosts (host or container copy, note that these differ in resolution behaviour).

Comment thread test/mail_hostname.bats Outdated
Comment on lines +16 to +18
-e OVERRIDE_HOSTNAME=subdomain.sld.tld \
--domainname sld2.tld \
-h subdomain2 \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious what the point is to provide subdomain2 if you have ENV OVERRIDE_HOSTNAME with subdomain?

Is this ENV meant to simulate -h subdomain.sld.tld while having some value in hostname -s returning subdomain2 or hostname -f returning subdomain2.sld.tld? I haven't looked into what goes on logic wise, but that's the impression I get. I would look at /etc/hosts as well, as presumably it would container the FQDN of subdomain2.sld2.tld and alias subdomain2.

If I'm correct, it could be why some breakage is already being experienced, and possibly affecting your encryption PR issues if you've got a similar setup and Dovecot or similar tries to query the hostname (short or FQDN) and that conflicts with the HOSTNAME ENV that docker-mailserver modifies?


EDIT: I viewed the related test and see that it at least checks for expected override behaviour from all known occurrences.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ENV meant to simulate -h subdomain.sld.tld
Yep. That's what OVVERIDE_HOSTNAME does from what I can tell.

I would look at /etc/hosts as well, as presumably it would container the FQDN of subdomain2.sld2.tld and alias subdomain2.

Correct. We don't override /etc/hosts.

possibly affecting your encryption PR issues if you've got a similar setup and Dovecot or similar tries to query the hostname (short or FQDN) and that conflicts with the HOSTNAME ENV that docker-mailserver modifies?

Good point -- I'm hoping to try that stuff again and find the time to reach out to dovecot admins list after this is addressed.

@NorseGaud

This comment has been minimized.

@NorseGaud NorseGaud self-assigned this Oct 13, 2021
@polarathene
Copy link
Copy Markdown
Member

Good to go now.

As the changes are rather large, along with the lengthy discussion and commit history.. if reviewers would prefer I can interactively rebase the whole thing and cherry pick it out to separate smaller PRs if preferred?

Otherwise I can write up a summary/overview of the entire PR here.


The commits I've added to finish up the PR don't have the cleanest history (I have refrained from rewriting history already pushed), but I've tried to scope commits to ease review if needed, with associated commit messages for context.

I've done a bit of refactoring, added some fixes and rather heavy inline documentation to avoid any maintenance confusion about the niche knowledge we encountered during this PR.

I've completely rewritten the letsencrypt tests. They adopt the helpers I added recently, and have a few methods that could later be extracted out to share with the other TLS tests. Let me know if I went overkill with the DRY abstractions, I understand that may not be as easy to decipher vs raw hard-coded commands.

The entire letsencrypt test file directory has been deleted. I didn't like the actual cert data when inspecting it, and decided to use the existing example.test certs instead (they aren't provisioned by letsencrypt, but that shouldn't be an issue).

For the Traefik tests, acme.json files have been created to cover each of these test certs with added test documentation for other maintainers to understand and reproduce if needed.

There are some checks in the _acme_wildcard test that won't pass currently. This is because I'm not testing with the wildcard acme.json config from a fresh container but additionally testing that when a wildcard cert replaces a static FQDN, that docker-mailserver correctly configures itself. It would if I restart the container via the startup scripts, but this highlights an issue with the current check-for-changes.sh. Currently the Postfix and Dovecot configs aren't updated if the /etc/letsencrypt/live/<FQDN> changes during runtime (eg: due to extraction).

I am deferring that to a follow-up PR due to the size of this one and the refactoring required to let check-for-changes.sh share the needed startup methods. Unless someone more familiar with the setup scripts can advise how to call the TLS setup methods from check-for-changes.sh, presumably I could just source setup-stack.sh and call the method?


@NorseGaud thanks for the tips on a previous PR for better debugging bats test failures, and for your contributions at bats-core that made that possible, very helpful! 😀

Regarding the SSL docs, I've tried to make a compromise given your feedback earlier. There's now an early section about the FQDN and it's relation to configuration. It mentions the usage of hostname -f internally and falling back to the domainname if needed. Your additions about the symlinks was also retained, just shifted into a contextual note below the instructions.

@polarathene polarathene marked this pull request as ready for review October 23, 2021 01:59
polarathene
polarathene previously approved these changes Oct 23, 2021
Comment thread target/scripts/helper-functions.sh Outdated
@NorseGaud
Copy link
Copy Markdown
Member Author

@NorseGaud thanks for the tips on a previous PR for better debugging bats test failures, and for your contributions at bats-core that made that possible, very helpful! 😀

Yeah man -- big thanks to @martin-schulze-vireso for that as he took the reins and finalized it.

Also moves the service conditions into the prepare method.
@github-actions
Copy link
Copy Markdown
Contributor

Documentation preview for this PR is ready! 🎉

Built with commit: 1f0c237

Copy link
Copy Markdown
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spotted an issue that should be investigated further (I'll do so after I complete other tasks) while working on a complimentary check-for-changes.sh PR to pull out it's duplicated code that fell out of sync. I came across this concern when looking into /etc/postfix/vhost usage in our scripts.


Notes:

  • Our default target/postfix/main.cf uses the value $myhostname, localhost.$mydomain, localhost, same as the FAQ entry example. This is also the upstream default for mydestination.
  • setup-stack.sh:_setup_ldap has a sed operation at the end on /etc/postfix/main.cf to remove $myhostname from the mydestination value resulting in: mydestination = localhost.$mydomain, localhost.

Based on the Postfix docs for $myhostname, it is probably better that we explicitly set the value of myhostname in main.cf rather than rely on implicit values. That may also require some consideration for mydomain too, in addition to possibly explicitly setting mydestination.

EDIT: We already do set myhostname and mydomain via _setup_postfix_hostname with postconf -e.

Comment thread docker-compose.yml
Comment on lines +7 to +8
# If the FQDN for your mail-server is only two labels (eg: example.com),
# you can assign this entirely to `hostname` and remove `domainname`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs also mention this should work.

However, a FAQ entry points out that using a bare domain requires additional configuration.

mydestination should not contain a value that /etc/postfix/vhost does, which is used for virtual_mailbox_domains in Postfix main.cf, the Postfix docs specifically warn against this:

NEVER list a virtual MAILBOX domain name as a mydestination domain!

We may want to add some extra handling logic to detect and abort or possibly fix this issue before advising bare domain support like this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May want to additionally add mention of issues with --network=host which will have trouble establishing the hostname from /etc/hosts. Present advice has been to use OVERRIDE_HOSTNAME, but it's probably more correct to ensure an entry in /etc/hosts.

@polarathene
Copy link
Copy Markdown
Member

polarathene commented Oct 25, 2021

We aren't running check-for-change.sh if LDAP is enabled?

[[ ${ENABLE_LDAP} -eq 0 ]] && _register_start_daemon '_start_changedetector'

That was introduced with the original check-for-changes.sh PR in Oct 2017.

At the time it was only for non-LDAP config updates, and it seems we have since added in the SSL cert renewal change detection. I don't think anything else matters to LDAP users, but I'm not familiar enough with the monitored configuration and LDAP setups to know for sure.

I will defer this concern to a follow up PR.


That PR also seems to be the origin of --cap-add=SYS_PTRACE being added I think, which AFAIK isn't documented anywhere that it's used/mentioned.

@casperklein casperklein marked this pull request as draft October 30, 2021 12:12
@NorseGaud NorseGaud closed this Nov 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/scripts area/tests kind/improvement Improve an existing feature, configuration file or the documentation priority/high

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants