Skip to content

scripts: cleanup and prep for setup split#3112

Closed
georglauterbach wants to merge 11 commits intomasterfrom
scripts/cleanup-and-prep-for-setup-split
Closed

scripts: cleanup and prep for setup split#3112
georglauterbach wants to merge 11 commits intomasterfrom
scripts/cleanup-and-prep-for-setup-split

Conversation

@georglauterbach
Copy link
Copy Markdown
Member

@georglauterbach georglauterbach commented Feb 24, 2023

Description

A PR including changes I wanted to do for a longer time; the last before my vacation march and before v12 is released.

This PR opens the possibility to split the big setup-stack.sh into smaller pieces in a dedicated directory. Moreover, it simplifies scripts by a good amount (getting rid of the fixes and misc stacks completely).

The single new feature of this PR is important though: there is now a check whether the container was improperly restarted, and if so, the container will show an error and shut down. This makes sense, since we cannot guarantee anything in case of a bad restart, and IMO it is better to not run a possibly faulty configuration at all than to run with a faulty config (as seen in #3109).

This PR has its first commit based off of #3111. For the other commits I tried to provide a nice message so you see the rationale/changes more easily.

Fixes #3109

Type of change

  • New feature (non-breaking change which adds functionality)
  • Improvement (non-breaking change that does improve existing functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

mazzz1y and others added 9 commits February 24, 2023 15:07
config: disable SMTP authentication on port 25 (#3006)

* postfix: remove smtpd_sasl_auth_enable global setting

* tests: disable auth on 25 port

* tests: revert ldap-smtp-auth-spoofed-sender-with-filter-exception.txt

* Skip failing test

The test seems to have been broken from the beginning.

Sadly, no LDAP maintainers can verify. Added a TODO item if ever a LDAP maintainer comes around.

* Apply PR feedback

---------

Co-authored-by: Georg Lauterbach <[email protected]>

added plausibility checks for milter insertion

corrected grep in tests

revert changes to milter insertion
This directory should be used in the future to split up the large
`setup-stack.sh`.
The stacks were mostly superflous; the save-state is basically setup
functionality and the fixes are not "fixes" but actually setup if some
services are disabled.
This is the only functionality added with the changes of this PR: we
will abort the container startup in case `CONTAINER_START` is present,
since the container was thus not properly restarted.

Since we _cannot give any guarantees_ about our scripts altering already
altered files, it is better to show the user the error and abort than
have the user running a possibly faulty configuration: FAIL FAST.
Since we now guarantee a proper container restart, `sedfile` needn't
bother anymore.
@georglauterbach georglauterbach added kind/new feature A new feature is requested in this issue or implemeted with this PR area/scripts kind/improvement Improve an existing feature, configuration file or the documentation labels Feb 24, 2023
@georglauterbach georglauterbach self-assigned this Feb 24, 2023
Copy link
Copy Markdown
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of shuffling going on around here that is distinct from other logic being introduced into the PR.

That complicates review a little bit, but I understand that you're trying to get the changes through review faster (_it'd probably have been faster for me to review if it was staged out as separate PRs instead of only as commits 😅 _)


  1. The new preventative check is interesting, but seems a bit heavy-handed personally. I'd like the perspective of another maintainer before I provide approval. I do understand the intent and value of it.
  2. sedfile changes don't seem to be necessary to drop right now. Just like the fixes-stack.sh logic that is being kept in setup-stack.sh, which is mostly focused on fixes for persisted internal state for container restarts AFAIK (except /var/mail).
  3. Not fond of the early methods added to setup-stack.sh and check-stack.sh, I don't think these are necessary or need to be there.
  4. misc-stack.sh should live in setup.d/ if you insist on moving it, not blended into setup-stack.sh methods, you have an entire commit in this PR introducing setup.d/ for the purpose of avoiding a large setup-stack.sh file.
  5. I have discussed in some review comments certain changes that may be better served by follow-up PRs to avoid all the other noise of this PR affecting relevance to their proposed changes. Such as when _setup_user_patches() may be better suited for running.

Comment on lines -32 to -46
# ? << Sourcing helpers & stacks
# --
# ? >> Early setup & environment variables setup
# ------------------------------------------------------------

# shellcheck source=./helpers/variables.sh
source /usr/local/bin/helpers/variables.sh

_setup_supervisor
_obtain_hostname_and_domainname
_environment_variables_backwards_compatibility
_environment_variables_general_setup

# ------------------------------------------------------------
# ? << Early setup & environment variables setup
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested action:

  • Keep this logic in start-mailserver.sh, don't migrate it to setup-stack.sh. Wrap it in a function here if better co-location of the start-up sequence order at the end of start-mailserver.sh is important to you.
  • Run your fail early shutdown check after as part of check-stack.sh.

You chose to migrate this into setup-stack.sh as a new _run_early_setup() method, only to call it below in this file start-mailserver.sh prior to running general _setup()?

  • _setup_supervisor() doesn't appear to have much importance in priority here, and could easily be registered as the first function for _setup(). I assume it's treated as distinct and first to run, since it would restart reload supervisord, and that presumably re-runs this script and repeats any logging output.
  • _obtain_hostname_and_domainname is an important helper to run early, as is the ENV methods prior to setup.

Your main motivation for to migrate these into setup-stack.sh was presumably for these reasons?:

  1. Logic is co-located with the rest of start-up sequence at the end of this file.
  2. The new prevention check for persisted internal state being run earlier than this?
  • 1 is fine, but I don't see the value in moving this into setup-stack.sh. It can be wrapped into a function within this file.
  • 2 seems unnecessary. You complicate the check-stack.sh to fail slightly faster. The ENV setup logic can be run without concern first, and checks immediately after that are still valid. Easier to follow the flow this way too instead of zig zagging.

If you insist on the new check being first, you could just have it declared within start-mailserver.sh. It's a rather invasive check / action though. Preventing any user from using container restart policies due to shutdown instead of emitting a warning / error. If we were to later add an opt-out via ENV (if enough users requested it after release), it'd probably make some sense to have this occur with the other checks in check-stack.sh though.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an upcoming PR after #3115 I have tried to find a solution (or a compromise if you will), for this issues. Stay tuned :)

#!/bin/bash

# shellcheck disable=SC2034
declare -A VARS
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This belongs in variables.sh or start-mailserver.sh?

  • VARS is for the most part only used / referenced in variables.sh, with the exception of check-stack.sh where it sets the LOG_LEVEL ENV key.
  • In start-mailserver.sh, we originally ran the lines moved below into setup-stack.sh:_run_early_setup(), which handles the bulk of ENV setup with the VARS array. Later optionally running two more functions from variables.sh for handling LDAP + SASLAUTHD.
  • Finally after that (at the end of all setup-stack.sh via _setup()) and the prior separate fixes-stack.sh + misc-stack.sh, we'd call one more method from variables.sh (_environment_variables_export) to export the VARS array to a settings file.

Since only check-stack.sh references VARS, I'd say VARS should be declared in variables.sh? NOTE: I am not familiar with scope access in BASH with source, since you've now relocated variables.sh to be sourced in a method of setup-stack.sh, instead of early on in start-mailserver.sh... Perhaps that was your reason for placing it here, but I'd argue that it'd belong in start-mailserver.sh then 🤷‍♂️

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done due to a Bash having very weird scoping. With #3115 and the follow-up, this is solved since variables.sh will become a proper "stack", so the sourcing will be done by start-mailserver.sh, which in turn makes declaring this array possible in variables.sh (then: variables-stack.sh).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done due to a Bash having very weird scoping.

Yeah I figured that was the case 😅

I'm particularly interested in your ENV parser rust project having potential to eventually manage not only the ENV documentation, but share the same struct / module for what variables.sh is effectively responsible for.

Then we won't have drift, as recently experienced with the SA_KILL update to mailserver.env.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking about extending the parser this way too :D


local PATH_TO_SCRIPTS='/usr/local/bin/setup.d/'

source "${PATH_TO_SCRIPTS}/rspamd.sh"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required as part of _run_early_setup()? You later source it again in _setup_rspamd, is it meant to be sourced twice?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, the other sourcing should not happen. Resolved in #3115.

Comment on lines +41 to +42
# Run user-patches
__setup_user_patches
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this adjusted to an internal method call?

It's basically like all other setup methods for handling a feature?

I think it may have been kept separate in start-mailserver.sh due to important to run it after _setup()? But I'm also wondering if that could not just be handled by adding it to the end of the _setup() registered methods, with a comment stating it should be run last?

It'd seem better to handle it like that, unless there's a reason to run it after _prepare_for_change_detection()? Alternatively, we can give it special treatment like _prepare_for_change_detection(), but move it to be ran prior? (EDIT: The special treatment for _prepare_for_change_detection() appears to be because it's from helper scripts, not a method declared in setup-stack.sh, so _setup_user_patches probably doesn't belong here?)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have actually no clue why an underscore was prepended.. again, should not have happened.

Comment on lines +43 to +44
# Show the environment if the log level "demands" it
[[ ${LOG_LEVEL} =~ (debug|trace) ]] && print-environment
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, this probably belongs in start-mailserver.sh (or variables.sh). It's not really relevant to _setup()?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm am willing to leave it in there; #3115 and follow-ups will not move it.

else
_log 'debug' 'Disabling SpamAssassin'
echo "@bypass_spam_checks_maps = (1);" >>"${DMS_AMAVIS_FILE}"
rm -f /etc/cron.daily/spamassassin
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required, can leave as-is. This is just a review comment note.

Both lines should be kept as-is, since there is no harm and if we were to allow bypassing the new fail fast check, then these fixes remain relevant (and would need to be realized again).


This rm -f along with the ones below for ClamAV are both migrated from fixes-stack.sh, which AFAIK was intended for addressing concerns with container restarts.

The intent of this PR is to disallow that, so this internal state should not be present to cleanup (unless it exists from the Dockerfile).

Comment on lines +1298 to +1307
function _setup_apply_fixes
{
_log 'trace' 'Removing leftover PID files from a stop/start'
find /var/run/ -not -name 'supervisord.pid' -name '*.pid' -delete
touch /dev/shm/supervisor.sock

_log 'debug' 'Checking /var/mail permissions'
_chown_var_mail_if_necessary || _shutdown 'Failed to fix /var/mail permissions'
_log 'trace' 'Permissions in /var/mail look OK'
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required, can leave as-is. This is just a review comment note.


These fixes are a bit different.

  • Supervisor fix probably is similar to the SA and ClamAV ones, and not relevant to reload in _setup_supervisor()? This would be run before daemons.stack.sh before, but presumably could be part of _start_daemons()?
  • /var/mail chown still relevant. Technically, this is specific to Dovecot AFAIK, but I'm not 100% sure.

Comment on lines +1309 to +1403
# consolidate all states into a single directory
# (/var/mail-state) to allow persistence using docker volumes
function _setup_save_states
{
local STATEDIR FILE FILES

STATEDIR='/var/mail-state'

if [[ ${ONE_DIR} -eq 1 ]] && [[ -d ${STATEDIR} ]]
then
_log 'debug' "Consolidating all state onto ${STATEDIR}"

# Always enabled features:
FILES=(
lib/logrotate
lib/postfix
spool/postfix
)

# Only consolidate state for services that are enabled
# Notably avoids copying over 200MB for the ClamAV database
[[ ${ENABLE_AMAVIS} -eq 1 ]] && FILES+=('lib/amavis')
[[ ${ENABLE_CLAMAV} -eq 1 ]] && FILES+=('lib/clamav')
[[ ${ENABLE_FAIL2BAN} -eq 1 ]] && FILES+=('lib/fail2ban')
[[ ${ENABLE_FETCHMAIL} -eq 1 ]] && FILES+=('lib/fetchmail')
[[ ${ENABLE_POSTGREY} -eq 1 ]] && FILES+=('lib/postgrey')
[[ ${ENABLE_RSPAMD} -eq 1 ]] && FILES+=('lib/rspamd')
[[ ${ENABLE_SPAMASSASSIN} -eq 1 ]] && FILES+=('lib/spamassassin')
[[ ${SMTP_ONLY} -ne 1 ]] && FILES+=('lib/dovecot')

for FILE in "${FILES[@]}"
do
DEST="${STATEDIR}/${FILE//\//-}"
FILE="/var/${FILE}"

# If relevant content is found in /var/mail-state (presumably a volume mount),
# use it instead. Otherwise copy over any missing directories checked.
if [[ -d ${DEST} ]]
then
_log 'trace' "Destination ${DEST} exists, linking ${FILE} to it"
# Original content from image no longer relevant, remove it:
rm -rf "${FILE}"
elif [[ -d ${FILE} ]]
then
_log 'trace' "Moving contents of ${FILE} to ${DEST}"
# Empty volume was mounted, or new content from enabling a feature ENV:
mv "${FILE}" "${DEST}"
fi

# Symlink the original path in the container ($FILE) to be
# sourced from assocaiated path in /var/mail-state/ ($DEST):
ln -s "${DEST}" "${FILE}"
done

# This ensures the user and group of the files from the external mount have their
# numeric ID values in sync. New releases where the installed packages order changes
# can change the values in the Docker image, causing an ownership mismatch.
# NOTE: More details about users and groups added during image builds are documented here:
# https://github.com/docker-mailserver/docker-mailserver/pull/3011#issuecomment-1399120252
_log 'trace' 'Fixing /var/mail-state/* permissions'
[[ ${ENABLE_AMAVIS} -eq 1 ]] && chown -R amavis:amavis /var/mail-state/lib-amavis
[[ ${ENABLE_CLAMAV} -eq 1 ]] && chown -R clamav:clamav /var/mail-state/lib-clamav
[[ ${ENABLE_FETCHMAIL} -eq 1 ]] && chown -R fetchmail:nogroup /var/mail-state/lib-fetchmail
[[ ${ENABLE_POSTGREY} -eq 1 ]] && chown -R postgrey:postgrey /var/mail-state/lib-postgrey
[[ ${ENABLE_RSPAMD} -eq 1 ]] && chown -R _rspamd:_rspamd /var/mail-state/lib-rspamd
[[ ${ENABLE_SPAMASSASSIN} -eq 1 ]] && chown -R debian-spamd:debian-spamd /var/mail-state/lib-spamassassin

chown -R root:root /var/mail-state/lib-logrotate
chown -R postfix:postfix /var/mail-state/lib-postfix

# NOTE: The Postfix spool location has mixed owner/groups to take into account:
# UID = postfix(101): active, bounce, corrupt, defer, deferred, flush, hold, incoming, maildrop, private, public, saved, trace
# UID = root(0): dev, etc, lib, pid, usr
# GID = postdrop(103): maildrop, public
# GID for all other directories is root(0)
# NOTE: `spool-postfix/private/` will be set to `postfix:postfix` when Postfix starts / restarts
# Set most common ownership:
chown -R postfix:root /var/mail-state/spool-postfix
chown root:root /var/mail-state/spool-postfix
# These two require the postdrop(103) group:
chgrp -R postdrop /var/mail-state/spool-postfix/maildrop
chgrp -R postdrop /var/mail-state/spool-postfix/public
# These all have root ownership at the src location:
chown -R root /var/mail-state/spool-postfix/dev
chown -R root /var/mail-state/spool-postfix/etc
chown -R root /var/mail-state/spool-postfix/lib
chown -R root /var/mail-state/spool-postfix/pid
chown -R root /var/mail-state/spool-postfix/usr
elif [[ ${ONE_DIR} -eq 1 ]]
then
_log 'warn' "ONE_DIR is enabled (=1), but '${STATEDIR}' does not exist - is a volume mounted there?"
else
_log 'debug' "Mail state is not consolidated into one directory"
fi
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested action: Extract into separate setup.d file, or keep misc-stack.sh

Technically fitting of setup.d/one-dir.sh, as it's an integration feature to support the ONE_DIR ENV paired with an external volume. As a purely integration focused feature though, it's a bit distinct from other setup-stack.sh, and is serving two functions (ONE_DIR=1 symlinks to /var/mail-state, and ensuring permissions/ownership is correct across restarts, notably during updating DMS releases when user and group IDs may have changed and need to adjust attributes of externally persisted state).


The only change here from misc-stack.sh really is the extra conditional handling at the end to output a warning or debug log message.

I am a tad confused though. With the commit expressing intent to split setup-stack.sh into smaller files for better scoping, you later deleted misc-stack.sh to add it here in setup-stack.sh, despite it being a good candidate as a separate file?

Personally, I'd envision this being refactored in future if we had better scoping of features. I preferred misc-stack.sh as a separate stage in it's current form, but would like services in setup to have a more clear cut scope where they'd have some convention to do their config setup, fix permissions/ownership like this method does, etc. At least for me it seems nicer if the bulk of Amavis + SA support and rspamd for example were mostly self-contained within their own folders.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be moved in the follow-up after #3115 into a separate file in setup.d. I kept it in setup.sh because, yeah, TBH, I don't know either. Should have directly put it into setup.d.

Comment on lines -39 to -57
@test 'checking sedfile substitute failure (on first container start)' {
# delete marker
_run_in_container rm '/CONTAINER_START'
assert_success

# try to change 'baz' to 'something' and fail
_run_in_container sedfile -i 's|baz|something|' "${TEST_FILE}"
assert_failure
assert_output --partial "No difference after call to 'sed' in 'sedfile' (sed -i s|baz|something| /tmp/sedfile-test.txt)"

# file unchanged?
_run_in_container cat "${TEST_FILE}"
assert_success
assert_output 'foo bar'

# recreate marker
_run_in_container touch '/CONTAINER_START'
assert_success
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this test case should be removed?

It's not related to DMS restarts. You'll notice the start and end of the test case ensures CONTAINER_START is not present during the test, so SKIP_ERROR condition would not have applied here regardless. It ensures that sedfile should fail when expected to.

It does look like this test would become redundant, but so does the whole --strict + SKIP_ERROR logic of sedfile. You'd want to revert that feature entirely then?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably, right; in one of the last PRs in the series after #3115, I will try to remeber this and maybe sedfile does not need an adjust at all; as you've said.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe sedfile does not need an adjust at all; as you've said.

I don't think it does :)

It would only needs changes if we wanted to remove any support we've added to (partially) handle starting the container with persisted internal state.

No harm in leaving that around for a while though, easier to have a single PR specifically removing just that if it were a maintenance concern, so that it's self-documenting of what would need to be restored if we did want to bring back support for restarting or restoring stopped containers.

Comment thread target/bin/sedfile
exit 1
fi >&2

[[ -f /CONTAINER_START ]] && SKIP_ERROR=1 # hide error if container was restarted
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should technically be able to keep this and the two related sedfile test cases? The tests only simulate a restart by adjusting the presence of CONTAINER_START. Your fail early check wouldn't conflict with that.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right; I will test this in the last PR of the series after #3115.

georglauterbach added a commit that referenced this pull request Feb 25, 2023
The variables script is actually kinda like a stack for us; what you
additionally need to know is that the declaration of the arrays is
weirdly scoped in Bash; hence #3112 takes such an unusual approach at
placing the declarations.

With variables being a stack, everything becomes a bit nicer.
@georglauterbach
Copy link
Copy Markdown
Member Author

@polarathene you're right as usual; I will try to answer all of the comments now. I have opened #3115 though, which is the first part of a series of PRs that do the same as this PR but incrementally. This should make reviewing easier.

@georglauterbach georglauterbach added the pr/waiting for other PR to get merged This PR is waiting for another / other PR(s) to get merged label Feb 25, 2023
@georglauterbach georglauterbach marked this pull request as draft February 25, 2023 15:06
@georglauterbach
Copy link
Copy Markdown
Member Author

Closing as #3121 and #3123 are merged.

@georglauterbach georglauterbach deleted the scripts/cleanup-and-prep-for-setup-split branch February 28, 2023 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/scripts kind/improvement Improve an existing feature, configuration file or the documentation kind/new feature A new feature is requested in this issue or implemeted with this PR pr/waiting for other PR to get merged This PR is waiting for another / other PR(s) to get merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] mail from client to server slow, perhaps opendkim causing delays

3 participants