Skip to content

Fix 1197 random test failures#1201

Merged
erik-wramner merged 16 commits intodocker-mailserver:masterfrom
erik-wramner:fix_1197_setup_in_tests
Aug 2, 2019
Merged

Fix 1197 random test failures#1201
erik-wramner merged 16 commits intodocker-mailserver:masterfrom
erik-wramner:fix_1197_setup_in_tests

Conversation

@erik-wramner
Copy link
Copy Markdown
Contributor

Attempt to fix random test failures, most important change moving contended chksum file.
Mount test files read only from a location that is not above the other mounted folders.
Finally retry on several checks that may fail due to timing issues.

Comment thread target/check-for-changes.sh Outdated
echo "${log_date} Start check-for-changes script."

# create work area outside mounted directory
mkdir -p /tmp/docker-mailserver-work
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it be an idea to work with mktmp here to prevent accidentally using leftovers from other runs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case no, the directory can be reused by the same container without ill effects (in fact it should as a change in checksums should trigger updates of the files in /etc). The problem I'm trying to fix is where multiple containers share the same checksum file in the mounted directory. We could put it in /var/something instead if that makes it clearer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm.. have not looked that deep into this part of this project yet. So if I see this correctly then this script just writes a file that then includes the different checksums. In that case I would not even create a subdir for this in /tmp but rather give the file a descriptive name and place it directly into /tmp.

since the information in this file can be easily regenerated I would not put this into /var

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I dislike clutter and thought that there could be other similar files later that could live in the same place, but we can certainly remove the directory and put the file directly in /tmp. We can always add a directory later.

Comment thread test/tests.bats Outdated

# Create a container which will send wrong authentications and should get banned
docker run --name fail-auth-mailer -e MAIL_FAIL2BAN_IP=$MAIL_FAIL2BAN_IP -v "$(pwd)/test":/tmp/docker-mailserver-test -d $(docker inspect --format '{{ .Config.Image }}' mail) tail -f /var/log/faillog
docker run --name fail-auth-mailer -e MAIL_FAIL2BAN_IP=$MAIL_FAIL2BAN_IP \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use a line break after ´--name fail-auth-mailer´

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I can fix the formatting.

Comment thread test/tests.bats Outdated


@test "checking user login: predefined user can login" {
# This should really not be necessary, but this test sometimes fails, probably due to timing
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a file or a socket within the container that instead could signalise that the container has properly started?

to prevent having the complete test command twice something like https://stackoverflow.com/a/12321815/4754613 could be done as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not, there are other issues/PRs about that and it seems to be non-trivial to detect when everything is up and running. If we could that could speed up all the tests, so it would be an excellent separate PR. Plus possibly the detect changes job needs to run once, not sure in this case.

I'm not a bash expert, so I tried to keep things simple. We could revert this part if you like though, the most important changes are the checksum file and moving the test files to a read-only mount.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. already seen al the "sleep xx" in the scripts, but did not yet look into if this can be done any better way. I don't think you need to revert, but I would add a "TODO" so its clear that this still needs to be improved upon.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A combination of checking the logs (write something when start-mailserver.sh is done) and connecting to a socket should work. I'll add TODOs to the sleeps I added.

Comment thread test/tests.bats Outdated
@test "checking setup.sh: email add login validation" {
# validates that the user created previously with setup.sh can login

if ! (docker exec mail doveadm auth test -x service=smtp [email protected] 'test_password' >/dev/null); then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but again I'd like to keep it simple as I'm not that good with bash. Feel free to improve the code though.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

erik-wramner commented Jul 31, 2019

Changes pushed, but in my fork it fails on the account created early (line 1452). I think there may be a race condition between start-mailserver.sh and check-for-changes.sh. Ideally start-mailserver.sh should create the checksum file when it configures the accounts so that they are guaranteed to be in sync or check-for-changes.sh should always process the files once "just in case". I'll add that part as well.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

@fbartels I want to fix the race condition between start-mailserver.sh and check-for-changes.sh and there are two options. One (easiest) is to avoid creating the checksum file in the start of the check-for-changes.sh script. That will be interpreted as a change and all the relevant files will be generated. Worst case that may cause some extra work at startup, but the end result should be correct and we can't miss changes. I like that as it is simple and a single script owns the checksum file.

The other option is to create the checksum file in start-mailserver.sh when the configuration is created and to change check-for-changes.sh so that it doesn't create the file but waits for it to be created. Then there will be no conflicts and no extra work, but the downside is that both scripts need to know how to work with the file and suddenly they are more dependent on each other.

What would you suggest? I favor the simple solution, but perhaps I'm just being lazy...

@fbartels
Copy link
Copy Markdown
Member

@erik-wramner looking through check-for-changes this script seems to do a lot of initial setup tasks.

I would also favour your first option. I think this suggestion is also easy to combine with making the creation of the checksum file more atomic. instead of ever appending the file to should be created in a temporary location and then copied over to its final destination at the end of the script.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

Good idea. I'll see if I can protect the files with flock as well, making sure that there are no conflicts between the start script and the change script.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

After a closer look I've changed my mind. The code is spread out in the start script and an extra run in the change script is fairly expensive as it will restart postfix and dovecot to make the changes bite. Doing all that work once in the start script and then immediately doing it again in the change script feels wrong and will effect the startup time for the container.

I'll create the checksum file early in the start script instead before the daemons are spawned. I'll need about five lines from the change script for that. With the checksum file in place the change script will detect changes even if they happen during startup, which should solve the problem at hand.

Getting rid of the appends and in-place editing would be nice, but I don't want to make too large changes for fixing what is mostly a test issue (race conditions can happen in real life, but are much less likely - normally new users are not added when the server is starting, but before or after).

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Aug 1, 2019

This seems ready for merge. I'll let you confirm and I'll proceed.
Thanks guys.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

I'm still failing on "email add login validation" from time to time and never locally so it is hard to debug. I'm working on log parsing to ensure that the changes are processed plus that the change detector service is up. Tests running now. If that works out it should be show time. If not I'll keep working...

@erik-wramner
Copy link
Copy Markdown
Contributor Author

Failed again. I'll add flock calls, if that doesn't help I don't know what this could be. At least it can be reproduced with enough patience.

@erik-wramner
Copy link
Copy Markdown
Contributor Author

@fbartels three builds without errors so far, building for the fourth time now. Can you make a final review of the new changes before we merge, please? A known weakness is that there are no locks for aliases yet, but they have not been as problematic. Can add later if this pans out.

@erik-wramner erik-wramner merged commit 33c85d7 into docker-mailserver:master Aug 2, 2019
@erik-wramner erik-wramner deleted the fix_1197_setup_in_tests branch August 2, 2019 06:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants