Skip to content

Run services with supervisor#631

Closed
LukeAllen wants to merge 9 commits intodocker-mailserver:masterfrom
LukeAllen:master
Closed

Run services with supervisor#631
LukeAllen wants to merge 9 commits intodocker-mailserver:masterfrom
LukeAllen:master

Conversation

@LukeAllen
Copy link
Copy Markdown

With this change, most of the services are run via supervisor, so they'll be automatically restarted if they die. Fixes #434 and #575

Note that to run the processes with supervisor I had to use the 'run in foreground' version of their start commands, rather than the init.d script that was used before. @tomav you should verify that I haven't lost any special configurations you had for those services.

Also, I added tests, but I'm having an error actually running the automated test suite on my machine. The problem seems unrelated to my changes. When I run make, the container mail fails to stay up. It looks like its start-mailserver.sh is exiting with this error:

Check that hostname/domainname is provided or overidden (no default docker hostname/kubernetes) [_check_hostname]
  * Setting hostname/domainname is required

Is any special environment required to run make?

@gegere
Copy link
Copy Markdown

gegere commented Jun 12, 2017

@LukeAllen this is a nice step forward. Supervisor does a good job managing processes, let me know if I can help in any way.

@LukeAllen
Copy link
Copy Markdown
Author

@gegere thanks! If you could figure out what went wrong in the automated test, that would be really helpful. It looks like the TravisCI automated test is failing at a different point than on my machine, so I'm kind of confused.

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 13, 2017

Restarted build.

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 13, 2017

I reproduced the issue locally. Will try to have a look.

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 13, 2017

Note that there's no more mail.log output in foreground (accessible using docker logs), which was a good thing to debug. (this is important otherwise DMS_DEBUG has no interest)

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 13, 2017

Problem seems related to Dovecot. I tried multiple times and its not always the same container which stops. But I can always see the same thing in the logs:

Fatal: Dovecot is not running (read from /var/run/dovecot/master.pid)

@LukeAllen
Copy link
Copy Markdown
Author

LukeAllen commented Jun 15, 2017

The output of start-mailserver.sh goes to stdout now, so it will be visible. (I had been redirecting the output to /var/log/container-startup.log which is why it wasn't showing. Now it goes both places.) I'm still not sure what's happening with the tests.

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 23, 2017

I guess services have to be launched in background mode, and supervisor in foreground.
start-mailserver.sh probably can't get exit 0 from foreground startups and hangs.
Just my 2 cents.

@LukeAllen
Copy link
Copy Markdown
Author

Yeah people usually run supervisor in the foreground for docker containers, but it can run in the background too. I ran it in the background to stay compatible with your existing startup script.

I replaced the old init.d calls with supervisorctl start appname which does immediately return exit 0. It does return before the app has finished starting though. I'm not sure if that breaks some assumption of the tests or the startup script. I can't figure out where the test is actually failing; is there a way to make the TravisCI tests run with DMS_DEBUG=1?

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jun 27, 2017

You can add DMS_DEBUG=1 in the Makefile

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jul 5, 2017

Hi @LukeAllen, any update on this?

@LukeAllen
Copy link
Copy Markdown
Author

I fixed the dovecot problem. The issue was that in some configurations, start-mailserver.sh starts dovecot, writes some config files, then immediately calls dovecot reload. The dovecot reload broke because supervisor hadn't finished starting dovecot. I modified it to just write all config files before starting dovecot. So now all the tests run, but there are new failures.

@tomav
Copy link
Copy Markdown
Contributor

tomav commented Jul 8, 2017

@alinmear
Copy link
Copy Markdown
Contributor

alinmear commented Jul 8, 2017

@LukeAllen do you need help getting this done?

I think we should check whether the tasks could run in foreground mode or not. If so (notifications and errors are mapped to stdout and stderr) we should set the following to each supervisord definition:

stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0 

Benefit would be that we would have 1 interface for all logging informations. @tomav what do you think about that?

@LukeAllen
Copy link
Copy Markdown
Author

@alinmear good idea; I incorporated your logging suggestion. Thanks for the help offer; yeah I could use some help. I'm having trouble understanding these tests. I've verified that the processes are actually all running, and the container can send mail, so my guess is that some of the tests might make assumptions that aren't true any more. Not sure though.

@alinmear
Copy link
Copy Markdown
Contributor

@LukeAllen will try to check this on the weekend!

@evilstiefel
Copy link
Copy Markdown

Another possibility would be to use baseimage-docker by phusion which was designed specifically for this purpose (and other enhancements). Maybe they've already solved the problem and we don't need to reinvent the wheel?

I would be willing to invest some time into this. The docker image is perfect for running a mailserver, but I am unsure about dependability and resiliance against service failures...

@johansmitsnl
Copy link
Copy Markdown
Contributor

Hi all,

I have rebased the changes from @LukeAllen on the new master branch and did some switching of the flow.
Currently I have almost all tests working nicely and have only 8 errors (ldap/log related). I will spend some more time on it to make these go away.

Could someone take a review of my changes on top of the @LukeAllen branch issue-631-run-services-with-supervisor and give out some ideas if some changes are needed before I wrap it up?

When done I need to make a new pull request or what is the procedure of it?

@johansmitsnl
Copy link
Copy Markdown
Contributor

Solved all tests, see PR #676 because I could not make a PR on this PR.

@LukeAllen
Copy link
Copy Markdown
Author

Awesome thanks @johansmitsnl; I was having trouble understanding those tests. Closing this PR now since PR #676 incorporates it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants