Skip to content

Disable service stats for Dovecot explicitly#2292

Merged
georglauterbach merged 10 commits intomasterfrom
dovecot-stats
Dec 12, 2021
Merged

Disable service stats for Dovecot explicitly#2292
georglauterbach merged 10 commits intomasterfrom
dovecot-stats

Conversation

@georglauterbach
Copy link
Copy Markdown
Member

Description

A small addition that makes sure the service stats { ... is set correctly in Dovecot's master.cf. I noticed this during another, unrelated event when I restarted the mail server (image tag :edge). I figured we could prevent such a message with this addition. Have a look at the correct permissions yourself with ls -lh /var/run/dovecot/stats-*.

Fixes a "warning" message potentially shown during startup stating:

Nov 11 11:42:43 mail dovecot: auth: Error: net_connect_unix(/var/run/dovecot/stats-writer) failed: Connection refused

which might just be timing-related in my case, but I figured this addition should not hurt in any way and may prevent this message in the logs of others.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change that does improve existing functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@georglauterbach georglauterbach added priority/low service/dovecot kind/improvement Improve an existing feature, configuration file or the documentation labels Nov 11, 2021
@georglauterbach georglauterbach added this to the v10.3.0 milestone Nov 11, 2021
@georglauterbach georglauterbach requested a review from a team November 11, 2021 11:08
@georglauterbach georglauterbach self-assigned this Nov 11, 2021
@casperklein
Copy link
Copy Markdown
Member

Permissions looks good in my setup:

root@mail:/# ls -lh /var/run/dovecot/stats-*
srw------- 1 root root    0 Nov  7 01:27 /var/run/dovecot/stats-reader
srw-rw---- 1 root dovecot 0 Nov  7 01:27 /var/run/dovecot/stats-writer

@georglauterbach
Copy link
Copy Markdown
Member Author

Permissions looks good in my setup:

root@mail:/# ls -lh /var/run/dovecot/stats-*
srw------- 1 root root    0 Nov  7 01:27 /var/run/dovecot/stats-reader
srw-rw---- 1 root dovecot 0 Nov  7 01:27 /var/run/dovecot/stats-writer

Very nice! Thank You for confirming this :)

@polarathene
Copy link
Copy Markdown
Member

How do you reproduce the log? A fresh container instance for me with DMS_DEBUG=1 and a /var/mail-state volume (to trigger ONE_DIR=1 handling) only seems to output the two following concerns:


_notify 'inf' "Moving contents of ${FILE} to ${DEST}:" "$(ls "${FILE}")"

Originally introduced in June 2016. With original output:

echo "Moving contents of $d to $dest:" `ls $d`
Moving contents of /var/mail-state to : lib-amavis lib-clamav lib-dovecot lib-fail2ban lib-postfix lib-postgrey lib-spamassassin spool-postfix

While now the rough equivalent splits to multi-line:

echo "Moving contents of $d to $dest:" "$(ls $d)"
Moving contents of /var/mail-state to : lib-amavis
lib-clamav
lib-dovecot
lib-fail2ban
lib-postfix
lib-postgrey
lib-spamassassin
spool-postfix

In logs this is a bit more messy looking, and seems to break the syntax highlighting of notify type:

[[  INF  ]]  Starting miscellaneous tasks
[[  INF  ]]  Consolidating all state onto /var/mail-state
-eactive
bounce
corrupt
defer
deferred
dev
etc
flush
incoming
lib
maildrop
pid
private
public
saved
usr [[  \e[34mINF\e[0m  ]]  Moving contents of /var/spool/postfix to /var/mail-state/spool-postfix:
[[  INF  ]]  Moving contents of /var/lib/postfix to /var/mail-state/lib-postfix:
-edb
tmp
virusmails [[  \e[34mINF\e[0m  ]]  Moving contents of /var/lib/amavis to /var/mail-state/lib-amavis:
-ebytecode.cvd
daily.cvd
freshclam.dat
main.cvd [[  \e[34mINF\e[0m  ]]  Moving contents of /var/lib/clamav to /var/mail-state/lib-clamav:
-esa-update-keys [[  \e[34mINF\e[0m  ]]  Moving contents of /var/lib/spamassassin to /var/mail-state/lib-spamassassin:
[[  INF  ]]  Moving contents of /var/lib/fail2ban to /var/mail-state/lib-fail2ban:
[[  INF  ]]  Moving contents of /var/lib/postgrey to /var/mail-state/lib-postgrey:
[[  INF  ]]  Moving contents of /var/lib/dovecot to /var/mail-state/lib-dovecot:

I don't know much about amavis, and I doubt this log output matters other than noise?:

Nov 12 01:16:18 mail amavis[437]: starting. /usr/sbin/amavisd-new at mail.my-domain.com amavisd-new-2.11.0 (20160426), Unicode aware
Nov 12 01:16:18 mail amavis[437]: Net::Server: Group Not Defined.  Defaulting to EGID '114 114'
Nov 12 01:16:18 mail amavis[437]: Net::Server: User Not Defined.  Defaulting to EUID '112'
Nov 12 01:16:18 mail amavis[437]: No ext program for   .zoo, tried: zoo
Nov 12 01:16:18 mail amavis[437]: No ext program for   .doc, tried: ripole
Nov 12 01:16:18 mail amavis[437]: No decoder for       .F   
Nov 12 01:16:18 mail amavis[437]: No decoder for       .doc 
Nov 12 01:16:18 mail amavis[437]: No decoder for       .zoo

Have you got Dovecot stats configured prior at all? The config you're adding only seems to be mentioned in the Dovecot docs for old statistics, rather than those mentioned in the main Dovecot statistics docs page.

Just curious if you're fixing this because it's an actual issue triggered by your setup, or just to hide a warning for something that may be considered legacy?

I probably need to have a container with something to warrant Dovecot attempting to use the stats-writer, from the warning I guess that's a login event? Although you mentioned it as being logged from restarting a container only.

polarathene
polarathene previously approved these changes Nov 12, 2021
@georglauterbach
Copy link
Copy Markdown
Member Author

georglauterbach commented Nov 12, 2021

@polarathene I had never seen this issue before, and it is likely timing related, but by adding these lines to the Dovecot configuration, we can make sure it is not an error that resulted from missing configuration :)

How do you reproduce the log?

Not sure I actually can :)


Regarding the messy log, yeah, this could use some cleanup. If I have time I will tend to it!

UPDATE: See #2294

wernerfred
wernerfred previously approved these changes Nov 12, 2021
@polarathene
Copy link
Copy Markdown
Member

but by adding these lines to the Dovecot configuration, we can make sure it is not an error that resulted from missing configuration :)

Hmm, I'd still rather understand why it's being triggered tbh rather than hiding a warning. If it is something Dovecot is doing by default when an IMAP login or similar metric occurs, it'd make more sense to either disable statistics by default explicitly or properly setup and document them.

It's not going to introduce any problems though and I don't have spare time to allocate towards looking into configuring stats collection, so happy to merge 👍

Regarding the messy log, yeah, this could use some cleanup. If I have time I will tend to it!

Thanks! 🎉

@casperklein
Copy link
Copy Markdown
Member

As Georg already wrote, the "connection refused" thing is probably due a timing issue.

I did a quick google search. The proposed change is a fix for a permission problem like:

net_connect_unix(/var/run/dovecot/stats-writer) failed: Permission denied

I am a bit ambivalent, about fixing things, that haven't been an issue ever, because that could introduce new issues in the future.

But I am no dovecot expert. So if you all are sure, that this change does not have any downside, I am fine with merging it.

@casperklein
Copy link
Copy Markdown
Member

casperklein commented Nov 12, 2021

For the record, this is the current runtime configuration:

doveconf | grep -A 4 'unix_listener stats-'

  unix_listener stats-reader {
    group =
    mode = 0600
    user =
  }

  unix_listener stats-writer {
    group = $default_internal_group
    mode = 0660
    user =
  }

@polarathene
Copy link
Copy Markdown
Member

But I am no dovecot expert. So if you all are sure, that this change does not have any downside, I am fine with merging it.

As the stats feature is apparently implicitly enabled, as your output shared shows, this PR is just correcting permissions for that, so it should be all good and not impact anything else 👍

@casperklein
Copy link
Copy Markdown
Member

After checking the implicit default settings, I now think there isn't need for an adjustment. Also as seen here, the permissions are already correctly used by dovecot.

IMO, this change does not harm, but also does not introduce any benefit.

@polarathene
Copy link
Copy Markdown
Member

Also as seen here, the permissions are already correctly used by dovecot.

This PR does not change permissions of any files. It changes permissions the service process should use by running as that user and group I thought? (Dovecot service docs seem to confirm that)

The Dovecot docs for services on listeners also state default user/group is root. At the bottom of the docs page there is also mention of stats default service.

As the log is only about stats-writer that's really the only misconfigured service, stats-reader is fine and it's defaults are the same as the documented defaults.

@casperklein
Copy link
Copy Markdown
Member

casperklein commented Nov 13, 2021

It changes permissions the service process should use by running as that user and group I thought?

That is what I meant 👍

Edit:

As the log is only about stats-writer that's really the only misconfigured service,

If you refer to my quote "net_connect_unix(/var/run/dovecot/stats-writer) failed: Permission denied", that was just a random error message found via google and is not related to DMS (otherwise we would see that error, which we don't)

@polarathene

This comment has been minimized.

@polarathene
Copy link
Copy Markdown
Member

polarathene commented Nov 13, 2021

TL;DR

Probably best to reject the PR and disable stats for v11 release (requires a Debian image upgrade for the fix in a newer Dovecot release) with stats_writer_socket_path= as a line to target/dovecot/10-master.conf.

The proposed fix is not clear if it will actually fix it without being able to reproduce the error. This is because dovecot (104:107) may not be correct and docker (5000:5000) may be expected (or depending on trigger, both but neither share the same group) which is often assigned in Dovecot configs for user and group.


We can disable the stats services, but that might warrant deferring merge until v11?:

To disable those stats sockets, add the following configuration to a
file in /etc/dovecot/conf.d/ :

service stats {
  unix_listener stats-reader {
    mode = 0
  }
  unix_listener stats-writer {
    mode = 0
  }
}

service old-stats {
  fifo_listener old-stats-mail {
    mode = 0
  }
  fifo_listener old-stats-user {
    mode = 0
  }
  unix_listener old-stats {
    mode = 0
  }
}

(Per https://wiki2.dovecot.org/Services , setting mode to 0 disables the
socket entirely.)

Then restart dovecot, and then delete /run/dovecot/stats-* and
/run/dovecot/old-stats-*.
Add to dovecot.conf: stats_writer_socket_path=

Possibly only the last part is required now, if you go through that mailing list discussion, but we won't know unless we reproduce the logged error 😅

Here's the docs for stats_writer_socket_path (nothing indicates empty value disables it), the mailing list details plenty of cases where attempts to disable didn't work out, but that appears to have been patched since Apr 2020 which the last comment in the mailing list link mentioned.

That change isn't available in our current Dovecot 2.3.4 (Nov-2018) release, but is available from the Debian 11 Bullseye release which comes with Dovecot 2.3.13 (Jan-2021).


Additionally the Dovecot service docs do advise using $default_internal_user and $default_internal_group (both default to dovecot) for managing permissions like this for dovecot config.

Note that we use docker (5000) for this though, so the default dovecot may not actually be correct with our setup 🤷‍♂️

@georglauterbach georglauterbach changed the title Added service stats explictly to Dovecot master.cf Disable service stats for Dovecot explictly Nov 13, 2021
polarathene
polarathene previously approved these changes Nov 13, 2021
@polarathene
Copy link
Copy Markdown
Member

The test failure below is expected from the linked mailing list. We should see it pass once we merge the bullseye base image upgrade.

not ok 307 checking accounts: listmailuser (quotas enabled) in 913ms
# (from function `assert_output' in file test/test_helper/bats-assert/src/assert.bash, line 239,
#  in test file test/tests.bats, line 673)
#   `assert_output '* [email protected] ( 12K / ~ ) [0%]'' failed
# 
# -- output differs --
# expected (1 lines):
#   * [email protected] ( 12K / ~ ) [0%]
# actual (2 lines):
#   doveadm: Error: net_connect_unix() failed: Connection refused
#   * [email protected] ( 12K / ~ ) [0%]
# --
# 

@georglauterbach georglauterbach added stale-bot/ignore Indicates that this issue / PR shall not be closed by our stale-checking CI meta/feature freeze On hold due to upcoming release process labels Nov 15, 2021
Comment thread target/dovecot/60-stats.conf Outdated
@georglauterbach georglauterbach removed the meta/feature freeze On hold due to upcoming release process label Nov 29, 2021
@polarathene
Copy link
Copy Markdown
Member

FWIW, I'm working through my backlog of tabs and it seems I missed adding a link/mention here about the Dovecot 2.3 release notes, at the end it says:

Due to the new stats environment, for now some environments may get harmless errors about not being able to connect to stats-writer socket.

To avoid these errors, give enough permissions for the processes to connect to the stats-writer, for example:

service stats {
  client_limit = 10000 # make this large enough so all Dovecot processes (especially imap, pop3, lmtp) can connect to it
  unix_listener stats-writer {
    user = vmail
    #mode = 0666 # Use only if nothing else works. It's a bit insecure, since it allows any user in the system to mess up with the > statistics.
  }
}

If disabling it fails to work or someone wants it enabled again, that should resolve the issue too. Which is roughly what your original solution was (yours was more correct and thorough of course).

@georglauterbach georglauterbach changed the title Disable service stats for Dovecot explictly Disable service stats for Dovecot explicitly Dec 10, 2021
@georglauterbach
Copy link
Copy Markdown
Member Author

@polarathene you were right, and tests pass for now - nice :D

@georglauterbach georglauterbach removed the stale-bot/ignore Indicates that this issue / PR shall not be closed by our stale-checking CI label Dec 12, 2021
@georglauterbach georglauterbach merged commit a4095a7 into master Dec 12, 2021
@georglauterbach georglauterbach deleted the dovecot-stats branch December 12, 2021 11:08
@polarathene polarathene modified the milestones: v11.0.0, v10.4.0 Dec 18, 2021
@ggiammat
Copy link
Copy Markdown

Hi all, I'm on 11.1.0 and I'm trying to enable the Open Metrics exporter in Dovecot as exaplined here: https://doc.dovecot.org/configuration_manual/stats/openmetrics/.

I followed the instructions in that page, however when I connected to http://localhost:9900/metrics all metrics were always "0". After several trials with different configurations, I found that reverting some of the changes made in this PR actually solves the issue and metrics are exported correctly.

In particualr I had to explicitly enable the sockets for reading/writing the stats:

service stats {
  inet_listener http {
    port = 9900
  }
  unix_listener stats-reader {
    group =
    mode = 0600
    user =
  }
  unix_listener stats-writer {
    group = $default_internal_group
    mode = 0660
    user =
  }
}

and also I had to remove this last line from the /etc/dovecot/dovecot.conf file:

stats_writer_socket_path=

I'm a bit confused because, if I understood correctly, this PR was done to remove the old-stats, but from what I see it also have effect on the new stats and the exporting of metrics (unless I'm wrong and there is another configuration to get metrics exported correctly without reverting the changes in this PR).

@polarathene
Copy link
Copy Markdown
Member

AFAIK, the old stats disabling handled by this PR was reverted in a follow-up PR 2336 and is explained why there.

The new stats were disabled to avoid the error log noted at the start of this PR I think. If you go through my responses here, I do seem to suggest that it may have been fixable via adjusting permissions correctly, followed by this reference / snippet which may have been the only required changes (instead of the mode = 0 this PR introduced), in addition to stats_writer_socket_path= which was noted as a way to disable the writer.

Also from a comment of mine that I hid:

I am referring to the stats-writer warning log in the PR description. Something during auth attempted to update stats but could not due to incorrect permissions.

But without a reproduction it cannot be verified to be a fix 😅

Presumably Dovecot is attempting to interact with the stats-writer socket but as default ownership is root:root, it's unable to. With root:dovecot 0660 it would be able to write as part of the authorized groups. This also means it isn't likely to work with reading stats either (even with this PR)

I don't think we ever had official support for stats with the project, so the intention was to make those opt-in instead for users that specifically needed it.

If there was enough interest expressed for such we could add some toggle support, otherwise our usual advice is to defer to documenting for others how to leverage user-patches.sh to make the change.


If you revert the changes from this PR and can reproduce the error output that motivated the changes, but resolve it in a more compatible way that would probably get approved and merged 👍

It also looks like the issue may have been related to an earlier release of Dovecot we had, and if that no longer needs the workaround, then it could be dropped. It didn't seem like it was clear what was causing the error to reproduce it unfortunately 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/configuration (file) kind/improvement Improve an existing feature, configuration file or the documentation priority/low service/dovecot

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants