Rspamd: more features by georglauterbach · Pull Request #3159 · docker-mailserver/docker-mailserver

georglauterbach · 2023-03-05T21:22:12Z

Description

Add new features to our Rspamd integration:

With MOVE_SPAM_TO_JUNK=1, we will now move messages marked with the X-Spam: Yes header to the "Junk" folder
With RSPAMD_LEARN=1, we enable
1. auto-learning for the Bayes filter
2. when a user moves messages from or to the "Junk" folder, Rspamd will learn them as ham or spam (for the Bayes filter as well)

I tried to separate commits as cleanly as possible to make reviews easier.

Closes #3145
Closes #3156

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (README.md or the documentation under docs/)
If necessary I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

georglauterbach · 2023-03-05T21:28:13Z

@hanscees @alexanderadam you could really help me here: do you know how to trigger Sieve locally by moving a mail into the "Junk" folder with a command executed inside the container? I want to complete the last test for RSPAMD_LEARN.

alexanderadam · 2023-03-06T07:19:31Z

do you know how to trigger Sieve locally by moving a mail into the "Junk" folder with a command executed inside the container?

Sadly I don't 😬
But I would guess copying a SPAM mail into the new folder maybe?
Sorry, but I'm not very good with this email infrastructure thing.

hanscees · 2023-03-06T08:39:06Z

I have to think about that. Will come back to it this evening.
You basically have to log in as imap user. Then do a move from folder a to folder b.

georglauterbach · 2023-03-06T08:54:55Z

I have to think about that. Will come back to it this evening.

Thanks!

You basically have to log in as imap user. Then do a move from folder a to folder b.

Indeed. I don't know how to do that actually, I have never done that before.

georglauterbach · 2023-03-06T12:18:04Z

@polarathene @casperklein would you be fine if this PR got included into v12.0.0? I marked it for v12.1.0, but if you're okay with it, I'd take care of this ASAP so we can merge this for v12.0.0.

casperklein · 2023-03-06T17:48:47Z

@polarathene @casperklein would you be fine if this PR got included into v12.0.0? I marked it for v12.1.0, but if you're okay with it, I'd take care of this ASAP so we can merge this for v12.0.0.

Why shouldn't it be fine to include it in v12? I see no reason against it.

hanscees · 2023-03-06T19:24:17Z

I have to think about that. Will come back to it this evening.

Thanks!

You basically have to log in as imap user. Then do a move from folder a to folder b.

Indeed. I don't know how to do that actually, I have never done that before.

Hi,
I am not familiar with dockers testing, but I assume you can spin up an image and run commands in it.

A test that simulates an email copy from commandline to dovecot is possible, but wouldnt recommend it since its very complex. However, the notes to do copy's with imap are here: see https://www.atmail.com/blog/advanced-imap/
You would have to make a user. Send email in a folder and then move it. Then look in the logging of dovecot.
And you would need some sort of rather large expect script. Perhaps these do exists, don't know.

Perhaps more manageable is to run and parse the command doveconf and look for the plugin section and check a couple of things :

doveconf
usable things 
plugin {
  imapsieve_mailbox1_before = file:/etc/dovecot/sieve/learn-spam.sieve
  imapsieve_mailbox1_causes = COPY
  imapsieve_mailbox1_name = INBOX.Junk
  imapsieve_mailbox2_before = file:/etc/dovecot/sieve/learn-ham.sieve
  imapsieve_mailbox2_causes = COPY
  imapsieve_mailbox2_from = INBOX.Junk
  imapsieve_mailbox2_name = *
  sieve = ~/.dovecot.sieve
  sieve_after = /usr/lib/dovecot/sieve-global/after/
  sieve_before = /usr/lib/dovecot/sieve-global/before/
  sieve_dir = ~/sieve
  sieve_extensions = +notify +imapflags +vnd.dovecot.pipe +vnd.dovecot.filter
  sieve_filter_bin_dir = /usr/lib/dovecot/sieve-filter
  sieve_global_extensions = +vnd.dovecot.pipe
  sieve_pipe_bin_dir = /etc/dovecot/sieve
  sieve_plugins = sieve_imapsieve sieve_extprograms
  }

By the way a similar command for rspamd is rspamd configdump

Does this help a bit?

georglauterbach · 2023-03-06T22:29:45Z

I am not familiar with dockers testing, but I assume you can spin up an image and run commands in it.

That's what we do :)

the notes to do copy's with imap are here: see https://www.atmail.com/blog/advanced-imap/

Thanks for the link, will have a look.

You would have to make a user. Send email in a folder and then move it. Then look in the logging of dovecot.

The Rspamd test sends 4 emails actually: 1 normal, 1 Gtube rejection, one virus rejection, and one that Rspamds adds a header to.

The part that moves the mail does not trigger Sieve for me though.

And you would need some sort of rather large expect script. Perhaps these do exists, don't know.

Don't worry, our test suite (BATS) plus our helpers are rather comprehensive.

Perhaps more manageable is to run and parse the command doveconf and look for the plugin section and check a couple of things :

Test is already doing that.

Does this help a bit?

To will have a look at the link you provided. The rest is already all implemented :)

georglauterbach · 2023-03-06T22:32:20Z

Why shouldn't it be fine to include it in v12? I see no reason against it.

👍🏼

georglauterbach · 2023-03-06T22:33:17Z

@casperklein @polarathene you can already go ahead and review changes. The only additions I will try to implement is for the last test to be more comprehensive.

polarathene · 2023-03-06T22:43:33Z

I marked it for v12.1.0, but if you're okay with it, I'd take care of this ASAP so we can merge this for v12.0.0.

This is probably going to be low on my priority to get to. Or if @casperklein reviews before me and approves, that's fine to merge without waiting on my review input 👍

I am probably going to need a day or so before I could make time for it so by all means if that suits you go for it.

casperklein · 2023-03-06T22:56:48Z

Generally, "move_spam_to_junk" isn't a spamassassin/rspamd specific thing.

I think there is no need for a separate RSPAMD_MOVE_SPAM_TO_JUNK ENV / function.

docker-mailserver/target/scripts/startup/setup.d/dovecot.sh

Lines 77 to 85 in dab7070

    
           # sieve will move spams to .Junk folder when SPAMASSASSIN_SPAM_TO_INBOX=1 and MOVE_SPAM_TO_JUNK=1 
        
           if [[ ${SPAMASSASSIN_SPAM_TO_INBOX} -eq 1 ]] && [[ ${MOVE_SPAM_TO_JUNK} -eq 1 ]] 
        
           then 
        
             _log 'debug' 'Spam messages will be moved to the Junk folder' 
        
             cp /etc/dovecot/sieve/before/60-spam.sieve /usr/lib/dovecot/sieve-global/before/ 
        
             sievec /usr/lib/dovecot/sieve-global/before/60-spam.sieve 
        
           else 
        
             rm -f /usr/lib/dovecot/sieve-global/before/60-spam.sieve /usr/lib/dovecot/sieve-global/before/60-spam.svbin 
        
           fi

Just remove the ${SPAMASSASSIN_SPAM_TO_INBOX} -eq 1 condition and MOVE_SPAM_TO_JUNK can be used universally.

PS: Your (rspamd) sieve solution

require ["fileinto","mailbox"];
if header :contains "X-Spam" "Yes" {
    fileinto :create "Junk";
}

looks a bit more advanced, than what currently is used:

#target/dovecot/sieve/before/60-spam.sieve:

require "fileinto";
if header :contains "X-Spam-Flag" "YES" {
    fileinto "Junk";
}

I think this can be updated safely 👍

georglauterbach · 2023-03-07T13:51:54Z

@casperklein you are right! I went ahead and implemented your approach, then rewrote history to have cleanly separated commits again; suggestions are incorporated :)

I will force-push my changes now.

georglauterbach · 2023-03-08T18:17:18Z

FYI: I'd like to merge this at the very latest on Friday, 12 o'clock UTC+1 because thereafter I will be unavailable to merge/adjust for 2 weeks straight.

polarathene · 2023-03-09T22:17:40Z

Friday, 12 o'clock UTC+1

12 hours to go 😅

I'll try to review before then, but if another maintainer approves that's fine by me too for merging.

polarathene

Light review as short on time and Dovecot sieve + rspamd aren't anything I am too knowledgeable about.

I've left some notes, but if they're not a concern in your opinion that's fine, go ahead and merge 👍 Happy vacation! 😀

casperklein · 2023-03-10T10:14:22Z

FYI: I'd like to merge this at the very latest on Friday, 12 o'clock UTC+1 because thereafter I will be unavailable to merge/adjust for 2 weeks straight.

I wasn't able to review yesterday :-\ Tonight/tomorrow I can review.

georglauterbach · 2023-03-10T10:52:52Z

I wasn't able to review yesterday :-\ Tonight/tomorrow I can review.

I won't be able to properly respond to feedback then; I got about 3 hours left for that, as I said.

georglauterbach · 2023-03-10T14:50:30Z

I was able to feedback from @polarathene now; I hope you like the PR @casperklein :)

Happy vacation! 😀

Thank's a lot! :D

polarathene

I suppose we merge this after your feedback @casperklein as @georglauterbach is no longer able to respond for 2 weeks.

Feel free to merge after your review or make any changes 👍

polarathene · 2023-03-11T00:24:23Z

+    cat >/usr/lib/dovecot/sieve-global/after/spam_to_junk.sieve << EOF
+require ["fileinto","mailbox"];
+
+if anyof (header :contains "X-Spam-Flag" "YES",
+          header :contains "X-Spam" "Yes") {
+    fileinto "Junk";
+}
+EOF


Honestly not a fan of EOF like this vs separate files, but I've said this before with cron jobs and looked the other way in the PR where this was also done 😅

Separate config files have better observability in my opinion.

Far more visible in the file tree as distinct content from scripts. Also beneficial for tasks like grep on content that isn't .sh when searching / filtering the entire project for something, less noise to wade through.

git blame far more useful as they don't tend to go through as much churn as scripts have been (each year it seems they get refactored heavily).

I don't think there's much gain from inlining content like this, but I know it's preferred by you and casper 😅

I know you're not a fan, but from my perspective, the gains (reduced complexity in the file tree, Dockerfile and removal of the uncanny rm in the else branch) outweigh the disadvantages in this case for such a small file :)

With larger files, I agree with you!

I don't think there's much gain from inlining content like this, but I know it's preferred by you and casper 😅

It depends on 😆 For static content, I generally prefer plain files. For small snippets (with variables), a heredoc may better suited.
If there is consent, we can in a separate PR move some (not only rspam related) heredocs to files.

If there is consent, we can in a separate PR move some (not only rspam related) heredocs to files.

I can live with it 😅

There's some merit for the co-location to that extent perhaps, but those less familiar where everything is and wanting to contribute or troubleshoot something may first expect files in the container that we add to exist as files in our project, rather than embedded into scripts 🤷‍♂️ (I know this was a bit of a pain-point for me with the cron jobs, which I think existed in several locations in the container, some from packages beyond what DMS added)

It probably won't make much of a difference to getting new maintainers onboard though, so if it's easier for those actively maintaining the project to leverage heredocs, go for it 😎

georglauterbach · 2023-03-11T08:51:25Z

I suppose we merge this after your feedback @casperklein as @georglauterbach is no longer able to respond for 2 weeks.

Very quick feedback I can give, will be on GH for 1 minute a day :) Flight hasn't yet started :D I just can't edit or adjust anything

Feel free to merge after your review or make any changes 👍

Yes - go ahead! :)

georglauterbach · 2023-03-14T12:42:34Z

FYI: I'd like to merge this at the very latest on Friday, 12 o'clock UTC+1 because thereafter I will be unavailable to merge/adjust for 2 weeks straight.

I wasn't able to review yesterday :-\ Tonight/tomorrow I can review.

ping :)

github-actions · 2023-03-14T12:42:53Z

Documentation preview for this PR is ready! 🎉

Built with commit: 3ab664b

casperklein · 2023-03-15T18:49:08Z

I wasn't able to review yesterday :-\ Tonight/tomorrow I can review.

ping :)

Enjoy your vacation and don't hang around at github 😝
Sorry for the delay again, I am currently out of order.. 🤒

casperklein

Finally 🎉

casperklein · 2023-03-18T13:07:50Z

+    cat >/usr/lib/dovecot/sieve-global/after/spam_to_junk.sieve << EOF
+require ["fileinto","mailbox"];
+
+if anyof (header :contains "X-Spam-Flag" "YES",
+          header :contains "X-Spam" "Yes") {
+    fileinto "Junk";
+}
+EOF


I don't think there's much gain from inlining content like this, but I know it's preferred by you and casper 😅

It depends on 😆 For static content, I generally prefer plain files. For small snippets (with variables), a heredoc may better suited.
If there is consent, we can in a separate PR move some (not only rspam related) heredocs to files.

HeySora · 2023-05-01T14:38:18Z

@georglauterbach Commit b918de1 introduced a breaking change:

The sieve rules are now global "after" rules; this makes more sense than "before" rules..

Having the rules as "before" rules allowed users to manually "whitelist" a sender whose e-mails could be wrongly flagged as spam (with a fileinto to the inbox), since the user rules were parsed after.
With this change, this is not possible anymore, as this sieve rule will now override any user filter, thus making user whitelists impossible.

May I ask what was the reasoning behind this change? I'm asking because I might be missing the point. Thanks in advance!

georglauterbach · 2023-05-02T08:43:31Z

May I ask what was the reasoning behind this change? I'm asking because I might be missing the point. Thanks in advance!

Truth be told, I'm just not a Sieve expert 😅 I didn't know that this could be breaking.

Having the rules as "before" rules allowed users to manually "whitelist" a sender whose e-mails could be wrongly flagged as spam (with a fileinto to the inbox), since the user rules were parsed after.

With this change, this is not possible anymore, as this sieve rule will now override any user filter, thus making user whitelists impossible.

So if I understand you correctly, it's not really about the before/after, but about the global/user order? I will look into this again; if you could help me a bit and provide information about your setup so I can see where your configuration files are, that'd help a lot! Thanks :)

HeySora · 2023-05-02T09:20:06Z

Truth be told, I'm just not a Sieve expert 😅 I didn't know that this could be breaking.

It's alright 😄 There are so many things that can go wrong, and collaborating helps figuring all of this out!

So if I understand you correctly, it's not really about the before/after, but about the global/user order?

This docker image exposes ManageSieve on the port 4190, which is used to let each user set up their own filters.
The order of Sieve filters are: global "before", then user, then global "after".

I manage a mailserver where a few users have genuine e-mails getting wrongly caught by SpamAssassin. As a result, some users have custom rules where if an e-mail comes from a trusted e-mail address, to fileinto "INBOX"; so it doesn't get to spam.
As far as I'm aware, this is the only way users can "whitelist" specific e-mail senders without the help of the administrator.

I'm not a Sieve expert either, but from my understanding, the last fileinto directive is the one that takes predecence over all the others. Your commit moved the fileinto "Junk"; directive into sieve-global/after/, which will override any user filter, thus preventing people from doing that "whitelisting" again.

User filters are located in docker-data/dms/mail-data/DOMAIN/USERNAME/sieve/*.sieve, and are documented here, on the official documentation of this project.

Hopefully that makes sense!

In regards to my specific setup, I'm providing SnappyMail (active fork of Rainloop) as our webmail, which lets users interface with ManageSieve and make their own filters.

One user in particular has filters like this in place:

if header :contains ["From"] "twitch"
{
    fileinto "INBOX.Twitch";
}

if header :contains ["From"] "bandcamp"
{
    fileinto "INBOX.Bandcamp";
}

However, with your global "after" filter, spam mails would still be filed into the "Junk" folder, with no way of overriding it.

georglauterbach · 2023-05-02T09:29:22Z

I genuinly didn't think the last fileinto directive is the one that is used... now all makes sense. I can try to fix this up and provide a PR.

Thanks a lot for the explanation!

HeySora · 2023-05-02T09:45:27Z

.....Actually I think I just got hit with the syndrome of "asking for help / telling someone something, to then find an answer by yourself" 😅 Apologies!

It might be worth trying to keep the Junk filter as an "after" filter just like you did, and attempt using stop; from user filters to see if that would work? That might actually be the right call here, instead of what the particular user I mentioned was doing.

I'm just not sure if a stop; only halts the execution of the current (here, user) script, or if it halts everything, including "after" filters.

I'll go test this ASAP

georglauterbach · 2023-05-02T09:50:41Z

Haha, I know this "syndrom" very well :D

What I found about your question: https://dovecot.org/list/dovecot/2018-March/111384.html so it seems to be possible.

I am uncertain which route to go down. Your idea using stop; sounds reasonable to me though.

I will adjust the release notes though to make users aware of this unintended change. It is, after all, somehow breaking...

georglauterbach · 2023-05-02T09:57:57Z

Please reach back to me whether you think the stop; is the right approach or whether we should return to the old way. I do prefer the stop; (disabling implicit keep) though, as it seems to be the cleaner solution to me.

HeySora · 2023-05-11T17:26:56Z

Please reach back to me whether you think the stop; is the right approach or whether we should return to the old way. I do prefer the stop; (disabling implicit keep) though, as it seems to be the cleaner solution to me.

Hi @georglauterbach, I can now confirm after a while that this has been working flawlessly! 😄

georglauterbach added kind/new feature A new feature is requested in this issue or implemeted with this PR service/security/rspamd labels Mar 5, 2023

georglauterbach added this to the v12.1.0 milestone Mar 5, 2023

georglauterbach requested review from casperklein and polarathene March 5, 2023 21:22

georglauterbach self-assigned this Mar 5, 2023

georglauterbach added the meta/feature freeze On hold due to upcoming release process label Mar 5, 2023

georglauterbach marked this pull request as draft March 5, 2023 21:25

georglauterbach force-pushed the rspamd/more-features branch 2 times, most recently from a3767e8 to 7a89471 Compare March 6, 2023 13:08

georglauterbach mentioned this pull request Mar 6, 2023

[BUG] OpenDMARC rejecting specific messages without obvious reason #3130

Closed

casperklein reviewed Mar 6, 2023

View reviewed changes

Comment thread target/scripts/startup/setup.d/security/rspamd.sh Outdated

Comment thread target/scripts/startup/setup.d/security/rspamd.sh Outdated

georglauterbach modified the milestones: v12.1.0, v12.0.0 Mar 6, 2023

georglauterbach removed the meta/feature freeze On hold due to upcoming release process label Mar 6, 2023

georglauterbach marked this pull request as ready for review March 6, 2023 22:33

georglauterbach force-pushed the rspamd/more-features branch from 7a89471 to 9a84ded Compare March 7, 2023 13:52

georglauterbach force-pushed the rspamd/more-features branch from 9a84ded to cbd43c5 Compare March 7, 2023 14:02

georglauterbach mentioned this pull request Mar 7, 2023

[BUG] SA_KILL does not delete spam #3165

Closed

10 tasks

polarathene previously approved these changes Mar 10, 2023

View reviewed changes

georglauterbach added 3 commits March 10, 2023 15:46

simplify file creation / removal

e76ad60

apply PR feedback

e27127a

Merge remote-tracking branch 'origin/master' into rspamd/more-features

3ca2757

georglauterbach dismissed polarathene’s stale review via 3ca2757 March 10, 2023 14:47

polarathene approved these changes Mar 11, 2023

View reviewed changes

Merge branch 'master' into rspamd/more-features

3ab664b

casperklein approved these changes Mar 18, 2023

View reviewed changes

georglauterbach merged commit e58dd1b into master Mar 18, 2023

georglauterbach deleted the rspamd/more-features branch March 18, 2023 15:32

georglauterbach mentioned this pull request Mar 28, 2023

Rspamd: add greylisting option & code refactoring #3206

Merged

9 tasks

Uh oh!

Conversation

georglauterbach commented Mar 5, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Checklist:

Uh oh!

georglauterbach commented Mar 5, 2023

Uh oh!

alexanderadam commented Mar 6, 2023

Uh oh!

hanscees commented Mar 6, 2023

Uh oh!

georglauterbach commented Mar 6, 2023

Uh oh!

georglauterbach commented Mar 6, 2023

Uh oh!

casperklein commented Mar 6, 2023

Uh oh!

Uh oh!

Uh oh!

hanscees commented Mar 6, 2023

Uh oh!

georglauterbach commented Mar 6, 2023

Uh oh!

georglauterbach commented Mar 6, 2023

Uh oh!

georglauterbach commented Mar 6, 2023

Uh oh!

polarathene commented Mar 6, 2023

Uh oh!

casperklein commented Mar 6, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

georglauterbach commented Mar 7, 2023

Uh oh!

georglauterbach commented Mar 8, 2023

Uh oh!

polarathene commented Mar 9, 2023

Uh oh!

polarathene left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

casperklein commented Mar 10, 2023

Uh oh!

georglauterbach commented Mar 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

georglauterbach commented Mar 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

polarathene left a comment

Choose a reason for hiding this comment

Uh oh!

polarathene Mar 11, 2023

Choose a reason for hiding this comment

Uh oh!

georglauterbach Mar 11, 2023

Choose a reason for hiding this comment

Uh oh!

casperklein Mar 18, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

georglauterbach Mar 18, 2023

Choose a reason for hiding this comment

Uh oh!

polarathene Mar 20, 2023

Choose a reason for hiding this comment

Uh oh!

georglauterbach commented Mar 11, 2023

Uh oh!

georglauterbach commented Mar 14, 2023

Uh oh!

georglauterbach commented Mar 5, 2023 •

edited

Loading

casperklein commented Mar 6, 2023 •

edited

Loading

georglauterbach commented Mar 10, 2023 •

edited

Loading

georglauterbach commented Mar 10, 2023 •

edited

Loading

casperklein Mar 18, 2023 •

edited

Loading

casperklein Mar 18, 2023 •

edited

Loading

HeySora commented May 2, 2023 •

edited

Loading

HeySora commented May 2, 2023 •

edited

Loading

georglauterbach commented May 2, 2023 •

edited

Loading