[BUG] `ENABLE_SRS=1` causing high CPU usage with `postsrsd`

### Miscellaneous first checks

- [X] I checked that all ports are open and not blocked by my ISP / hosting provider.
- [X] I know that SSL errors are likely the result of a wrong setup on the user side and not caused by DMS itself. I'm confident my setup is correct.

### Affected Component(s)

postsrsd

### What happened and when does this occur?

Another user [reported to upstream `postsrsd`](https://github.com/roehling/postsrsd/issues/122) experiencing this same issue earlier in the year (_specifically with DMS_).

I noticed this while working on a PR and debugging local test failures. Why the tests do not fail in our CI is unclear (_might be due to older environment/packages in CI?_):

```
 ✗ checking configuration: hostname/domainname override: check headers of received mail [235]
   (from function `assert_success' in file test/test_helper/bats-assert/src/assert.bash, line 114,
    in test file test/mail_hostname.bats, line 104)
     `assert_success' failed
   
   -- command failed --
   status : 1
   output : ls: cannot access '/var/mail/localhost.localdomain/user1/new': No such file or directory
   --
```

```
 ✗ checking configuration: non-subdomain: check headers of received mail [308]
   (from function `assert_success' in file test/test_helper/bats-assert/src/assert.bash, line 114,
    in test file test/mail_hostname.bats, line 163)
     `assert_success' failed
   
   -- command failed --
   status : 1
   output : ls: cannot access '/var/mail/localhost.localdomain/user1/new': No such file or directory
   --
```

I had a process monitor active in the background and noticed the four containers spun up in that test each used 25% CPU for the containers `postsrsd` process.

When starting a single container in my VM (2vCPU) `postsrsd` was consistently showing 50% CPU usage, despite no mail to process.

The ports `10001` and `10002` are not bound to localhost:

```console
$ ss -tulpn | grep LISTEN

tcp   LISTEN 0      4096       127.0.0.1:10024      0.0.0.0:*                                     
tcp   LISTEN 0      100        127.0.0.1:10025      0.0.0.0:*                                     
tcp   LISTEN 0      100          0.0.0.0:587        0.0.0.0:*                                     
tcp   LISTEN 0      100          0.0.0.0:143        0.0.0.0:*    users:(("dovecot",pid=642,fd=32))
tcp   LISTEN 0      100          0.0.0.0:465        0.0.0.0:*                                     
tcp   LISTEN 0      100          0.0.0.0:65265      0.0.0.0:*    users:(("dovecot",pid=642,fd=54))
tcp   LISTEN 0      100          0.0.0.0:25         0.0.0.0:*                                     
tcp   LISTEN 0      4096       127.0.0.1:8891       0.0.0.0:*                                     
tcp   LISTEN 0      4096       127.0.0.1:8893       0.0.0.0:*                                     
tcp   LISTEN 0      100             [::]:587           [::]:*                                     
tcp   LISTEN 0      100             [::]:143           [::]:*    users:(("dovecot",pid=642,fd=33))
tcp   LISTEN 0      100             [::]:465           [::]:*                                     
tcp   LISTEN 0      100             [::]:65265         [::]:*    users:(("dovecot",pid=642,fd=55))
tcp   LISTEN 0      100             [::]:25            [::]:*
```

`postsrsd` process is running:

```console
$ ps aux | grep postsrsd

root         591  0.0  0.0   4864  3872 pts/0    S    02:37   0:00 /bin/bash /usr/local/bin/postsrsd-wrapper.sh
root         623  0.0  0.0   2420  1592 pts/0    S    02:37   0:00 /bin/sh /etc/init.d/postsrsd start
root         774 98.2  0.0   2332   708 pts/0    R    02:37   3:01 /usr/sbin/postsrsd -f 10001 -r 10002 -d example.test -s /etc/postsrsd.secret -a = -n 4 -N 4 -u postsrsd -p /var/run/postsrsd.pid -c /var/lib/postsrsd -l 127.0.0.1 -D -X
```

However after roughly 10 minutes running like this, `postsrsd` heavy CPU activity ceases and the ports do show as bound. It's unclear why it takes this long.

### What did you expect to happen?

`postsrsd` should not be consuming so much CPU activity, nor taking about 10 minutes to finish starting up. Possibly stuck in [this upstream _while loop_](https://github.com/roehling/postsrsd/blob/6e701fa51f26bb344bc0230cdfb13ae1e14afb8d/postsrsd.c#L645)?

### How do we replicate the issue?

1. Next step uses the config dir of the failed test container, but it should reproduce with a minimal config containing `postfix-accounts.cf` from the `make generate-accounts` command.
2. Run:
   ```bash
   docker run --rm -it \
     --volume "${PWD}/test/duplicate_configs/mail_override_hostname:/tmp/docker-mailserver" \
     --volume "${PWD}/test/test-files:/tmp/docker-mailserver-test:ro" \
     -e ENABLE_SRS=1 \
     -h 'example.com' \
     --name 'postsrsd_cpu' \
     "mailserver/docker-mailserver:edge"
   ```
   
2. Monitor the `postsrsd` process, it should show high CPU activity (_can be observed on the host, does not require running anything within the container_).
3. Run `docker exec -it postsrsd_cpu bash` in a separate terminal, and then try send a mail `sendmail -f 'user@external.tld' 'user1@localhost.localdomain' < /tmp/docker-mailserver-test/email-templates/existing-user1.txt`
4. DMS will respond to it but fail as it can't reach the PostSRSD service on `localhost:10001`, as the service is running (`ps aux | grep postsrsd`) but ports are not yet bound (`ss -tulpn | grep LISTEN`), and our logs appear to be useless (_`cat /var/log/supervisor/postsrsd.log` only has a line about the service being started, nothing added once it quiets down 10 mins later_).
5. Monitor the `postsrsd` process, it should cease the heavy workload after a while, and ports for the service finally bound. Mail can then be sent to the local user account as before and actually be received (`ls /var/mail/localhost.localdomain/user1`).

### DMS version

v11.1

### What operating system is DMS running on?

Linux

### What instruction set architecture is DMS running on?

x86_64 / AMD64

### What container orchestration tool are you using?

Docker

### Docker CLI command

```bash
docker run --rm -it \
  --volume "${PWD}/test/duplicate_configs/mail_override_hostname:/tmp/docker-mailserver" \
  --volume "${PWD}/test/test-files:/tmp/docker-mailserver-test:ro" \
  -e ENABLE_SRS=1 \
  -h 'example.com' \
  --name 'postsrsd_cpu' \
  'mailserver/docker-mailserver:edge'
```

### Relevant log output

Nothing relevant was logged, other than the failure to handle the mail due to the service not yet being up.

### Other relevant information

I suspect the environment may have something to do with it.

Environment is:

- VMWare Workstation Player 16.3 on a Linux 5.17 x86_64 host.
- VM guest is Fedora 36 (5.18.9 kernel), Docker version is 20.10.17.
- CPU (2vCPU) under 50% load reported in the guest OS, 4GB RAM + 4GB swap (ZRAM), 2GB of each free.

Network if it's something going wrong with the binding step:

```console
$ ip addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

1511: eth0@if1512: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.3/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
```

### Code of conduct

- [X] I have read this project's [Code of Conduct](https://github.com/docker-mailserver/docker-mailserver/blob/master/CODE_OF_CONDUCT.md) and I agree
- [X] I have read the [README](https://github.com/docker-mailserver/docker-mailserver/blob/master/README.md) and the [documentation](https://docker-mailserver.github.io/docker-mailserver/edge/) and I searched the [issue tracker](https://github.com/docker-mailserver/docker-mailserver/issues?q=is%3Aissue) but could not find a solution

### Improvements to this form?

It's eating up markdown syntax when some inputs are wrapped in codefences upon submission, while others are assuming formats such as `yml` for `docker-compose.yml` config sharing, despite saying you can also provide a docker CLI command instead.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[BUG] `ENABLE_SRS=1` causing high CPU usage with `postsrsd` #2722

Miscellaneous first checks

Affected Component(s)

What happened and when does this occur?

What did you expect to happen?

How do we replicate the issue?

DMS version

What operating system is DMS running on?

What instruction set architecture is DMS running on?

What container orchestration tool are you using?

Docker CLI command

Relevant log output

Other relevant information

Code of conduct

Improvements to this form?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[BUG] ENABLE_SRS=1 causing high CPU usage with postsrsd #2722

Description

Miscellaneous first checks

Affected Component(s)

What happened and when does this occur?

What did you expect to happen?

How do we replicate the issue?

DMS version

What operating system is DMS running on?

What instruction set architecture is DMS running on?

What container orchestration tool are you using?

Docker CLI command

Relevant log output

Other relevant information

Code of conduct

Improvements to this form?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

[BUG] `ENABLE_SRS=1` causing high CPU usage with `postsrsd` #2722