Conversation
…es to configs and prevention of unnecessary service restarts
I haven't looked over this PR update, but I'm chiming in to say I still feel strongly about what I said in the previous PR back in Nov 2021. I've mentioned it a few times but had no feedback. I was hoping to have started working on that by now, but lacked resources; that should be resolved sometime this month thankfully, and I can start contributing again. Did you want to discuss the alternate route I think we should pursue, or get your approach in first for the time being?
Do you mean
Regarding watchexec: if I understand correctly, we would also need a workaround for at least NFS (inotify-based watchers generally don't see changes made by remote NFS clients, so polling would be needed there). I think we should aim for a one-for-all solution, if possible.
What is wrong with NFS, and how does it relate to waiting for changes to settle? I'd like to keep this PR to only the essentials if possible. These massive PRs (which I'm responsible for) are painful for us all.
I've updated my post above; I hope this makes it clearer 😉 There is nothing wrong with this PR from my point of view.
It additionally has a debounce setting, so it won't run any script if an event triggers again within a set duration (it must be idle for X time after the last event before the script runs), and IIRC a throttle to ensure it won't debounce forever, running at least once per Y time.
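For illustration, usage would be along these lines — a sketch only, since flag units and syntax vary between watchexec versions, and the handler command here is hypothetical:

```bash
# Re-run a handler when watched config files change, but only once events
# have been idle for the debounce window. --debounce and --watch are
# documented watchexec flags; units (ms vs durations) vary by version.
watchexec --debounce 5000 \
  --watch /tmp/docker-mailserver \
  -- /usr/local/bin/handle-config-change.sh  # hypothetical handler script
```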
Yes, the developer is quite friendly/responsive; chances are that if a problem needed to be investigated/addressed, it'd be done sooner on their end than by us in bash, given how long some bug fixes/features can take us volunteer maintainers to complete. Regardless, if we take your approach first for now, you'd have a fallback option to roll back to and maintain, though I don't think that'd be necessary in practice.
It might not be that hard to add to the existing bash support, but overall we could reduce a fair amount of bash to maintain, which I consider a decent reason in this case, since we've already had bash introduce breakage that wasn't easy to identify and resolve quickly (a memory leak triggered by a recent base image update).

We've already had other maintainers (myself included) find this area a bit tricky/confusing to understand in the past (thankfully a fair amount of refactoring by you and me has greatly improved that, at least). I don't know about you, but my memory of such inner workings/layers of a codebase gets foggy over time, and I know I won't be as active a maintainer after this year, so someone else needs to retain/accumulate that knowledge. As time goes by, digging through git blame (especially with big changes and logic being split into separate files or renamed/relocated) and trying to learn from past contributors/PRs can be frustrating; I did that extensively for this area, especially the file-locking logic. IMO, simplifying this sort of thing as much as possible so that it's easy to grok and maintain is more important than it being handled in pure bash.

This PR is probably fine for now, but in my opinion it only further complicates the logic/control-flow. Not sure if you're aware, but related tests updated by myself and @georglauterbach (and perhaps your own experience) have had failures due to the sleep approach the current implementation requires.

TL;DR: A bash solution may be simple, but that doesn't always make it appropriate to prefer bash for its own sake. Personally, I'm not against reviewing and merging this PR in the meantime, but I think there's a strong case for watchexec.
polarathene
left a comment
LGTM 👍 Just a few minor changes to consider.
8a71d5c to 9ee0290
Seeing some flakiness in tests. I'm going to run them a few more times to see what fails and how often.
As always, a good summary @polarathene 👍
This sounds like a jack of all trades 😄 If it works reliably and the code base can be considerably reduced, I think it's worth evaluating this solution. But until then, there is no reason not to improve the current script with this PR.
Flaky test:
Difficult to say, but I have never experienced this test being flaky with other PRs or the current code base. How certain can we be that this test really is flaky and it's not the PR's fault?
From the shared log, it's possible that the 10-second timeout is too short due to the extra sleeps and slower CI hardware versus local testing (I recall this being a gotcha for other maintainers in the past). If the test is consistently passing locally, the cause is likely the CI environment. There were many more failures in the earlier failed commit, likely all for this reason. It may be a good reason for caution with the eventual merge if this will ripple through to other PRs. For tests whose timeouts are too short, we could increase them (presumably safe), but others may be failing due to assumptions that delays before service availability wouldn't be this long, without any explicit timing? 🤷‍♂️
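For context, those timeout values feed into a retry helper of roughly this shape — the name `run_until_success_or_timeout` matches the test suite, but this body is a simplified sketch, not the project's actual helper:

```bash
# Simplified sketch of a BATS-style retry helper: re-run the command
# every second until it succeeds or TIMEOUT seconds have elapsed.
function run_until_success_or_timeout() {
  local TIMEOUT=${1}
  shift

  local ELAPSED=0
  until "${@}"; do
    if (( ELAPSED >= TIMEOUT )); then
      echo "Timed out after ${TIMEOUT}s waiting for: ${*}" >&2
      return 1
    fi
    sleep 1
    (( ELAPSED += 1 ))
  done
}
```

With a helper like this, slower CI hardware simply consumes more of the retry budget, which is why a test that passes locally can exhaust a 10-second timeout in CI.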
We merged a lot of PRs in the last few days and this test was never flaky, IIRC. PS: This test also fails (reliably) when I run the tests locally.
Thanks, give me a bit to dig into this.
…3.x in /bin/bash, and brew installs 5.x into /usr/local/bin
…-changes wait for settle
  assert_output ''
  run ./setup.sh -c mail_with_relays alias add [email protected] [email protected]
- run_until_success_or_timeout 10 docker exec mail_with_relays grep -e domain2.tld /etc/postfix/relayhost_map
+ run_until_success_or_timeout 11 docker exec mail_with_relays grep -e domain2.tld /etc/postfix/relayhost_map
Turns out adding aliases takes 1 second longer than anything else :)
I see, you keep adding more and more Mac-specific workarounds. This will be a cat-and-mouse game forever. Every PR is only tested against Linux, so it's just a matter of time until something breaks on macOS again (unnoticed, until you stumble upon it).
38c855a to 077483f
I agree. We should pull all custom macOS changes from the code and update the contributor guide to explain that macOS users need to use a Linux VM (I can do that in a different PR). What do @georglauterbach and @polarathene think?
I very much agree that this should be the way to go :)
I agree as well. I personally just spin up a VM to run tests.
This pull request has become stale because it has been open for 20 days without activity.
Not stale! I just moved, so I haven't had time for this.
If you want to, you can add the
PS: Also sorry for the upcoming merge conflicts due to #2498 🙃🙈
This PR has now been open for quite some time without recent contributions, and we have quite a lot of merge conflicts here. I'd advocate for closing this and, if someone wants to try again in the future, opening a new PR. Any objections from @docker-mailserver/maintainers?
I intend to tackle this within a week once the
As part of a cleanup series for v13.0.0, I will close this as part of a TODO list (see #3289). It may be a personal thing, but I like the PR list to be clean and up to date with non-stale PRs. I hope this does not offend anyone, but maintaining DMS is easier for me when all open PRs are relevant (or non-stale WIP). Maintaining the PR and issue lists is a lot of work, especially for me, so I'd like to keep them in a way that is comfortable for me. This PR is just too old for my taste; that does not mean you cannot incorporate these changes, though! Please (re-)open when you're working on this again.
Description
Original PR this was split from: #2157
Goal: Have check-for-changes.sh wait for changes to "settle" (checksum comparison has to not have changed over two different checks with 5-second gaps between) before modifying and restarting anything.
Why: The eventual API (docker-mailserver/docker-mailserver-admin#3) will allow users to make frequent changes to the config and we don't want them triggering service restarts until those requests/changes settle down.
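As a rough illustration of that settle logic — a minimal sketch only, where the paths, the 2-second poll interval, and the `apply_changes_and_restart_services` handler are assumptions rather than the actual check-for-changes.sh code:

```bash
CHECKSUM_FILE=/tmp/docker-mailserver-config-chksum  # hypothetical location

function compute_checksum() {
  # The real script hashes a specific list of monitored config files.
  sha512sum /tmp/docker-mailserver/*.cf 2>/dev/null
}

while true; do
  NEW=$(compute_checksum)
  if [[ ${NEW} != "$(cat "${CHECKSUM_FILE}" 2>/dev/null)" ]]; then
    # A change was detected - wait until two checks taken 5 seconds
    # apart return the same checksum before acting on it:
    OLD=''
    while [[ ${NEW} != "${OLD}" ]]; do
      OLD=${NEW}
      sleep 5
      NEW=$(compute_checksum)
    done
    echo "${NEW}" >"${CHECKSUM_FILE}"
    # apply_changes_and_restart_services  # hypothetical handler
  fi
  sleep 2
done
```

The key property is that frequent edits (e.g. from the eventual API) keep resetting the inner loop, so services are only modified/restarted once the config has been stable across two consecutive checks.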
Type of change
Checklist: