clickhouse-test: send TERM to all childs (to avoid hung check triggering)#23750
clickhouse-test: send TERM to all childs (to avoid hung check triggering)#23750alexey-milovidov merged 1 commit intoClickHouse:masterfrom
Conversation
tests/clickhouse-test
Outdated
There was a problem hiding this comment.
I tried to do this earlier with a separate process group + atexit callbacks:
573983d#diff-3a359de18cacf146f406a7ae332fb47196aa5e0aa430eb4b157a202a3cb8e6e3R578
But that commit was later reverted because it also tried to switch to multithreading instead of multiprocessing, and that didn't go good. SIG_IGN and SIG_DFL were broken then (https://bugs.python.org/issue23395), now they are fixed but not quite but maybe it's not relevant for us.
There was a problem hiding this comment.
I tried to do this earlier with a separate process group + atexit callbacks:
573983d#diff-3a359de18cacf146f406a7ae332fb47196aa5e0aa430eb4b157a202a3cb8e6e3R578
Thanks for the reference!
SIG_IGN and SIG_DFL were broken then (https://bugs.python.org/issue23395), now they are fixed but not quite but maybe it's not relevant for us.
I didn't see any issues while testing locally, and I think that it is better to kill tests and get some (even non-standard since first part of the fix is to identify the error) error instead of leaving tests in background, since hung check will find them and report, thoughts?
…ing) This is another try of not leaving child processes in clickhouse-test, first one was in [1] by @akuzm: "I tried to do this earlier with a separate process group + atexit callbacks: ClickHouse@573983d#diff-3a359de18cacf146f406a7ae332fb47196aa5e0aa430eb4b157a202a3cb8e6e3R578 But that commit was later reverted because it also tried to switch to multithreading instead of multiprocessing, and that didn't go good. SIG_IGN and SIG_DFL were broken then (https://bugs.python.org/issue23395), now they are fixed but not quite but maybe it's not relevant for us." I looked (only briefly) through that bug report in python, but I don't see any issues with killing child processes during testing this patch. Plus to me it is better to get some unknown python error (and fix it somehow) instead of leaving child processes. v2: correctly catch INT/TERM/HUP too
| # Move to a new process group and kill it at exit so that we don't have any | ||
| # infinite tests processes left | ||
| # (new process group is required to avoid killing some parent processes) | ||
| os.setpgid(0, 0) |
There was a problem hiding this comment.
This actually does not work for everything, since timeout also calls setpgid
Hung check like here 1.
Changelog category (leave one):