[pantsd] Address pantsd-runner hang on Linux and re-enable integration test. #4407
Merged
kwlzn merged 6 commits into pantsbuild:master on Mar 31, 2017
stuhood
reviewed
Mar 31, 2017
    match self.node.clone() {
      EntryKey::Valid(n) => {
        let pool = context_factory.pool();
        let pool_opt = context_factory.pool();
Member
One line for this is probably fine.
Member
Author
not according to the compiler.
error: borrowed value does not live long enough
--> src/rust/engine/src/graph.rs:115:88
|
115 | let pool = context_factory.pool().as_ref().expect("Uninitialized CpuPool!");
| ---------------------- temporary value created here ^ temporary value dropped here while still borrowed
...
122 | },
| - temporary value needs to live until here
|
= note: consider using a `let` binding to increase its lifetime
stuhood
approved these changes
Mar 31, 2017
      """Fork, daemonize and invoke self.post_fork_child() (via ProcessManager)."""
      self.daemonize(write_pid=False)

    def pre_fork(self):
Member
IIRC, post_fork is called in two places... does this need to be as well?
207f8d6 to 65e5697
lenucksi pushed a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

…n test. (pantsbuild#4407)

### Problem

Currently, on Linux the first thin client call to the daemon can deadlock just after the pantsd->fork->pantsd-runner workflow. Connecting to the process with `gdb` reveals a deadlock in the following stack in the `post_fork` `drop` of the `CpuPool`:

```
#0  0x00007f63f04c31bd in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f63f04c0ded in pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00007f63d3cfa438 in notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/condvar.rs:52
No locals.
#3  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys_common/condvar.rs:39
No locals.
#4  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sync/condvar.rs:208
No locals.
#5  std::thread::{{impl}}::unpark () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/thread/mod.rs:633
No locals.
#6  0x00007f63d3c583d1 in crossbeam::sync::ms_queue::{{impl}}::push<futures_cpupool::Message> (self=<optimized out>, t=...)
    at /home/kwilson/.cache/pants/rust-toolchain/registry/src/github.com-1ecc6299db9ec823/crossbeam-0.2.10/src/sync/ms_queue.rs:178
        guard = <optimized out>
        self = <optimized out>
#7  0x00007f63d3c588ed in futures_cpupool::{{impl}}::drop (self=<optimized out>) at /home/kwilson/.cache/pants/rust-toolchain/git/checkouts/futures-rs-a4f11d094efefb0a/f7e6bc8/futures-cpupool/src/lib.rs:236
        self = 0x37547a0
#8  0x00007f63d3be871c in engine::fs::{{impl}}::post_fork (self=0x3754778) at /home/kwilson/dev/pants/src/rust/engine/src/fs.rs:355
        self = 0x3754778
#9  0x00007f63d3be10e4 in engine::context::{{impl}}::post_fork (self=0x37545b0) at /home/kwilson/dev/pants/src/rust/engine/src/context.rs:93
        self = 0x37545b0
#10 0x00007f63d3c0de5a in {{closure}} (scheduler=<optimized out>) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:275
        scheduler = 0x3740580
#11 with_scheduler<closure,()> (scheduler_ptr=<optimized out>, f=...) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:584
        scheduler = 0x3740580
        scheduler_ptr = 0x3740580
#12 engine::scheduler_post_fork (scheduler_ptr=0x3740580) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:274
        scheduler_ptr = 0x3740580
#13 0x00007f63d3c1be8c in _cffi_f_scheduler_post_fork (self=<optimized out>, arg0=0x35798f0) at src/cffi/native_engine.c:2234
        _save = 0x34a65a0
        x0 = 0x3740580
        datasize = <optimized out>
#14 0x00007f63f07b5a62 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
```

This presents as a hang in the thin client, because the pailgun socket is left open in the pantsd-runner.

### Solution

Add pre-fork hooks and tear down the `CpuPool` instances prior to forking and rebuilding them.

### Result

Can no longer reproduce the hang.
asherf
added a commit
to asherf/pants
that referenced
this pull request
Sep 11, 2021
https://python-poetry.org/history/#118---2021-08-19 Fixed an error with repository prioritization when specifying secondary repositories. (pantsbuild#4241) Fixed the detection of the system environment when the setting virtualenvs.create is deactivated. (pantsbuild#4330, pantsbuild#4407) Fixed the evaluation of relative path dependencies. (pantsbuild#4246) Fixed environment detection for Python 3.10 environments. (pantsbuild#4387) Fixed an error in the evaluation of in/not in markers (python-poetry/poetry-core#189)
stuhood
pushed a commit
that referenced
this pull request
Sep 13, 2021
https://python-poetry.org/history/#118---2021-08-19 Fixed an error with repository prioritization when specifying secondary repositories. (#4241) Fixed the detection of the system environment when the setting virtualenvs.create is deactivated. (#4330, #4407) Fixed the evaluation of relative path dependencies. (#4246) Fixed environment detection for Python 3.10 environments. (#4387) Fixed an error in the evaluation of in/not in markers (python-poetry/poetry-core#189)
Problem

Currently, on Linux the first thin client call to the daemon can deadlock just after the pantsd->fork->pantsd-runner workflow. Connecting to the process with `gdb` reveals a deadlock in the `post_fork` `drop` of the `CpuPool`; the full stack is reproduced in the commit message quoted above.

This presents as a hang in the thin client, because the pailgun socket is left open in the pantsd-runner.
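
The description does not spell out why the `drop` blocks. One well-known way to end up stuck in `__lll_lock_wait` after a fork is that `fork()` copies only the calling thread, so a lock held by any other thread at that moment stays locked in the child with nobody left to release it. The Python sketch below illustrates that general hazard only; it is not the engine's code, and the function name `child_can_take_lock` is made up for the example.

```python
import os
import threading


def child_can_take_lock(timeout=0.5):
    """Fork while another thread holds a lock; report whether the child could acquire it."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()      # tell the main thread the lock is now held
            release.wait()  # keep holding it until told to let go

    worker = threading.Thread(target=holder, daemon=True)
    worker.start()
    held.wait()

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so nothing will ever
        # release the copied lock. The timeout stands in for the real hang.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then let the holder thread finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    worker.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

On CPython on Linux, `child_can_take_lock()` should report that the child could not take the lock; without the timeout the child would simply block, which is the shape of the hang described above.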
Solution

Add pre-fork hooks and tear down the `CpuPool` instances prior to forking, then rebuild them afterwards.

Result

Can no longer reproduce the hang.
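
As a rough illustration of the pre-fork/post-fork pattern described in the Solution, here is a minimal Python sketch. It assumes a standard-library `ThreadPoolExecutor` as a stand-in for the Rust `CpuPool`; `ResettablePool` and `fork_with_hooks` are hypothetical names for this example and are not part of pants.

```python
import os
from concurrent.futures import ThreadPoolExecutor


class ResettablePool:
    """Hypothetical wrapper: a thread pool that is torn down before fork and rebuilt after."""

    def __init__(self, workers=2):
        self._workers = workers
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, fn, *args):
        if self._pool is None:
            raise RuntimeError("Uninitialized pool!")
        return self._pool.submit(fn, *args)

    def pre_fork(self):
        # Join every worker thread so none of them can hold a lock during fork().
        if self._pool is not None:
            self._pool.shutdown(wait=True)
            self._pool = None

    def post_fork(self):
        # Build a fresh pool; meant to be called in both the parent and the child.
        if self._pool is None:
            self._pool = ThreadPoolExecutor(max_workers=self._workers)


def fork_with_hooks(pool, child_fn):
    """Fork around the hooks, run child_fn(pool) in the child, return the child's exit code."""
    pool.pre_fork()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            pool.post_fork()
            child_fn(pool)
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code path
    pool.post_fork()
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
```

The design choice mirrored here is that the pool is absent for the duration of the fork, so the child starts with no inherited worker state and builds its own threads from scratch.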