-
Notifications
You must be signed in to change notification settings - Fork 173
fix(rust/driver_manager): don't dlclose drivers #3844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(rust/driver_manager): don't dlclose drivers #3844
Conversation
|
Thank you @savannahar68 for debugging the issue and helping with the fix |
5fdead7 to
4570524
Compare
rust/driver_manager/src/lib.rs
Outdated
| // By default, go builds the libraries with '-Wl -z nodelete' which does not | ||
| // unload the go runtime. This isn't respected on mac ( https://github.com/golang/go/issues/11100#issuecomment-932638093 ) | ||
| // so we need to explicitly load the library with RTLD_NODELETE( which prevents unloading ) | ||
| if cfg!(target_os = "macos") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the cfg attribute #[cfg(...)] is better than cfg! here.
(In this case, cfg! case will compile on any OS.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Updated the code to use #[cfg(...)]
4570524 to
0bfa253
Compare
rust/driver_manager/src/lib.rs
Outdated
| #[cfg(target_os = "macos")] | ||
| use libloading::os::unix::Library; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #[cfg(target_os = "macos")] | |
| use libloading::os::unix::Library; |
I think we can remove this and just qualify the identifier below.
rust/driver_manager/src/lib.rs
Outdated
| Library::open( | ||
| Some(filename.as_ref()), | ||
| RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE, | ||
| ) | ||
| .map(Into::into) | ||
| .map_err(libloading_error_to_adbc_error)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Library::open( | |
| Some(filename.as_ref()), | |
| RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE, | |
| ) | |
| .map(Into::into) | |
| .map_err(libloading_error_to_adbc_error)? | |
| libloading::os::unix::Library::open( | |
| Some(filename.as_ref()), | |
| RTLD_LAZY | RTLD_LOCAL | RTLD_NODELETE, | |
| ) | |
| .map_err(libloading_error_to_adbc_error)? | |
| .into() |
This feels a little simpler and more idiomatic.
rust/driver_manager/src/lib.rs
Outdated
| const RTLD_LAZY: i32 = 0x1; | ||
| const RTLD_LOCAL: i32 = 0x4; | ||
| const RTLD_NODELETE: i32 = 0x80; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could pull these from the libc crate, is it worth it though? cc @eitsupi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that RTLD_LAZY and RTLD_LOCAL are also defined in libloading? 🤔
https://github.com/nagisa/rust_libloading/blob/dab97c569b33bd515e16637b8dedbdc696d9ec9c/src/os/unix/consts.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Only those two, but not RTLD_NODELETE. I don't feel super stressed about hard-coding them here to avoid taking on libc as a dep.
|
I don't fully understand the nature of the problem, so this may be off the mark, but is this a problem specific to the Rust Driver Manager? (Just to be clear, I'm not saying everything should be addressed in this PR, just a question.) |
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think we should just std::mem::forget the library (bypassing its Drop implementation) and not dlclose anything at all - it's just not safe. (Or remove the platform-specific cfg and do this for all platforms.)
I haven't tried with other languages, but yeah, we'll probably need to do the same for other languages as well. I'm thinking of getting the rust fix in first. Doing the same for other languages should be trivial. |
I've tested with |
|
IIRC, we don't dlclose in the C++ driver manager, or C#. So Rust may be the only one affected.
|
|
Makes sense. I'll open a separate PR upstream soon to add support for RTLD_NODELETE. Are there any other changes I should do to make this PR mergeable |
|
I think we can remove the |
|
I'm not sure about windows. Let me see the current implementation and try te replicate it. |
Yeah. Libloading does have separate definitions for windows and unix ( https://github.com/nagisa/rust_libloading/blob/dab97c569b33bd515e16637b8dedbdc696d9ec9c/src/safe.rs#L6 ). |
0bfa253 to
252ccd3
Compare
|
Pushed. Can you folks take another look? Meanwhile, I'll work on getting the flag, and some other improvements merged upstream to libloading |
rust/driver_manager/src/lib.rs
Outdated
| }; | ||
| #[cfg(not(unix))] | ||
| let library = unsafe { | ||
| libloading::Library::new(filename.as_ref()).map_err(libloading_error_to_adbc_error)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked GitHub issue notes there's a Windows equivalent flag; is that possible to provide to libloading?
On Windows, a similar
GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_PIN, ...)is required for the same effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a windows expert, but I see that this flag is not passed when we load the library. During loading, the LoadLibraryExW function is called ( Ref ).
The way to do it in windows is to call the GetModuleHandleExW function with this flag. libloading supports this using the pin function ( Ref ). Let's call this fn in the windows cfg, so that we have the same behaviour across all platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed this change for windows. Can you take a look? I'm also opening up a conversation upstream to try and get this changes merged into libloading Upstream discussion thread -> nagisa/rust_libloading#193
fd2fe3c to
97055bd
Compare
|
It seems tests are failing? |
|
Weird. I tested again on mac and linux and I don't see this error. In the CI, for a test, I also see the driver not found Could it be an intermittent issue with the way the driver was built? I'm trying to reproduce this error, but meanwhile can you retry the pipeline once? |
475d31e to
137b2a4
Compare
|
@lidavidm The tests are passing now. Here's the changes I did
|
rust/driver_manager/src/lib.rs
Outdated
| let library: libloading::Library = unsafe { | ||
| use std::os::raw::c_int; | ||
|
|
||
| const RTLD_NODELETE: c_int = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears this is different on my system:
/usr/include/x86_64-linux-gnu/bits/dlfcn.h
41:#define RTLD_NODELETE 0x01000
It appears to be 0x80 on macOS. Where did 8 come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need some #[cfg] to define this, and fall back to 0xFFFFFFFF on Unix-like platforms besides linux (glibc) and macOS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. You're right. It's 0x80 for mac ( Ref -> https://github.com/realthunder/mac-headers/blob/88d4348d732fe95385faef950eb660626f9ff711/usr/include/dlfcn.h#L72 )
Maybe we need some #[cfg] to define this, and fall back to 0xFFFFFFFF on Unix-like platforms besides linux (glibc) and macOS?
Yeah. Doing this. Or do you think we should print error/warning and not 'OR' with RTLD_NODELETE for unknown OS ? Something like this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not spam the console.
950483e to
eb5873f
Compare
rust/driver_manager/src/lib.rs
Outdated
| )) { | ||
| 0x1000 | ||
| } else { | ||
| 0xFFFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also wait, shouldn't this be 0 since we're OR-ing it? (Sorry, I messed this up)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I also missed it. Fixed now.
eb5873f to
d0d2c94
Compare
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
Thank you for merging this @lidavidm . |
|
Hopefully this week (because you happened to coincide with the release plans) |
According to golang/go#11100 (comment), the
-Wl,-z,nodeletedoes not work on macos. This causes dlclose to be called when the shared library is dropped, which results in unexpected behaviour ( I've observed the program hangs because Go cannot kill the goroutines and so waits for them to close )On linux, since the
-Wl,-z,nodeleteis respected, we do not see this issue.This PR fixes the problem by ensuring that we explicitly open the library with
RTLD_NODELETEflag which ensures thatdlcloseis not called during unloadFixes #3840