-
-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segfault during eval of a regex #10
Comments
Could you please do the debugging work required to confirm that it's a bug in this library and not PCRE2 itself? I can indeed reproduce this, however, tweaking the regex as you suggest makes the segfault go away. That strongly points toward PCRE2 itself. |
Hmmm, I wonder if this is related to the fact that you're passing an empty vec. Since it doesn't allocate, it probably points to something that can't be dereferenced. So this library probably needs to special case an empty slice. |
That's a possibility, but I find it odd that the behavior would change with the regex, if it is upstream, possibly a optimization bug? I'll look into it further |
I think the empty slice issue is the more likely candidate. But changing a regex and having it result in buggy behavior is by itself not odd at all. Lots of optimizations in production grade regex engines have thresholds and what not, which trigger different code paths. For example, in your regex, all branches have the same fixed length. So the regex engine can tell that there is no match by just inspecting the length. It could do the same thing when you remove an |
Yes most likely |
pub unsafe fn find(
&mut self,
code: &Code,
subject: &[u8],
start: usize,
options: u32,
) -> Result<bool, Error> {
println!("pointer: {:?}", subject.as_ptr());
let rc = pcre2_match_8(
code.as_ptr(),
subject.as_ptr(),
subject.len(),
start,
options,
self.as_mut_ptr(),
self.match_context,
);
|
Not exactly sure what the best solution is yet, please take a look at the pull-request! Happy to make changes. |
Closed by #11. |
For completeness I also reported it to pcre2 maintainers: https://bugs.exim.org/show_bug.cgi?id=2440 On one hand: Yes, we're passing it an invalid pointer, who knows what will happen |
Awesome, thanks! |
To work around likely bugs in (older versions of) PCRE2. Namely, at one point, PCRE2 would dereference the haystack pointer even when the length was zero. This was reported in #10 and we worked around this in #11 by passing a pointer to a const `&[]`, with the (erroneous) presumption that this would be a valid pointer to dereference. In retrospect though, this was a little silly, because you should never be dereferencing a pointer to an empty slice. It's not valid. Alas, at that time, Rust did actually hand you a valid pointer that could be dereferenced. But [this PR][rust-pull] changed that. And thus, we're back to where we started: handing buggy versions of PCRE2 a zero length haystack with a dangling pointer. So we fix this once and for all by passing a slice of length 1, but with a haystack length of 0, to the PCRE2 search routine when searching an empty haystack. This will guarantee the provision of a dereferencable pointer should PCRE2 decide to dereference it. Fixes #42 [rust-pull]: rust-lang/rust#123936
To work around likely bugs in (older versions of) PCRE2. Namely, at one point, PCRE2 would dereference the haystack pointer even when the length was zero. This was reported in #10 and we worked around this in #11 by passing a pointer to a const `&[]`, with the (erroneous) presumption that this would be a valid pointer to dereference. In retrospect though, this was a little silly, because you should never be dereferencing a pointer to an empty slice. It's not valid. Alas, at that time, Rust did actually hand you a valid pointer that could be dereferenced. But [this PR][rust-pull] changed that. And thus, we're back to where we started: handing buggy versions of PCRE2 a zero length haystack with a dangling pointer. So we fix this once and for all by passing a slice of length 1, but with a haystack length of 0, to the PCRE2 search routine when searching an empty haystack. This will guarantee the provision of a dereferencable pointer should PCRE2 decide to dereference it. Fixes #42 [rust-pull]: rust-lang/rust#123936
Found this interesting segfault today! Seems to happen using this regex, possibly others... But if you remove a single
x
, it no longer segfaults...Here is a barebone example
The text was updated successfully, but these errors were encountered: