Commit 088d7f3

api: add Cow guarantee to replace API
This adds a guarantee to the API of the `replace`, `replace_all` and `replacen` routines that, when `Cow::Borrowed` is returned, it is guaranteed that it is equivalent to the `haystack` given. The implementation has always matched this behavior, but this elevates the implementation behavior to an API guarantee.

There do exist implementations where this guarantee might not be upheld in every case. For example, if the final result were the empty string, we could return a `Cow::Borrowed`. Similarly, if the final result were a substring of `haystack`, then `Cow::Borrowed` could be returned in that case too. In practice, these sorts of optimizations are tricky to do, and seem like niche corner cases that aren't important to optimize.

Nevertheless, having this guarantee is useful because it can be used as a signal that the original input remains unchanged. This came up in discussions with @quicknir on Discord. Namely, in cases where one is doing a sequence of replacements and in most cases nothing is replaced, using a `Cow` is nice to be able to avoid copying the haystack over and over again. But to get this to work right, you have to know whether a `Cow::Borrowed` matches the input or not. If it doesn't, then you'd need to transform it into an owned string. For example, this code tries to do replacements on each of a sequence of `Cow<str>` values, where the common case is no replacement:

```rust
use std::borrow::Cow;

use regex::Regex;

fn trim_strs(strs: &mut Vec<Cow<str>>) {
    strs
        .iter_mut()
        .for_each(|s| moo(s, &regex_replace));
}

fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    let result = f(&c);
    match result {
        Cow::Owned(s) => *c = Cow::Owned(s),
        Cow::Borrowed(s) => {
            *c = Cow::Borrowed(s);
        }
    }
}

fn regex_replace(s: &str) -> Cow<str> {
    Regex::new(r"does-not-matter").unwrap().replace_all(s, "whatever")
}
```

But this doesn't pass `borrowck`. Instead, you could write `moo` like this:

```rust
fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    let result = f(&c);
    match result {
        Cow::Owned(s) => *c = Cow::Owned(s),
        Cow::Borrowed(s) => {
            if !std::ptr::eq(s, &**c) {
                *c = Cow::Owned(s.to_owned())
            }
        }
    }
}
```

But the `std::ptr::eq` call here is a bit strange. Instead, after this PR and the new guarantee, one can write it like this:

```rust
fn moo<F: FnOnce(&str) -> Cow<str>>(c: &mut Cow<str>, f: F) {
    if let Cow::Owned(s) = f(&c) {
        *c = Cow::Owned(s);
    }
}
```
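The "sequence of replacements" use case described above can be sketched with the standard library alone. This is a minimal illustration, not code from the crate: `replace_literal` is a hypothetical stand-in for `Regex::replace_all` that does plain substring matching and is assumed to uphold the same contract, and `apply_rules` is likewise a made-up helper name.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `Regex::replace_all` (literal matching only).
/// It upholds the same contract: `Cow::Borrowed` is returned only when the
/// result is the `haystack` itself.
fn replace_literal<'h>(haystack: &'h str, needle: &str, rep: &str) -> Cow<'h, str> {
    if needle.is_empty() || !haystack.contains(needle) {
        Cow::Borrowed(haystack)
    } else {
        Cow::Owned(haystack.replace(needle, rep))
    }
}

/// Applies each `(needle, replacement)` rule in order, allocating only when
/// a rule actually matches.
fn apply_rules<'h>(haystack: &'h str, rules: &[(&str, &str)]) -> Cow<'h, str> {
    let mut current = Cow::Borrowed(haystack);
    for &(needle, rep) in rules {
        // `Borrowed` means "same as the input", so `current` is already
        // correct and only the `Owned` case needs handling.
        if let Cow::Owned(s) = replace_literal(&current, needle, rep) {
            current = Cow::Owned(s);
        }
    }
    current
}
```

When no rule matches, `apply_rules` hands back a `Cow::Borrowed` pointing at the original input, so a long chain of no-op replacements never copies the haystack.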
1 parent a5ae351 commit 088d7f3

2 files changed, +34 -0 lines changed

src/regex/bytes.rs

+17
@@ -651,6 +651,9 @@ impl Regex {
     /// case, this implementation will likely return a `Cow::Borrowed` value
     /// such that no allocation is performed.
     ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// # Replacement string syntax
     ///
     /// All instances of `$ref` in the replacement string are replaced with
@@ -761,6 +764,13 @@ impl Regex {
     /// replacement provided. This is the same as calling `replacen` with
     /// `limit` set to `0`.
     ///
+    /// If no match is found, then the haystack is returned unchanged. In that
+    /// case, this implementation will likely return a `Cow::Borrowed` value
+    /// such that no allocation is performed.
+    ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// The documentation for [`Regex::replace`] goes into more detail about
     /// what kinds of replacement strings are supported.
     ///
@@ -855,6 +865,13 @@ impl Regex {
     /// matches are replaced. That is, `Regex::replace_all(hay, rep)` is
     /// equivalent to `Regex::replacen(hay, 0, rep)`.
     ///
+    /// If no match is found, then the haystack is returned unchanged. In that
+    /// case, this implementation will likely return a `Cow::Borrowed` value
+    /// such that no allocation is performed.
+    ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// The documentation for [`Regex::replace`] goes into more detail about
     /// what kinds of replacement strings are supported.
     ///
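
The same guarantee is documented for the byte-oriented API, where the return type is a `Cow<[u8]>`. A rough sketch of the in-place update pattern for that case follows. It uses only the standard library: `replace_byte` is a hypothetical stand-in for `regex::bytes::Regex::replace_all` that swaps single bytes, and `update_in_place` is an illustrative name, not part of the crate.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `regex::bytes::Regex::replace_all`: swaps every
/// `from` byte for `to`. Returns `Cow::Borrowed` only when the result is the
/// `haystack` itself.
fn replace_byte<'h>(haystack: &'h [u8], from: u8, to: u8) -> Cow<'h, [u8]> {
    if !haystack.contains(&from) {
        return Cow::Borrowed(haystack);
    }
    let replaced: Vec<u8> = haystack
        .iter()
        .map(|&b| if b == from { to } else { b })
        .collect();
    Cow::Owned(replaced)
}

/// Updates `c` in place, leaving it untouched when nothing was replaced.
fn update_in_place(c: &mut Cow<'_, [u8]>, from: u8, to: u8) {
    if let Cow::Owned(v) = replace_byte(&**c, from, to) {
        *c = Cow::Owned(v);
    }
}
```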

src/regex/string.rs

+17
@@ -642,6 +642,9 @@ impl Regex {
     /// case, this implementation will likely return a `Cow::Borrowed` value
     /// such that no allocation is performed.
     ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// # Replacement string syntax
     ///
     /// All instances of `$ref` in the replacement string are replaced with
@@ -748,6 +751,13 @@ impl Regex {
     /// replacement provided. This is the same as calling `replacen` with
     /// `limit` set to `0`.
     ///
+    /// If no match is found, then the haystack is returned unchanged. In that
+    /// case, this implementation will likely return a `Cow::Borrowed` value
+    /// such that no allocation is performed.
+    ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// The documentation for [`Regex::replace`] goes into more detail about
     /// what kinds of replacement strings are supported.
     ///
@@ -842,6 +852,13 @@ impl Regex {
     /// matches are replaced. That is, `Regex::replace_all(hay, rep)` is
     /// equivalent to `Regex::replacen(hay, 0, rep)`.
     ///
+    /// If no match is found, then the haystack is returned unchanged. In that
+    /// case, this implementation will likely return a `Cow::Borrowed` value
+    /// such that no allocation is performed.
+    ///
+    /// When a `Cow::Borrowed` is returned, the value returned is guaranteed
+    /// to be equivalent to the `haystack` given.
+    ///
     /// The documentation for [`Regex::replace`] goes into more detail about
     /// what kinds of replacement strings are supported.
     ///
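
Note that the documented wording runs in one direction only: `Cow::Borrowed` implies the result is equivalent to the haystack, but nothing above promises that `Cow::Owned` implies the text differs. The sketch below illustrates that asymmetry with a standard-library stand-in. `replace_substr` and `is_untouched` are hypothetical names, and the stand-in's behavior on a same-text replacement is an assumption chosen for illustration, not a statement about the crate's internals.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `Regex::replace_all` (literal matching only).
/// Any match produces a freshly allocated `Cow::Owned`, even when the
/// replacement text happens to equal the matched text.
fn replace_substr<'h>(haystack: &'h str, needle: &str, rep: &str) -> Cow<'h, str> {
    if needle.is_empty() || !haystack.contains(needle) {
        Cow::Borrowed(haystack)
    } else {
        Cow::Owned(haystack.replace(needle, rep))
    }
}

/// `true` means the result is known to be the input. `false` means only
/// "maybe changed": compare the contents if an exact answer is needed.
fn is_untouched(result: &Cow<'_, str>) -> bool {
    matches!(result, Cow::Borrowed(_))
}
```

So `Cow::Borrowed` is a cheap and reliable "unchanged" signal, while `Cow::Owned` should be read as "possibly changed".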
