-
Notifications
You must be signed in to change notification settings - Fork 38.7k
refactor: introduce single-separator split helper (boost::split replacement) #22953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: introduce single-separator split helper (boost::split replacement) #22953
Conversation
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems ok to use the already existing split function.
e332fcd to
e3c7fe4
Compare
|
Force-pushed with changes as suggested by MarcoFalke (#22953 (comment)). Pinging @practicalswift as someone who has always shown interest in potential boost replacements and string processing helpers, with lots of contributions in that area. Maybe you feel like taking a look :) |
e3c7fe4 to
742fc50
Compare
|
I would still add a unit test even though there are tests for the internal helper. Maybe it's a little bit over the top so feel free to ignore my suggestion. (Something like this possibly: kiminuo@ac6dd5a) |
|
Any reason the tests aren't fixed up as well? Also, if you add a unit test, might as extend the |
|
There non-tests occurrences too: If I'm not missing something. edit: Ah, I'm not aware what the difference is between |
742fc50 to
25915e4
Compare
I agree that it's a good idea to add a unit test.
Thanks! As quick feedback, can you change the commit subject to be more verbose (e.g. "test: add unit tests for SplitString helper") and put the test into
Oh, I only |
Modified here: kiminuo@4516727. Anyway, feel free to modify the commit in whatever way you like :) |
|
The fuzz test I mentioned: (might not compile) diff --git a/src/test/fuzz/string.cpp b/src/test/fuzz/string.cpp
index 0c1b45b86c..e5a09c0013 100644
--- a/src/test/fuzz/string.cpp
+++ b/src/test/fuzz/string.cpp
@@ -127,6 +127,7 @@ FUZZ_TARGET(string)
int64_t amount_out;
(void)ParseFixedPoint(random_string_1, fuzzed_data_provider.ConsumeIntegralInRange<int>(0, 1024), &amount_out);
}
+ (void)SplitString(random_string_1, fuzzed_data_provider.ConsumeIntegral<char>());
{
(void)Untranslated(random_string_1);
const bilingual_str bs1{random_string_1, random_string_2}; |
Oh, I didn't see that in your previous comment. It compiles, I added another commit 👌 |
a30f721 to
8a9f940
Compare
|
Concept ACK - this is low-overhead for dropping more Boost dependence. Surprised we can't also nuke anything from |
|
Concept ACK. I don't think this boost use is very bad (it's a header-only library) but if we can replace it with functionality we already have internally, who not.
Well, |
8a9f940 to
ed24bf5
Compare
|
Force-pushed the alternative variant as suggested by @martinus, changing |
ed24bf5 to
8949509
Compare
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of forcing devs to remember to construct a std::string manually each time before calling the function.
src/test/util_tests.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems dangerous to offer a function that silently does the wrong thing when passed the "wrong" type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. This is fixed now by using an explicit interface instead of an alias, as suggested by martinus in #22953 (comment).
8949509 to
36ac60d
Compare
|
Windows build error seems unrelated, in |
|
See #18548
On Thu, 23 Sep 2021 at 08:54, Martin Leitner-Ankerl < ***@***.***> wrote:
Windows build error seems unrelated, in wallet_address_types.py:
OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#22953 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AH3PXPQDTK3VEUUU4R2KEY3UDK6JVANCNFSM5D246WWQ>
.
--
…--
Hennadii Stepanov
|
36ac60d to
049e609
Compare
|
Rebased on master. |
…-util.cpp 3bb9627 refactor: remove unused boost header include in bitcoin-util.cpp (Sebastian Falbesoner) Pull request description: This header was included since the introduction of bitcoin-util in commit 13762bc, but boost was actually never used (see `git log -S boost ./src/bitcoin-util.cpp`). Cherry-picked out of bitcoin#22953, which currently needs rebase. This commit could just be merged on its own. ACKs for top commit: MarcoFalke: review ACK 3bb9627 Tree-SHA512: 201ee1aa4d49074056654203db73a473479c2b92c49df8dbf8e35979f85178013c66540a665f0f6dc0a2efef88eb091e2b088bebff85d840033dffd8ae719349
This helper uses spanparsing::Split internally and enables to replace all calls to boost::split where only a single separator is passed. Co-authored-by: Martin Ankerl <[email protected]> Co-authored-by: MarcoFalke <[email protected]>
1d066fc to
a62e844
Compare
|
Rebased on master again (there was a conflict in src/rest.cpp after #24098 has been merged). |
|
@martinus want to take another look here? |
|
Code review ACK a62e844. Ran all tests. I also like that with After that PR there is only one usage of |
…r (boos… …t::split replacement)
f849e63 fuzz: SplitString with multiple separators (Martin Leitner-Ankerl) d1a9850 http: replace boost::split with SplitString (Martin Leitner-Ankerl) 0d7efcd core_read: Replace boost::split with SplitString (Martin Leitner-Ankerl) b7ab9db Extend Split to work with multiple separators (Martin Leitner-Ankerl) Pull request description: As a followup of #22953, this removes the remaining occurrences of `boost::split` and replaces them with our own `SplitString`. To be able to do so, this extends the function `spanparsing::Split` to work with multiple separators. Finally this removes 3 more files from `lint-includes.py`. ACKs for top commit: theStack: Code-review ACK f849e63 Tree-SHA512: f37d4dbe11cab2046e646045b0f018a75f978d521443a2c5001512737a1370e22b09247d5db0e5c9e4153229a4e2d66731903c1bba3713711c4cae8cedcc775d
…litString f849e63 fuzz: SplitString with multiple separators (Martin Leitner-Ankerl) d1a9850 http: replace boost::split with SplitString (Martin Leitner-Ankerl) 0d7efcd core_read: Replace boost::split with SplitString (Martin Leitner-Ankerl) b7ab9db Extend Split to work with multiple separators (Martin Leitner-Ankerl) Pull request description: As a followup of bitcoin#22953, this removes the remaining occurrences of `boost::split` and replaces them with our own `SplitString`. To be able to do so, this extends the function `spanparsing::Split` to work with multiple separators. Finally this removes 3 more files from `lint-includes.py`. ACKs for top commit: theStack: Code-review ACK f849e63 Tree-SHA512: f37d4dbe11cab2046e646045b0f018a75f978d521443a2c5001512737a1370e22b09247d5db0e5c9e4153229a4e2d66731903c1bba3713711c4cae8cedcc775d
backport: merge bitcoin#18289, bitcoin#19090, bitcoin#22859, bitcoin#18758, bitcoin#21016, bitcoin#22953, bitcoin#25027, bitcoin#20461, bitcoin#25025, bitcoin#25057, bitcoin#25068 (deboostification)
This PR adds a simple string split helper
SplitStringthat takes use of the spanparsingSplitfunction that was first introduced in #13697 (commit fe8a7dc). This enables to replace most calls toboost::split, in the cases where only a single separator character is used. Note that while previous attempts to replaceboost::splitwere controversial (e.g. #13751), this one has a trivial implementation: it merely uses an internal helper (that is unit tested and in regular use with output descriptiors) and converts its result from spans to strings. As a drawback though, not allboost::splitinstances can be tackled.As a possible optimization, one could return a vector of
std::string_views (available since C++17) instead of strings, to avoid copies. This would need more carefulness on the caller sites though, to avoid potential lifetime issues, and it's probably not worth it, considering that none of the places where strings are split are really performance-critical.