l2a5b1
commented at 12:19 pm on July 24, 2018:
contributor
This pull request drops the boost/algorithm/string/split.hpp dependency from the project. It replaces boost::split with a custom template function Split that has a similar API and returns exactly the same results as boost::split.
In addition this pull request refactors an instance of boost::algorithm::trim_right, which was implicitly made available via the boost/algorithm/string.hpp dependency to prevent having to introduce a new dependency boost/algorithm/string/trim.hpp.
The #include <boost/algorithm/string/predicate.hpp> in src/wallet/rpcdump.cpp will be addressed in #13656 if that PR is merged after this one; or in this PR should #13656 be merged first.
The test vectors in the accompanying unit test have been validated against boost::split.
fanquake added the label
Refactoring
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
fanquake added this to the "In progress" column in a project
MarcoFalke
commented at 12:44 pm on July 24, 2018:
member
Hmm, this is just replacing a boost header with a self-written header that offers the same functionality and comes with unit tests. Do we really want to re-implement all of boost we are currently using? I mean straightforward replacing boost:: with std:: is fine, but re-implementing boost should be one of our low priority goals, no?
l2a5b1
commented at 2:56 pm on July 24, 2018:
contributor
The goal was not to re-implement boost. This is an attempt to drop the boost::split dependency with as little changes to the existing codebase as possible.
I have eliminated boost where it was trivial to do: boost::algorithm::is_any_of has been refactored to a std::string and boost::algorithm::trim_right to a std::string:erase.
There are many ways to simplify this function: e.g. no support for both std::vector and std::set; no support for boost::token_compress_on.
I didn’t want to go there initially, because I didn’t want to break things and ease review by providing something that is straightforward to validate. Right now it is easy to swap the boost::split and Split functions in the unit test to validate if the external behaviour of Split still works as expected.
That said, I don’t have strong opinion on this. I am happy to make changes where needed; or close the pull request if this is not what you’re looking for.
I just hope I haven’t messed up π
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
l2a5b1 force-pushed
on Jul 24, 2018
DrahtBot
commented at 3:00 pm on July 25, 2018:
member
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#17037 (Testschains: Many regtests with different genesis and default datadir by jtimon)
#16887 (Abstract out some of the descriptor Span-parsing helpers by sipa)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
practicalswift
commented at 3:56 pm on July 25, 2018:
contributor
Concept ACK
The short self-contained non-Boost version is easier to reason about.
Every small step counts in our journey to get rid of our Boost dependencies! :-)
MarcoFalke
commented at 4:45 pm on July 25, 2018:
member
@practicalswift Sure, but it has to go through review to verify that it supports everything that was supported by boost the way we used it.
We are stuck with boost until we adopt C++17, since implementing std::optional or std::variant on our own wouldn’t be too wise, imo. So at least there is no rush in getting rid of non-trivial boost features that are not already in std::.
practicalswift
commented at 6:58 pm on July 25, 2018:
contributor
@MarcoFalke Agreed β no rush in removing non-trivial Boost features. Letβs do it incrementally starting with the low hanging fruit :-)
l2a5b1
commented at 7:05 pm on July 25, 2018:
contributor
So at least there is no rush in getting rid of non-trivial boost features that are not already in std::@MarcoFalke, apologies about this. I have totally misinterpreted the objective of the Boost migration.
l2a5b1 closed this
on Jul 25, 2018
MarcoFalke
commented at 7:22 pm on July 25, 2018:
member
Sorry for being so pessimistic about it. You can definitely leave this open and see what other reviewers think. I was just saying that we have about 3 years time to finish up everything.
MarcoFalke reopened this
on Jul 25, 2018
l2a5b1
commented at 7:40 pm on July 25, 2018:
contributor
practicalswift
commented at 7:57 pm on July 25, 2018:
contributor
@251Labs Text splitting should be in the low hanging fruit category I mentioned :-)
sipa
commented at 0:17 am on July 26, 2018:
member
Note there is a function with similar functionality in #13697, but operating on Span<const char>s rather than std::strings.
practicalswift
commented at 7:17 am on July 26, 2018:
contributor
@251Labs See if you can base your work in this PR on @sipa:s code in #13697 :-)
in
src/utilsplitstring.h:41
in
81999762a8outdated
36+ const auto begin = str.cbegin();
37+ const auto end = str.cend();
38+
39+ for (auto it = begin; it < end;) {
40+ bool foundSeparator = false;
41+ auto tokenIt = it;
nit: this would be a bit simpler in implementation if it always referred to the end of the token, and you stashed the beginning here and continued to operate on it:
0for (auto it = begin; it < end; ++it) {
1bool foundSeparator = false;
2auto start = it;
3while (it < end &&!(foundSeparator = anyOfSeparator.find(*it) != std::string::npos)) {
4++it;
5 }
6if (it != begin && (!mergeEmpty || start != it)) {
7*insertIt = std::string(start, it);
8 }
9if (foundSeparator && (it == begin || it == end -1)) {
10*insertIt ="";
11 }
12}
I have updated the PR and addressed your feedback.
Please note that the separate if statements below are intentional and cover the case where it equals begin and it equals end - 1, in which case two empty tokens need to be inserted to be boost compliant.
I have added the static_assert, because I didn’t want to make any assumptions for containers other than std::vector<std::string> and std::set<std::string>. The compile time assert is there for the next person who wants to use this function with another type of container.
I am going to update the message to make the purpose of the assert clear. This may already address your feedback sufficiently. If it doesn’t, please let me know and I am happy to remove the static_assert.
l2a5b1 force-pushed
on Aug 3, 2018
l2a5b1 force-pushed
on Aug 4, 2018
l2a5b1
commented at 4:28 pm on August 4, 2018:
contributor
I had been thinking about this a while back when @sipa and @practicalswift hinted at the same, but didn’t see an easy non-invasive way to do this, because the code that uses boost::split relies on specific external behaviour thatstd::vector<Span<const char>> Split(const Span<const char>& sp, char sep) doesn’t provide.
For example:
torcontroller.cpp expects the result in a std::set<std::string> instead of std::vector<std::string>;
httprpc.cpp depends on boost::split’s capability to split on different separators;
core_read.cpp depends on boost::split’s capability to merge adjacent separators
Right now, the refactor seems trivial enough, because only the boost::split calls had to be replaced with calls to a custom function with identical external behaviour (as asserted by the unit tests, which can also be ran with the original boost::split function). No other code had to be refactored.
Dropping split from this pull request and trying to make the codebase work with std::vector<Span<const char>> Split(const Span<const char>& sp, char sep) seems very invasive, but maybe I am missing something obvious.
Alternatively, one split function could use the other one, e.g. std::vector<Span<const char>> Split(const Span<const char>& sp, char sep) from descriptor could use template<typename ContainerT>void Split(ContainerT& tokens, const std::string& str, const std::string& anyOfSeparator, bool mergeEmpty = false) from this pull request.
I am not sure about the options here. Thoughts?
in
src/utilsplitstring.h:42
in
5502212561outdated
37+ const auto end = str.cend();
38+
39+ for (auto it = begin; it < end; ++it) {
40+ bool foundSeparator = false;
41+ auto start = it;
42+ while (it < end && !(foundSeparator = anyOfSeparator.find(*it) != std::string::npos)) {
practicalswift
commented at 9:32 am on September 5, 2018:
Try to reformulate without mixing assignment and comparison. That would make it easier to read and reason about :-)
Hey, thanks @practicalswift! Yes, absolutely. Please see 3704125.
l2a5b1 force-pushed
on Sep 5, 2018
laanwj
commented at 4:57 pm on September 5, 2018:
member
@251Labs fair enough, if it’s so much hassle it is probably better to add another function, and leave the other split() private to the descriptor parsing code
DrahtBot added the label
Needs rebase
on Sep 24, 2018
l2a5b1 force-pushed
on Sep 25, 2018
DrahtBot removed the label
Needs rebase
on Sep 25, 2018
in
src/utilsplitstring.h:52
in
7bf4e39caboutdated
47+
48+ if (it != begin && (!mergeEmpty || start != it)) {
49+ *insertIt = std::string(start, it);
50+ }
51+
52+ if (foundSeparator) {
practicalswift
commented at 5:22 am on September 26, 2018:
foundSeparator can be unintialized here? Please initialize foundSeparator to false above.
l2a5b1
commented at 7:35 am on September 26, 2018:
I don’t think foundSeparator can be uninitialized at this point, because the outer for-loop of the Split function guarantees that it will go at least once through the inner for-loop where it is initialized, but I will revert back to explicit initialization of this variable for clarity.
l2a5b1 force-pushed
on Sep 26, 2018
l2a5b1
commented at 8:23 am on September 26, 2018:
contributor
Thanks again @practicalswift, I have addressed your feedback in 99074d1
DrahtBot added the label
Needs rebase
on Nov 5, 2018
l2a5b1 force-pushed
on Nov 5, 2018
l2a5b1 force-pushed
on Nov 5, 2018
DrahtBot removed the label
Needs rebase
on Nov 5, 2018
l2a5b1 force-pushed
on Nov 6, 2018
l2a5b1 force-pushed
on Nov 6, 2018
l2a5b1
commented at 8:28 am on November 6, 2018:
contributor
Rebased 5005a8f
DrahtBot added the label
Needs rebase
on Nov 13, 2018
l2a5b1 force-pushed
on Nov 14, 2018
DrahtBot removed the label
Needs rebase
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
l2a5b1 force-pushed
on Nov 14, 2018
in
src/util/splitstring.h:23
in
549afaa918outdated
18+ * @param[in] str The string to tokenize.
19+ * @param[in] anyOfSeparator A string with valid separators.
20+ * @param[in] mergeEmpty Set to true to merge adjacent separators (empty tokens); otherwise false (default).
21+ */
22+template<typename ContainerT>
23+void Split(ContainerT& tokens, const std::string& str, const std::string& anyOfSeparator, bool mergeEmpty = false)
sipa
commented at 0:26 am on November 15, 2018:
member
Concept ACK
l2a5b1 force-pushed
on Nov 15, 2018
l2a5b1 force-pushed
on Nov 15, 2018
l2a5b1
commented at 9:19 am on November 16, 2018:
contributor
e236c47 renames lower camel case variable names to snake_case equivalents.
l2a5b1 force-pushed
on Dec 29, 2018
l2a5b1
commented at 3:59 pm on December 29, 2018:
contributor
rebased ab2a18d
practicalswift
commented at 4:55 pm on December 29, 2018:
contributor
@251Labs Looks good. Iβm near a utACK - could you rebase on top of #15043 and add a fuzzer for the split method which would split random input on random delimiter(s) and then make sure the result matches the result from boost::split?
luke-jr changes_requested
luke-jr
commented at 11:37 pm on December 29, 2018:
member
NACK. Using dependencies is strictly better than reinventing stuff for no reason.
Empact
commented at 11:40 pm on December 29, 2018:
member
@luke-jr I thought we’ve been working toward a long term goal of removing boost entirely. Is that not right?
practicalswift
commented at 11:52 pm on December 29, 2018:
contributor
In contrast to @luke-jr: let me re-iterate my concept ACK. IMO we should take small incremental steps towards a code base free from Boost which is exactly what this PR does.
luke-jr
commented at 2:21 am on December 30, 2018:
member
The goal is to replace Boost with standard C++ features. If C++ doesn’t have a standard replacement yet, then it only makes sense to keep using Boost for that until it does.
DrahtBot added the label
Needs rebase
on Jan 9, 2019
l2a5b1 force-pushed
on Jan 14, 2019
DrahtBot removed the label
Needs rebase
on Jan 14, 2019
l2a5b1 force-pushed
on Jan 14, 2019
l2a5b1 force-pushed
on Jan 14, 2019
l2a5b1 force-pushed
on Jan 14, 2019
l2a5b1 force-pushed
on Jan 14, 2019
l2a5b1 force-pushed
on Jan 14, 2019
l2a5b1
commented at 3:45 pm on January 15, 2019:
contributor
IMO we should take small incremental steps towards a code base free from Boost which is exactly what this PR does.
+1
could you rebase on top of #15043 and add a fuzzer for the split method
That is a good suggestion!
Happy to follow up on that once it is merged and this PR is not NACked.
@luke-jr:
NACK. Using dependencies is strictly better than reinventing stuff for no reason.
Perhaps when it is rocket-science, battle-proven, fully audited secure tech that the dependency provides. For trivial functions that’s nonsense.
If C++ doesn’t have a standard replacement yet, then it only makes sense to keep using Boost for that until it does.
If the goal is replace Boost then this PR should not be controversial:
There is no std counterpart, nor will there be one for years if ever at all;
The split function is straightforward, there is hardly anything going on under the hood;
The external behavior of the custom split function is the same as boost::split, making it easy to reason about at the call sites;
You can even validate this PR against boost::split by running the unit tests.
What more can you ask for? :)
tiendq
commented at 8:08 am on January 17, 2019:
none
@251Labs@MarcoFalke I just came here from a Boost to C++ 11 migration project page. Where can I find more information of migration plan? (if there is one). Just curious to know in case I could give it a hand :)
MarcoFalke
commented at 3:21 pm on January 17, 2019:
member
The migration is pretty much just to replace boost:: with std::. Though, the low hanging fruits in C++11 have already been collected (see https://github.com/bitcoin/bitcoin/projects/3).
MarcoFalke
commented at 3:22 pm on January 17, 2019:
member
If boost::split was the last dependency we used from boost, this pull could be merged, but I don’t see that coming any time soon.
DrahtBot added the label
Needs rebase
on Feb 15, 2019
DrahtBot removed the label
Needs rebase
on Feb 15, 2019
l2a5b1 force-pushed
on Feb 15, 2019
l2a5b1
commented at 8:03 pm on February 15, 2019:
contributor
rebased d7ff8eb
DrahtBot added the label
Needs rebase
on Apr 10, 2019
l2a5b1 force-pushed
on Apr 10, 2019
DrahtBot removed the label
Needs rebase
on Apr 10, 2019
l2a5b1
commented at 5:38 pm on April 10, 2019:
contributor
rebased 388257f
DrahtBot added the label
Needs rebase
on Apr 17, 2019
l2a5b1 force-pushed
on Apr 18, 2019
DrahtBot removed the label
Needs rebase
on Apr 18, 2019
l2a5b1 force-pushed
on Apr 18, 2019
l2a5b1
commented at 8:09 pm on April 18, 2019:
contributor
rebased 9f811c3
DrahtBot added the label
Needs rebase
on Apr 30, 2019
l2a5b1 force-pushed
on May 1, 2019
DrahtBot removed the label
Needs rebase
on May 1, 2019
l2a5b1 force-pushed
on May 7, 2019
l2a5b1
commented at 3:20 pm on May 7, 2019:
contributor
rebased 402da1c
Travis build error is unrelated:
0Error! Initial build successful, but not enough time remains to run later build stages and tests.
1Please manually re-run this job by using the travis restart button or asking a bitcoin maintainer
2to restart. The next run should not time out because the build cache has been saved.
promag
commented at 3:28 pm on May 7, 2019:
member
DrahtBot added the label
Needs rebase
on Jun 6, 2019
l2a5b1 force-pushed
on Jun 6, 2019
DrahtBot removed the label
Needs rebase
on Jun 6, 2019
l2a5b1
commented at 7:45 pm on June 6, 2019:
contributor
Rebased 6186091
DrahtBot added the label
Needs rebase
on Jun 18, 2019
l2a5b1 force-pushed
on Jun 20, 2019
DrahtBot removed the label
Needs rebase
on Jun 20, 2019
l2a5b1
commented at 7:04 am on June 21, 2019:
contributor
rebased 47a758b
DrahtBot added the label
Needs rebase
on Jun 27, 2019
l2a5b1 force-pushed
on Jun 28, 2019
DrahtBot removed the label
Needs rebase
on Jun 28, 2019
l2a5b1 force-pushed
on Jul 2, 2019
l2a5b1
commented at 2:05 pm on July 3, 2019:
contributor
Rebased 5730465
DrahtBot added the label
Needs rebase
on Jul 24, 2019
l2a5b1 force-pushed
on Aug 21, 2019
Drops boost/algorithm/string/split.hpp
This commit drops the `boost/algorithm/string/split.hpp` dependency from
the project.
It replaces `boost::split` with a custom function `Split` that has
an identical API and returns exactly the same results as `boost::split`
to ease refactoring.
In addition this commit refactors an instance of `boost::algorithm::trim_right`
which was implicitly made available via the `boost/algorithm/string.hpp`
dependency to prevent having to introduce a new dependency
`boost/algorithm/string/trim.hpp`.
7905ea0f16
l2a5b1 force-pushed
on Aug 21, 2019
DrahtBot removed the label
Needs rebase
on Aug 21, 2019
DrahtBot
commented at 10:40 pm on October 10, 2019:
member
DrahtBot added the label
Needs rebase
on Oct 10, 2019
fanquake
commented at 7:19 pm on October 31, 2019:
member
Thanks for the work you’ve done here. However the project has decided that while it remains impossible to remove all of our Boost usage, we are no longer going to remove smaller components by rewriting them ourselves.
fanquake closed this
on Oct 31, 2019
jnewbery moved this from the "In progress" to the "Later" column in a project
l2a5b1
commented at 11:42 pm on October 31, 2019:
contributor
12:08 < wumpus> #topic close Boost -> C++11 migration project for now (wumpus)
12:09 < warren> too much change required?
12:09 < wumpus> so it looks like all the low-hanging fruit for replacing boost has been picked now
12:10 < wumpus> what remains is hard to replace with C++11 (results in very verbose code, locale dependent legacy C, or introduces harder to maintain platform-specific code)
12:10 < jeremyrubin> It’s basically just multiindex and boost::thread now right?
12:10 < wumpus> it’s kind of a time wasting trap for developers now (regarding PRs like #17245)
12:10 < gribble> #17245 | wallet: Remove Boost from DecodeDumpTime by elichai . Pull Request #17245 . bitcoin/bitcoin . GitHub
12:10 < wumpus> nah, people stumble over the sleep and date/time handling stuff every time
12:11 < wumpus> #17307 might still be worthwhile
12:11 < gribble> #17307 | Stop using boost::thread_group . Issue #17307 . bitcoin/bitcoin . GitHub
12:11 < jeremyrubin> What about just checking in those specific copies of boost library code
12:11 < sipa> after 17307, won’t removing boost::threa dbe a lot easier?
12:11 < jeremyrubin> Or are they too large/dependent on the rest of boost types
12:11 < wumpus> but it needs to be focused, we don’t want to close the same PRs time after time that don’t really make it
12:12 < wumpus> I think having that project open sends the wrong messag
12:12 < wumpus> we can’t replace boost right now
12:12 < jeremyrubin> 17307, having worked on replacing thread_group in the checkqueue before, scares me a bit
12:12 < sipa> agree there
12:12 < wumpus> there’s no need to replace, say, small string handling functions with our own impelentation, before we have a vision to replace the rest
12:13 < sipa> right; until we have a reasonable way to remove multiindex (which i’m not sure will ever happen), getting rid of boost entirely is not really useful
12:13 < dongcarl> Just so it’s out there… we can maybe run bcp at some point when there’s only one or two things left https://www.boost.org/doc/libs/1_71_0/tools/bcp/doc/html/index.html
12:13 < jeremyrubin> dongcarl: ++
12:13 < sipa> i do think there are individual improvements possible that dkn’t go all the way, like removing boost::thread
12:14 < jnewbery> I agree that if it’s not a priority then the project should be closed. It can always be re-opened later if necessary.
12:14 < wumpus> but in any case it doesn’t seem like it needs a big coordinated project anymore
12:14 < sipa> how many non-headers-only boost libs are we still using?
12:14 < sipa> wumpus: agree
12:14 < jnewbery> (leaving a comment in the project description explaining why it’s been closed)
12:14 < wumpus> when C++17 is allowed, it can be reopened
12:15 < fanquake> Guess #13751 can be closed with the same rationale as well then.
12:15 < gribble> #13751 | Utils and libraries: Drops the boost/algorithm/string/split.hpp dependency by l2a5b1 . Pull Request #13751 . bitcoin/bitcoin . GitHub
12:15 < wumpus> fanquake: yes
fanquake removed the label
Needs rebase
on Aug 20, 2020
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-01-21 21:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me