Update string and net utils for future HTTP operations #34905

pull pinheadmz wants to merge 4 commits into bitcoin:master from pinheadmz:http-utils-2 changing 12 files +84 −88
  1. pinheadmz commented at 6:32 PM on March 23, 2026: member

    This is a follow-up to #34242 and is the first few commits of #32061

    As review and refinement of the replacement HTTP server progresses, some new utilities were needed and added. This PR updates those utilities as work continues on #32061.

    LineReader

    In order to enforce strict limits on the total size of headers in HTTPRequest, we add a method to LineReader to give us the total amount of data that has been read from the buffer so far. See #32061 (review)

    CaseInsensitiveEqual

    HTTP headers are case-insensitive. An early version of #32061 used an unordered_map for this and therefore we needed a comparator struct. However that unordered_map was replaced by a simpler std::vector of std::pair so we can remove the struct and use methods that already exist in the codebase.

    StringToBytes

    StringToBuffer was introduced in #34242 to test LineReader but review of #32061 indicated that it would be more optimal to return a span of bytes instead of a vector. See #32061 (review)

    Split DynSock constructor for two usecases: listening / accepting sockets

    See #32061 (review). DynSock was introduced in #30988 and is not used anywhere in master yet. If it's used as a listening socket, it provides connected sockets. If it's used as a connected socket, it provides I/O pipes. By making the queue of connected sockets optional we can clean up the ownership / lifetime if the class members.

  2. DrahtBot commented at 6:32 PM on March 23, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK fjahr, vasild
    Stale ACK theStack, hodlinator

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34342 (cli: Replace libevent usage with simple http client by fjahr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  3. in src/test/util/str.cpp:11 in c65b981302 outdated
       6 | +#include <string>
       7 | +
       8 |  #include <test/util/str.h>
       9 |  
      10 | +/** Returns a span view of the string. */
      11 | +std::span<const std::byte> StringToBytes(std::string_view str LIFETIMEBOUND)
    


    fjahr commented at 8:53 PM on March 24, 2026:

    on c65b98130299d08361ca9ac669c8f3e210b410b6:

    Not sure about StringToBytes. Why can't you use MakeByteSpan from span.h? If StringToBytes is needed, why is it only used in the tests and not more widely because we have this std::as_bytes(std::span{str} pattern all over the code base. Personally, I am not sure we even need a funtion for this, it seems fine to just inline this and that seems to be the case all over the code base as well (though maybe it was written before MakeByteSpan). I think I would prefer either option over having another helper that duplicates code that exists elsewhere.

    If you want to keep it, it would be good to add a stronger rationale to the commit message :)


    vasild commented at 8:43 AM on March 25, 2026:

    No strong opinion, I would slightly prefer to use std directly:

    std::as_bytes(std::span{s});
    

    instead of

    StringToBytes(s);
    

    pinheadmz commented at 5:01 PM on March 27, 2026:

    Question about this: the utility is used with hard-coded string literals in the unit tests. Apparently (TIL) this enforces c-string and includes a null terminator. In other words:

            std::span<const std::byte> input{std::as_bytes(std::span{"1234"})};
            LineReader reader(input, /*max_line_length=*/128);
            BOOST_CHECK_EQUAL(reader.Consumed(), 0);
            BOOST_CHECK_EQUAL(reader.Remaining(), 4); // FAILS because there are 5 bytes
    

    That's fixed by forcing string_view:

            std::span<const std::byte> input{std::as_bytes(std::span{std::string_view("1234")})};
    

    And that's what we get from StringToBytes that we don't get from MakeByteSpan.

    So, options?

    1. Keep as-is
    2. Write the explicit bytes(span(view as above and delete the now-empty test/util/str.cpp
    3. Add something to span.h like an overload for MakeByteSpan() or new MakeByteSpanFromView() ?

    cc: @hodlinator


    hodlinator commented at 9:27 PM on March 27, 2026:

    It would be great if we could just use MakeByteSpan("c-string"), but as you uncovered, it would include the hidden null terminator. (Makes me itch for there to be a constraint on MakeByteSpan() to not accept string literals, char[N]). I lean towards something like option 3, putting it in span.h, but I prefer the name StringToBytes() over MakeByteSpanFromView().

    Tangent[^0]. [^0]: This got me looking at the string_view constructors. It appears that string_view{"c-string"} has linear complexity while "c-string"sv has constant complexity since the literal function receives the size (https://en.cppreference.com/w/cpp/string/basic_string_view/operator%2522%2522sv.html). But not sure the complexity makes any difference at all if construction can happen in a constexpr context with the string_view later being copied at runtime, not sure whether compilers are clever enough for that.</span>


    hodlinator commented at 9:30 PM on March 27, 2026:

    Maybe src/util/string.h would be a better home, if others agree on StringToBytes() being the least bad name.


    pinheadmz commented at 2:59 PM on March 31, 2026:

    Ok moved to string.h and kept the current name. This means test/util/str.cpp is empty and can be deleted.


    fjahr commented at 10:46 PM on April 1, 2026:

    Thanks for clarifying the rationale. I would have slightly preferred to have it in span.h next to MakeByteSpan() so that these options are more clearly linked and we don't see somebody trying MakeByteSpan(std::string_view(s)). But it's not a big deal so feel free to leave as is at this point.


    hodlinator commented at 12:34 PM on April 8, 2026:

    I tested adding a constraint to MakeByteSpan:

    template <typename V>
    auto MakeByteSpan(const V& v) noexcept
        // Use StringToBytes() to clearly ensure that null terminator at the end is
        // excluded. That would not be the case when converting a raw char array
        // directly to std::span.
        requires(!std::is_convertible_v<V, std::string_view>)
    {
        return std::as_bytes(std::span{v});
    }
    

    This lead me down the rabbit hole of having to use util::StringToBytes() instead of MakeByteSpan() in various serialization contexts. Which started breaking de-serialization of network messages as I understand it. Probably due to the null terminator no longer being included. :grimacing: Killed the experiment.

  4. in src/test/util/str.cpp:7 in 8e82b6c43f outdated
       3 | @@ -4,18 +4,3 @@
       4 |  
       5 |  #include <test/util/str.h>
       6 |  
       7 | -#include <cstdint>
    


    fjahr commented at 9:04 PM on March 24, 2026:

    Note to other reviewers: Yes this file is empty now. But something else is added here in the next commit. :)


    pinheadmz commented at 12:36 PM on March 31, 2026:

    removing file entirely now

  5. in src/test/util_string_tests.cpp:11 in 8e82b6c43f outdated
       3 | @@ -4,11 +4,11 @@
       4 |  
       5 |  #include <util/strencodings.h>
       6 |  #include <util/string.h>
       7 | -#include <vector>
       8 |  
       9 |  #include <boost/test/unit_test.hpp>
      10 |  #include <test/util/common.h>
      11 |  #include <tinyformat.h>
      12 | +#include <test/util/str.h>
    


    fjahr commented at 9:08 PM on March 24, 2026:

    nit: ordering


    pinheadmz commented at 3:11 PM on March 31, 2026:

    fixed the ordering here

  6. in src/test/util/str.cpp:5 in c65b981302 outdated
       1 | @@ -2,5 +2,13 @@
       2 |  // Distributed under the MIT software license, see the accompanying
       3 |  // file COPYING or http://www.opensource.org/licenses/mit-license.php.
       4 |  
       5 | +#include <cstdint>
    


    fjahr commented at 9:13 PM on March 24, 2026:

    I don't think cstdint is needed. Maybe this was supposed to be cstddef?


    fjahr commented at 9:19 PM on March 24, 2026:

    By IWYU would need to add #include <attributes.h>, #include <string_view> and #include <span> als well I think.


    pinheadmz commented at 2:59 PM on March 31, 2026:

    removing uneeded import

  7. in src/test/util/str.h:9 in c65b981302 outdated
       5 | @@ -6,6 +6,10 @@
       6 |  #define BITCOIN_TEST_UTIL_STR_H
       7 |  
       8 |  #include <string>
       9 | +#include <vector>
    


    fjahr commented at 9:32 PM on March 24, 2026:

    vector is unused here. But could add <cstddef> for std::byte here as well.

    Also s/span.h/span/.


    pinheadmz commented at 3:00 PM on March 31, 2026:

    removed this uneeded import

  8. in src/test/util/net.cpp:377 in 1da6da210c outdated
     373 | @@ -369,6 +374,7 @@ ssize_t DynSock::Send(const void* buf, size_t len, int) const
     374 |  
     375 |  std::unique_ptr<Sock> DynSock::Accept(sockaddr* addr, socklen_t* addr_len) const
     376 |  {
     377 | +    assert(m_accept_sockets && "Accept() called on non-listening DynSock");
    


    fjahr commented at 9:44 PM on March 24, 2026:

    nit: We don't usually use this pattern, I think Assert() may be more of a common choice here.


    vasild commented at 8:48 AM on March 25, 2026:

    Do you mean to s/assert/Assert/ and keep the && "error string" pattern? Or to change to Assert() and remove && "error string"?

    $ git grep ' [aA]ssert.*&&.*"'
    src/policy/fees/block_policy_estimator.cpp:184:    assert(_scale != 0 && "_scale must be non-zero");
    src/test/fuzz/threadpool.cpp:37:        assert(false && "Unexpected exception type");
    src/test/fuzz/util/net.cpp:140:    assert(false && "Move of Sock into FuzzedSock not allowed.");
    src/test/pcp_tests.cpp:101:        assert(false && "Move of Sock into PCPTestSock not allowed.");
    src/test/util/net.cpp:239:    assert(false && "Move of Sock into ZeroSock not allowed.");
    src/test/util/net.cpp:260:    assert(false && "Move of Sock into StaticContentsSock not allowed.");
    src/test/util/net.cpp:425:    assert(false && "Move of Sock into DynSock not allowed.");
    src/univalue/test/object.cpp:21:            assert(0 && "No exception caught"); \
    src/univalue/test/object.cpp:24:            assert(0 && "Wrong exception caught"); \
    src/util/subprocess.h:596:    assert (typ == PIPE && "STDOUT/STDERR not allowed");
    src/util/subprocess.h:629:    assert (typ == PIPE && "STDOUT/STDERR not allowed");
    src/util/subprocess.h:660:    assert ((typ == PIPE || typ == STDOUT) && "STDERR not allowed");
    src/wallet/wallet.cpp:765:        assert(copyFrom && "Oldest wallet transaction in range assumed to have been found.");
    

    the pattern && "error string" is indeed not widely used, but that alone is not a reason to avoid it. It is a convenient way to print a custom message if the assertion fails. Is there a particular reason this might be good to avoid?


    fjahr commented at 11:32 AM on March 25, 2026:

    Do you mean to s/assert/Assert/ and keep the && "error string" pattern? Or to change to Assert() and remove && "error string"?

    I meant to remove the string as well. Our usual pattern is to put such information into a comment next to the Assert() if necessary.

    the pattern && "error string" is indeed not widely used, but that alone is not a reason to avoid it. It is a convenient way to print a custom message if the assertion fails. Is there a particular reason this might be good to avoid?

    On Assert() vs assert(): The docs don't make it clear enough, but here is a comment that makes it clearer: #20255 (comment) In this particular case it doesn't make a practical difference I think (hence it's a nit) but of course there is some value in consistency across the code base. On the && pattern: I think it's not favored because it's a hack and the string needs to be maintained whereas Assert() should print the necessary context to start debugging automatically. At least that is my understanding.

    cc @maflcko , maybe you can make a stronger case here or simply clarify if I got something wrong


    pinheadmz commented at 5:06 PM on March 31, 2026:

    There's many examples of assert(... && "message") in this file already as well. I think the pattern makes sense especially since it is a test utility. If anything, it's just gonna make some future developer's workflow move slightly faster. Going to keep as is for now.

  9. fjahr commented at 9:48 PM on March 24, 2026: contributor

    Concept ACK

  10. vasild approved
  11. vasild commented at 8:50 AM on March 25, 2026: contributor

    ACK 1da6da210c0411bf9f78d0b8a9ce08376bdb769b

    Happy to re-review if @fjahr's comments are addressed.

  12. DrahtBot requested review from fjahr on Mar 25, 2026
  13. pinheadmz force-pushed on Mar 31, 2026
  14. pinheadmz commented at 5:54 PM on March 31, 2026: member

    push to d071d82baa

    Address feedback from @fjahr and @vasild with a little help from @hodlinator

    • Moved the StringToBytes utility to src/string.h and cleaned up the now-empty file in test/util
  15. in src/test/util_string_tests.cpp:10 in e67244df59
       3 | @@ -4,10 +4,10 @@
       4 |  
       5 |  #include <util/strencodings.h>
       6 |  #include <util/string.h>
       7 | -#include <vector>
       8 |  
       9 |  #include <boost/test/unit_test.hpp>
      10 |  #include <test/util/common.h>
      11 | +#include <test/util/str.h>
    


    theStack commented at 2:47 PM on April 1, 2026:

    nit: in the latest version, this include isn't needed anymore


    pinheadmz commented at 4:46 PM on April 6, 2026:

    fixed this thanks

  16. theStack approved
  17. theStack commented at 2:49 PM on April 1, 2026: contributor

    Code-review ACK d071d82baa84ac317596646cc9901e26c0f7c774

  18. DrahtBot requested review from vasild on Apr 1, 2026
  19. fjahr commented at 10:46 PM on April 1, 2026: contributor

    Code review ACK d071d82baa84ac317596646cc9901e26c0f7c774

  20. DrahtBot added the label Needs rebase on Apr 3, 2026
  21. pinheadmz force-pushed on Apr 6, 2026
  22. pinheadmz commented at 4:59 PM on April 6, 2026: member

    push to 91d457c8f7f7ebb460bd7cbe507ab7323de7ca0c

    • Rebase on master and fix conflict
    • Address IWYU nit from @theStack
  23. DrahtBot added the label CI failed on Apr 6, 2026
  24. DrahtBot removed the label Needs rebase on Apr 6, 2026
  25. DrahtBot removed the label CI failed on Apr 7, 2026
  26. fjahr commented at 9:19 PM on April 7, 2026: contributor

    re-ACK 91d457c8f7f7ebb460bd7cbe507ab7323de7ca0c

    Verified only change was addressing the include nit.

  27. DrahtBot requested review from theStack on Apr 7, 2026
  28. theStack approved
  29. theStack commented at 11:15 PM on April 7, 2026: contributor

    re-ACK 91d457c8f7f7ebb460bd7cbe507ab7323de7ca0c

  30. in src/util/string.h:312 in 43b32b8e85
     307 | + * Returns a span view of the string.
     308 | + *
     309 | + * @param[in] str    the string_view to interpret as bytes
     310 | + * @returns          span of std::byte
     311 | + */
     312 | +inline std::span<const std::byte> StringToBytes(std::string_view str LIFETIMEBOUND)
    


    hodlinator commented at 8:23 AM on April 8, 2026:

    43b32b8 test: util: implement and use StringToBytes instead of StringToBuffer:

    I experimented with removing StringToBytes() altogether from the full #32061 PR by instead introducing a LineReader(string_view, ...) constructor. Last two commits on top of the PR here: https://github.com/bitcoin/bitcoin/compare/master...hodlinator:bitcoin:pr/32061_no_StringToBytes

    It enables tighter code in general, aside from one instance of:

    std::span result{std::as_bytes(std::span{std::string_view{R"({"result":865793,"error":null,"id":null})"}})};
    res.m_body.assign(result.begin(), result.end());
    

    My ideal would be the middle ground of introducing StringToBytes() but to only use it inside an added LineReader(string_view, ...) constructor and later while initializing res.m_body. But it's no blocker.

    Also considered proximity with SpanReader but couldn't find a good way of merging them.


    pinheadmz commented at 3:24 PM on April 8, 2026:

    Ah yes great idea. Worth updating the PR again, will push shortly


    hodlinator commented at 7:28 PM on April 8, 2026:

    Cheers!

    My LLM thought that the ARM 32 failures (https://github.com/bitcoin/bitcoin/actions/runs/24145447958/job/70457971267?pr=34905) might come from #define input on that platform. But I got similar ARM 32 failures on my example branch: https://github.com/hodlinator/bitcoin/actions/runs/24124846191/job/70386891528#step:10:4894 for the variable missing_a_colon. Maybe switching to = or () initialization instead of {} will help.


    pinheadmz commented at 8:13 PM on April 8, 2026:

    Got the same analysis from my LLM 🤣 I don't know where the #define input would be coming from but there are some stackoverflow posts about gcc/clang differences around string_view braces initialization so that might explain it. Trying = ...

  31. hodlinator approved
  32. hodlinator commented at 8:27 AM on April 8, 2026: contributor

    crACK 91d457c8f7f7ebb460bd7cbe507ab7323de7ca0c

    Special thanks for the DynSock change.

  33. pinheadmz force-pushed on Apr 8, 2026
  34. pinheadmz commented at 4:04 PM on April 8, 2026: member

    push to f2af90ce462b03a5d9d77d41851dc57d6f4f8269

    Took @hodlinator suggestion to remove StringToBytes entirely and use a string_view constructor for LineReader for these string-literal input test cases

  35. DrahtBot added the label CI failed on Apr 8, 2026
  36. DrahtBot commented at 4:49 PM on April 8, 2026: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. <sub>Task 32 bit ARM: https://github.com/bitcoin/bitcoin/actions/runs/24145447958/job/70457971267</sub> <sub>LLM reason (✨ experimental): CI failed because the build stopped on -Werror=unused-value errors in src/test/util_string_tests.cpp (unused std::string_view input values).</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  37. util: get number of bytes consumed from buffer by LineReader 8172099293
  38. string: replace AsciiCaseInsensitiveKeyEqual with CaseInsensitiveEqual
    This reverts commit eea38787b9be99c3f192cb83fc18358397e4ab52 from PR #34242
    
    We do not need comparators for HTTPHeaders since it is not using unordered_map anymore.
    We only need a simple, locale-independent, ascii-only compare function for
    a vector of key-value pairs of strings.
    
    We have CaseInsensitiveEqual already in test utils, this commit moves
    it to the strencodings module for use in the application code.
    b0ca400612
  39. util/test: Add string_view constructor to LineReader and remove StringToBuffer
    Co-Authored by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    3de02abf3f
  40. Make DynSock accepted sockets queue optional, with precise lifetime
    When DynSock is used to represent a connected socket (e.g. a client)
    the data I/O pipes are needed but not the m_accepted_sockets Queue,
    because connected sockets do not create more connected sockets.
    
    When DynSock is used to represent a listening socket, the Queue
    is necessary to create connected sockets upon mocked connection, but
    the Queue does not need to be a std::shared_ptr as long as it
    is guaranteed to live as long as the DynSock.
    
    Co-Authored by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    0e712b3812
  41. pinheadmz force-pushed on Apr 8, 2026
  42. pinheadmz commented at 8:14 PM on April 8, 2026: member

    Push to 0e712b3812

    • Fix string_view initialization in unit tests to make 32-bit ARM CI happy
  43. fjahr commented at 8:39 PM on April 8, 2026: contributor

    Code review ACK 0e712b3812d56a65c03c44016a3181fe979cf69f

    Nice, I like this latest version of the string_view saga more than the previously discussed solutions.

  44. DrahtBot requested review from theStack on Apr 8, 2026
  45. DrahtBot requested review from hodlinator on Apr 8, 2026
  46. DrahtBot removed the label CI failed on Apr 8, 2026
  47. vasild approved
  48. vasild commented at 4:50 AM on April 9, 2026: contributor

    ACK 0e712b3812d56a65c03c44016a3181fe979cf69f

  49. fanquake merged this on Apr 9, 2026
  50. fanquake closed this on Apr 9, 2026


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me