rest: Reject truncated hex txid early in getutxos parsing #30482

pull maflcko wants to merge 7 commits into bitcoin:master from maflcko:2407-rest-txid changing 16 files +98 −82
  1. maflcko commented at 7:22 am on July 19, 2024: member

    In rest_getutxos truncated txids such as aa or ff are accepted. This is brittle at best.

    Fix it by rejecting any truncated (or overlarge) input.


    Review note: This also starts a major refactor to rework hex parsing in Bitcoin Core, meaning that a few refactor commits are included as well. They are explained individually in the commit message and the work will be continued in the future.

  2. DrahtBot commented at 7:22 am on July 19, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK stickies-v, hodlinator

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot renamed this:
    rest: Reject truncated hex txid early in getutxos parsing
    rest: Reject truncated hex txid early in getutxos parsing
    on Jul 19, 2024
  4. DrahtBot added the label RPC/REST/ZMQ on Jul 19, 2024
  5. DrahtBot added the label CI failed on Jul 19, 2024
  6. DrahtBot commented at 9:14 am on July 19, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/27652526405

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. in src/test/pow_tests.cpp:89 in 0d57f058e4 outdated
    85@@ -86,7 +86,7 @@ BOOST_AUTO_TEST_CASE(CheckProofOfWork_test_negative_target)
    86     uint256 hash;
    87     unsigned int nBits;
    88     nBits = UintToArith256(consensus.powLimit).GetCompact(true);
    89-    hash.SetHex("0x1");
    90+    hash = uint256{1};
    


    stickies-v commented at 10:36 am on July 19, 2024:

    nit: or just initialize directly?

     0diff --git a/src/test/pow_tests.cpp b/src/test/pow_tests.cpp
     1index efca540240..0a9137636a 100644
     2--- a/src/test/pow_tests.cpp
     3+++ b/src/test/pow_tests.cpp
     4@@ -83,31 +83,28 @@ BOOST_AUTO_TEST_CASE(get_next_work_upper_limit_actual)
     5 BOOST_AUTO_TEST_CASE(CheckProofOfWork_test_negative_target)
     6 {
     7     const auto consensus = CreateChainParams(*m_node.args, ChainType::MAIN)->GetConsensus();
     8-    uint256 hash;
     9+    uint256 hash{1};
    10     unsigned int nBits;
    11     nBits = UintToArith256(consensus.powLimit).GetCompact(true);
    12-    hash = uint256{1};
    13     BOOST_CHECK(!CheckProofOfWork(hash, nBits, consensus));
    14 }
    15 
    16 BOOST_AUTO_TEST_CASE(CheckProofOfWork_test_overflow_target)
    17 {
    18     const auto consensus = CreateChainParams(*m_node.args, ChainType::MAIN)->GetConsensus();
    19-    uint256 hash;
    20+    uint256 hash{1};
    21     unsigned int nBits{~0x00800000U};
    22-    hash = uint256{1};
    23     BOOST_CHECK(!CheckProofOfWork(hash, nBits, consensus));
    24 }
    25 
    26 BOOST_AUTO_TEST_CASE(CheckProofOfWork_test_too_easy_target)
    27 {
    28     const auto consensus = CreateChainParams(*m_node.args, ChainType::MAIN)->GetConsensus();
    29-    uint256 hash;
    30+    uint256 hash{1};
    31     unsigned int nBits;
    32     arith_uint256 nBits_arith = UintToArith256(consensus.powLimit);
    33     nBits_arith *= 2;
    34     nBits = nBits_arith.GetCompact();
    35-    hash = uint256{1};
    36     BOOST_CHECK(!CheckProofOfWork(hash, nBits, consensus));
    37 }
    38 
    

    maflcko commented at 2:00 pm on July 19, 2024:
    Thanks, but I’ll leave it as-is for now.

    hodlinator commented at 11:40 am on July 24, 2024:
    Man, I was worried for a moment that this modification of switching to the uint8_t ctor was setting the other end of the base_blob array, but it’s the same. Will be glad to see the fragile function go. :+1:
  8. in src/uint256.h:68 in 0efc4f1afb outdated
    63+     * Writes the hex string into the base_blob and returns true iff all of the
    64+     * checks pass, false otherwise:
    65+     *   - All characters are hex
    66+     *   - Input length is WIDTH*2
    67+     */
    68+    [[nodiscard]] bool SetHex(std::string_view str)
    


    stickies-v commented at 12:50 pm on July 19, 2024:
    review note nit: perhaps worth adding in the commit message that this implementation is ~identical to ParseHashStr (even though it’s later deleted)?

    maflcko commented at 2:00 pm on July 19, 2024:
    Thanks, edited commit message
  9. in src/rest.cpp:801 in 5ac06abeb9 outdated
    791@@ -792,13 +792,14 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
    792             if (txid_out.size() != 2) {
    793                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    794             }
    795+            Txid txid;
    796             auto output{ToIntegral<uint32_t>(txid_out.at(1))};
    797 
    798-            if (!output || !IsHex(txid_out.at(0))) {
    799+            if (!txid.SetHex(txid_out.at(0)) || !output ) {
    


    stickies-v commented at 12:52 pm on July 19, 2024:

    Perhaps worth it to introduce the FromHex method you mentioned here already, as it’s not much extra LoC and probably makes for a more final interface? We can just do it for transaction_identifier now and do base_blob later?

     0diff --git a/src/rest.cpp b/src/rest.cpp
     1index b0d0d928f6..ff06c4191f 100644
     2--- a/src/rest.cpp
     3+++ b/src/rest.cpp
     4@@ -792,14 +792,14 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
     5             if (txid_out.size() != 2) {
     6                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
     7             }
     8-            Txid txid;
     9             auto output{ToIntegral<uint32_t>(txid_out.at(1))};
    10 
    11-            if (!txid.SetHex(txid_out.at(0)) || !output ) {
    12+            auto txid {Txid::FromHex(txid_out.at(0))};
    13+            if (!txid || !output ) {
    14                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    15             }
    16 
    17-            vOutPoints.emplace_back(txid, *output);
    18+            vOutPoints.emplace_back(txid.value(), *output);
    19         }
    20 
    21         if (vOutPoints.size() > 0)
    22diff --git a/src/util/transaction_identifier.h b/src/util/transaction_identifier.h
    23index c1b46ea39c..68182b6332 100644
    24--- a/src/util/transaction_identifier.h
    25+++ b/src/util/transaction_identifier.h
    26@@ -37,6 +37,11 @@ public:
    27     bool operator<(const Other& other) const { return Compare(other) < 0; }
    28 
    29     const uint256& ToUint256() const LIFETIMEBOUND { return m_wrapped; }
    30+    static std::optional<transaction_identifier> FromHex(std::string_view hex)
    31+    {
    32+        transaction_identifier result;
    33+        return result.SetHex(hex) ? std::make_optional(result) : std::nullopt;
    34+    }
    35     static transaction_identifier FromUint256(const uint256& id) { return {id}; }
    36 
    37     /** Wrapped `uint256` methods. */
    

    maflcko commented at 1:49 pm on July 19, 2024:
    On a second thought, I think the SetHex is preferable in this case, because the code using std::optional requires more typing (minimally more) when parsing and when unwrapping. Given the [[nodiscard]] approach, both should be equally safe, so I picked the approach using less code.

    hodlinator commented at 7:16 pm on July 21, 2024:

    (In response to #30482 (review)): To see how FromHex would look, I implemented a commit on top of this PR that does that transformation: 404c115ea0f53785a640818c2058ef0591c8c6ac

    It turned out well enough that I hope you will at least consider it. Could use more auto to reduce typing. In my mind it’s worth sacrificing some typing and a slightly larger diff to get clarity and RAII.


    maflcko commented at 2:31 pm on July 23, 2024:
    Ok, done, went with my original suggestion :sweat_smile:
  10. stickies-v commented at 1:02 pm on July 19, 2024: contributor

    Approach ACK 9984bdc83b4a187629f26f35e4ee00e35d2ac04d

    Straightforward improvement to make the code more robust, very nice.

  11. maflcko force-pushed on Jul 19, 2024
  12. DrahtBot removed the label CI failed on Jul 19, 2024
  13. in src/util/transaction_identifier.h:75 in 091f6aa17d outdated
    66@@ -66,9 +67,10 @@ using Txid = transaction_identifier<false>;
    67 /** Wtxid commits to all transaction fields including the witness. */
    68 using Wtxid = transaction_identifier<true>;
    69 
    70+/** DEPRECATED due to missing length-check and hex-check, please use the safer SetHex, or FromUint256 */
    


    paplorinc commented at 11:21 am on July 20, 2024:

    We might as well use the [[deprecated]] attribute instead:

    0[[deprecated("Missing length-check and hex-check, please use the safer SetHex, or FromUint256")]]
    

    Resulting in compilation warnings for the remaining usages:

    bitcoin/src/qt/coincontroldialog.cpp:345:25: warning: ‘TxidFromString’ is deprecated: Missing length-check and hex-check, please use the safer SetHex, or FromUint256 [-Wdeprecated-declarations] COutPoint outpt(TxidFromString(item->data(COLUMN_ADDRESS, TxHashRole).toString().toStdString()), item->data(COLUMN_ADDRESS, VOutRole).toUInt());


    maflcko commented at 6:40 am on July 22, 2024:
    The build logs may be too verbose then, but I’ll reconsider it in a follow-up, when more work has landed.

    stickies-v commented at 10:11 am on July 22, 2024:
    I don’t see the point of bloating the build logs with this, a tracking issue seems more appropriate if we’re worried it’ll be forgotten.

    paplorinc commented at 10:45 am on July 22, 2024:
    I don’t see why silencing warnings and putting them to dead comments that cannot bite back is better than “bloating the logs”

    maflcko commented at 11:49 am on July 22, 2024:
    Again, I may fix this in a follow-up. Right now the focus is on the bugfix explained in the pull request description.

    paplorinc commented at 12:09 pm on July 22, 2024:
    👍, I follow-up PR makes sense here
  14. in src/rest.cpp:221 in 091f6aa17d outdated
    217@@ -218,8 +218,9 @@ static bool rest_headers(const std::any& context,
    218     }
    219 
    220     uint256 hash;
    221-    if (!ParseHashStr(hashStr, hash))
    222+    if (!hash.SetHex(hashStr)) {
    


    paplorinc commented at 11:25 am on July 20, 2024:

    When a method both mutates and returns, we often extract it, to clarify that it’s not just a simple pure check.

    0    bool isValid = hash.SetHex(hashStr);
    1    if (!isValid) {
    

    maflcko commented at 6:42 am on July 22, 2024:

    isValid

    For new code, that’d need to be snake_case. Other than that, I think it is clear that a mutable method called Set...(...) mutates, so there is no need to change the code here. Also, the dev guide doesn’t mention it.

  15. in src/rest.cpp:801 in 091f6aa17d outdated
    794@@ -792,13 +795,14 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
    795             if (txid_out.size() != 2) {
    796                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    797             }
    798+            Txid txid;
    799             auto output{ToIntegral<uint32_t>(txid_out.at(1))};
    800 
    801-            if (!output || !IsHex(txid_out.at(0))) {
    802+            if (!txid.SetHex(txid_out.at(0)) || !output ) {
    


    paplorinc commented at 11:31 am on July 20, 2024:
    What’s the reason for inverting the clauses, previously if !output we short-circuited the check, now we’re attempting to set the hex value even when output is empty.

    maflcko commented at 6:44 am on July 22, 2024:
    This shouldn’t make a difference in the happy path. Optimizing for the non-happy path seems useless, as no user should be using it (outside of a few test cases).

    paplorinc commented at 10:06 am on July 22, 2024:
    Sure, but you did change it, so unless there’s any particular reason, I think the original order makes more sense

    maflcko commented at 2:32 pm on July 23, 2024:
    The format string is txid_num, so I kept the order.
  16. in src/uint256.cpp:25 in 091f6aa17d outdated
    28-        psz++;
    29-
    30-    // skip 0x
    31-    if (psz[0] == '0' && ToLower(psz[1]) == 'x')
    32-        psz += 2;
    33+    str = util::TrimStringView(str);
    


    paplorinc commented at 11:38 am on July 20, 2024:
  17. in src/uint256.cpp:23 in 091f6aa17d outdated
    17@@ -18,39 +18,31 @@ std::string base_blob<BITS>::GetHex() const
    18 }
    19 
    20 template <unsigned int BITS>
    21-void base_blob<BITS>::SetHex(const char* psz)
    22+void base_blob<BITS>::SetHexFragile(std::string_view str)
    23 {
    24     std::fill(m_data.begin(), m_data.end(), 0);
    


    paplorinc commented at 11:43 am on July 20, 2024:

    We shouldn’t need zeroing, if we’re processing the values exhaustively - but if we do, we could use

    0    SetNull();
    

    maflcko commented at 6:58 am on July 22, 2024:
    I am not touching this line of code, so it seems unrelated to refactor it.

    paplorinc commented at 11:03 am on July 22, 2024:
    posted to #30436, please resolve it here
  18. in src/uint256.cpp:27 in 091f6aa17d outdated
    23 {
    24     std::fill(m_data.begin(), m_data.end(), 0);
    25 
    26-    // skip leading spaces
    27-    while (IsSpace(*psz))
    28-        psz++;
    


    paplorinc commented at 11:44 am on July 20, 2024:
    previously we’ve only removed leading whitespaces, but TrimStringView removed trailing ones as well. Not sure if this changes the overall behavior, though.

    maflcko commented at 6:46 am on July 22, 2024:
    It does not, because trailing non-hex was always skipped.
  19. in src/util/string.h:87 in 091f6aa17d outdated
    82@@ -83,8 +83,8 @@ std::vector<T> Split(const Span<const char>& sp, char sep)
    83 
    84 [[nodiscard]] inline std::string_view RemovePrefixView(std::string_view str, std::string_view prefix)
    85 {
    86-    if (str.substr(0, prefix.size()) == prefix) {
    87-        return str.substr(prefix.size());
    88+    if (str.starts_with(prefix)) {
    89+        str.remove_prefix(prefix.size());
    


    paplorinc commented at 11:50 am on July 20, 2024:

    I find it quite confusing that we’re sneaking in a mutation of the copy, while previously we just returned a copy directly (like TrimStringView). We could const the parameters if we did the same as before:

    0        return str.substr(prefix.size());
    

    i.e.

    0[[nodiscard]] inline std::string_view RemovePrefixView(const std::string_view str, const std::string_view prefix)
    

    Could you please add a test case that checks the integrity of the original:

    0    auto original = "0x123";
    1    BOOST_CHECK_EQUAL(RemovePrefixView(original, "0x"), "123");
    2    BOOST_CHECK_EQUAL(original, "0x123"); // Ensure original string is intact
    

    Same for TrimStringView, just to be sure.


    maflcko commented at 6:50 am on July 22, 2024:
    Should probably be done in #30482 (comment), or in a separate pull reqeust. (The method already exists, and if the tests do not exists, one can add them today on master).

    paplorinc commented at 11:03 am on July 22, 2024:
    posted to #30436, please resolve it here
  20. in src/uint256.h:74 in 091f6aa17d outdated
    71+    {
    72+        if (WIDTH * 2 != str.size() || !IsHex(str)) return false;
    73+        SetHexFragile(str);
    74+        return true;
    75+    }
    76+    /** Like SetHex, but accepting any invalid input, thus fragile and deprecated */
    


    paplorinc commented at 1:22 pm on July 20, 2024:
    0    [[deprecated("Accepting any invalid input, thus fragile, please use the safer SetHex instead")]]
    

    maflcko commented at 6:57 am on July 22, 2024:
    (Same as above)
  21. in src/bitcoin-tx.cpp:268 in 091f6aa17d outdated
    264@@ -265,7 +265,7 @@ static void MutateTxAddInput(CMutableTransaction& tx, const std::string& strInpu
    265 
    266     // extract and validate TXID
    267     uint256 txid;
    268-    if (!ParseHashStr(vStrInputParts[0], txid)) {
    269+    if (!txid.SetHex(vStrInputParts[0])) {
    


    paplorinc commented at 2:04 pm on July 20, 2024:
    SetHex sounds like we’re setting the field hex to the given parameter. But this isn’t a simple setter, it parses and makes decisions (e.g. trim, skip values after an invalid char), so it’s more like a factory method, right? In other cases we prefix these with From instead, see from_hex

    maflcko commented at 6:52 am on July 22, 2024:
    No, it is not a factory method, because it does not return an object. I kept the previous name, but I am happy to change it, if there is a reason.

    paplorinc commented at 10:11 am on July 22, 2024:
    The reason is that it’s not a simple setter, but a complex method that changes the internal state.

    maflcko commented at 7:11 am on July 23, 2024:

    it parses and makes decisions (e.g. trim, skip values after an invalid char),

    No, it does not trim or skip values. Either is sets the internal state, or it does not.


    paplorinc commented at 9:49 am on July 23, 2024:
    will leave it up to you, but if I see a Set.* I assume it’s a property setter (which has an equivalent Getter), not that something is calculated and internal state (maybe) modified accordingly.

    maflcko commented at 9:53 am on July 23, 2024:

    will leave it up to you, but if I see a Set.* I assume it’s a property setter (which has an equivalent Getter)

    Yes, there is a corresponding GetHex.


    paplorinc commented at 10:09 am on July 23, 2024:
    You mean it’s a “virtual property” - I can work with that, thanks, resolve it please. (though it’s weird that the getter seems to reverse the hex back to big-endian, weird, but I guess that’s for another PR)
  22. in src/uint256.h:63 in 091f6aa17d outdated
    72+        if (WIDTH * 2 != str.size() || !IsHex(str)) return false;
    73+        SetHexFragile(str);
    74+        return true;
    75+    }
    76+    /** Like SetHex, but accepting any invalid input, thus fragile and deprecated */
    77+    void SetHexFragile(std::string_view str);
    


    paplorinc commented at 2:10 pm on July 20, 2024:
    This isn’t actually fragile (i.e. breaks easily), quite the opposite, it’s rather lenient, i.e. accepts unvalidated input. Could we name it as FromHexUnchecked or FromHexLenient instead?

    maflcko commented at 6:57 am on July 22, 2024:
    With “fragile” I mean that it breaks code that uses it and leads to bugs. See for example the bug that was fixed in this pull request.

    paplorinc commented at 10:09 am on July 22, 2024:
    I understood, that’s why I recommended the name change. We don’t say trim is fragile, just because more values become valid after they’re trimmed, we call it lenient, right?

    maflcko commented at 11:48 am on July 22, 2024:

    because more values become valid

    Those values are not “valid”. For example, why would turning aa into 00aa, or aa00 be valid? Both should be rejected and code using it is fragile.

    Happy to rename it to SetHexUnchecked, but I don’t think it matters much, as the code is deprecated anyway.


    paplorinc commented at 12:08 pm on July 22, 2024:
    I think it would help a bit, but I’m more concerned about the setter name, when in fact it’s a complex method, changing internal state, it’s why I recommended something like FromHexUnchecked (the naming suggestion also applies to the non-deprected method calling this, of course).

    hodlinator commented at 8:17 am on July 23, 2024:

    -Fragile feels slightly off to me as well. -Unchecked seems to imply that the function itself may crash. -Lenient feels closer to the truth but sounds safe to use.

    I recently came across MakeRandDeterministicDANGEROUS(), maybe something like that suffix would work?


    maflcko commented at 8:24 am on July 23, 2024:

    DANGEROUS

    That may also imply the function is crashing or has some internal hidden danger, which seems odd because it simply lacks a return value to indicate success or failure? Maybe just ...Deprecated()?


    hodlinator commented at 8:40 am on July 23, 2024:

    It is dangerous because input isn’t length-checked, meaning extra hex chars may be ignored, and an accidentally short copy-paste of hex chars will be silently padded with zeroes.

    Maybe just …Deprecated()?

    As in #30482?notification_referrer_id=NT_kwDOCkdNarUxMTU3NjYyODE4MzoxNzI0NDUwMzQ#discussion_r1685726873, I wonder what the policy around deprecation is. Are there prior examples of deprecation on a non-RPC level you can point to?


    maflcko commented at 8:48 am on July 23, 2024:

    Only src/util/time.h, but those are just type-related; And src/logging.h, but those are just aliases; So I don’t think they apply here.

    Happy to rename it to SetHexDeprecated, but I don’t think it matters much, as the code is deprecated anyway.


    paplorinc commented at 9:41 am on July 23, 2024:
    But production code calls it indirectly, so it’s not exactly deprecated, just direct access is discouraged, right?

    maflcko commented at 9:42 am on July 23, 2024:
    Yes, happy to rename it to SetHexDiscouraged, but I don’t think it matters much, as the code is deprecated to be called directly anyway.

    paplorinc commented at 9:45 am on July 23, 2024:
    I’m fine with that if you are

    hodlinator commented at 11:51 am on July 23, 2024:

    Only src/util/time.h, but those are just type-related; And src/logging.h, but those are just aliases; So I don’t think they apply here.

    Thanks, those are good examples. I guess they are/were used more widely and directly compared to SetHex(), making deprecation a more obvious approach. If you deem this sufficiently big to break up into multiple PRs, I trust your judgement.

    SetHexDeprecated or SetHexDiscouraged work for me.

  23. in src/test/uint256_tests.cpp:156 in 091f6aa17d outdated
    152@@ -155,18 +153,20 @@ BOOST_AUTO_TEST_CASE( comparison ) // <= >= < >
    153     BOOST_CHECK( R2S < MaxS );
    154 }
    155 
    156-BOOST_AUTO_TEST_CASE( methods ) // GetHex SetHex begin() end() size() GetLow64 GetSerializeSize, Serialize, Unserialize
    157+BOOST_AUTO_TEST_CASE( methods ) // GetHex SetHexFragile begin() end() size() GetLow64 GetSerializeSize, Serialize, Unserialize
    


    paplorinc commented at 2:32 pm on July 20, 2024:
    shouldn’t this remain SetHex here?

    maflcko commented at 7:16 am on July 23, 2024:
    Thanks, fixed the comment in this test!

    paplorinc commented at 9:45 am on July 23, 2024:
    you may want to revert it in the commit that introduced it instead

    maflcko commented at 9:54 am on July 23, 2024:

    you may want to revert it in the commit that introduced it instead

    It is a scripted-diff, so it is correct and self-consistent, no?


    paplorinc commented at 9:57 am on July 23, 2024:
    ah, got it, sure!
  24. in src/test/uint256_tests.cpp:100 in 091f6aa17d outdated
     95@@ -102,6 +96,8 @@ BOOST_AUTO_TEST_CASE( basics ) // constructors, equality, inequality
     96     BOOST_CHECK(uint256S("0x"+MaxL.ToString()) == MaxL);
     97     BOOST_CHECK(uint256S(R1L.ToString()) == R1L);
     98     BOOST_CHECK(uint256S("   0x"+R1L.ToString()+"   ") == R1L);
     99+    BOOST_CHECK(uint256S("   0x"+R1L.ToString()+"-trash;%^&   ") == R1L);
    100+    BOOST_CHECK(uint256S("\t \n  \n \f\n\r\t\v\t   0x"+R1L.ToString()+"  \t \n  \n \f\n\r\t\v\t ") == R1L);
    


    paplorinc commented at 2:34 pm on July 20, 2024:

    I see that BOOST_CHECK was use throughout, but we could use BOOST_CHECK_EQUAL in a few places for better error messages:

    0    BOOST_CHECK_EQUAL(uint256S("\t \n  \n \f\n\r\t\v\t   0x"+R1L.ToString()+"  \t \n  \n \f\n\r\t\v\t "), R1L);
    

    maflcko commented at 6:58 am on July 22, 2024:
    Happy to review a pull request doing this, but I won’t be doing it myself.

    paplorinc commented at 11:02 am on July 22, 2024:
    posted to #30436, please resolve it here
  25. in src/uint256.cpp:40 in 091f6aa17d outdated
    47     while (digits > 0 && p1 < pend) {
    48-        *p1 = ::HexDigit(psz[--digits]);
    49+        *p1 = ::HexDigit(str[--digits]);
    50         if (digits > 0) {
    51-            *p1 |= ((unsigned char)::HexDigit(psz[--digits]) << 4);
    52+            *p1 |= ((unsigned char)::HexDigit(str[--digits]) << 4);
    


    paplorinc commented at 4:29 pm on July 20, 2024:
    I see this uses little-endian encoding, while e.g. TryParseHex uses big-endian, right? Do you happen to know what the reason is? Should we maybe document this?

    maflcko commented at 7:00 am on July 22, 2024:
    I am not changing the endian behavior in this instance, so maybe a separate pull can improve the documentation?

    paplorinc commented at 11:03 am on July 22, 2024:
    posted to #30436, please resolve it here
  26. in src/uint256.cpp:34 in 091f6aa17d outdated
    39-        digits++;
    40+    for (const char c : str) {
    41+        if (::HexDigit(c) == -1)
    42+            break;
    43+        ++digits;
    44+    }
    


    paplorinc commented at 4:48 pm on July 20, 2024:

    previous version was cleaner in my opinion. Since str is random access, we could simplify to:

    0    size_t digits = 0;
    1    while (digits < trimmed.size() && ::HexDigit(trimmed[digits]) != -1) ++digits;
    

    maflcko commented at 7:01 am on July 22, 2024:

    paplorinc commented at 11:03 am on July 22, 2024:
    posted to #30436, please resolve it here
  27. in src/uint256.cpp:21 in 091f6aa17d outdated
    17@@ -18,39 +18,31 @@ std::string base_blob<BITS>::GetHex() const
    18 }
    19 
    20 template <unsigned int BITS>
    21-void base_blob<BITS>::SetHex(const char* psz)
    22+void base_blob<BITS>::SetHexFragile(std::string_view str)
    


    paplorinc commented at 4:49 pm on July 20, 2024:

    applying the suggestions below would look like this:

     0template <unsigned int BITS>
     1void base_blob<BITS>::SetHexFragile(const std::string_view str)
     2{
     3    const auto trimmed = util::RemovePrefixView(util::TrimStringView(str), "0x");
     4
     5    size_t digits = 0;
     6    while (digits < trimmed.size() && ::HexDigit(trimmed[digits]) != -1) ++digits;
     7
     8    for (auto it = m_data.begin(); it < m_data.end(); ++it) {
     9        unsigned char temp = (digits > 0) ? ::HexDigit(trimmed[--digits]) : 0;
    10        *it = (digits > 0) ? temp | (::HexDigit(trimmed[--digits]) << 4) : temp;
    11    }
    12}
    

    maflcko commented at 7:01 am on July 22, 2024:

    paplorinc commented at 12:08 pm on July 22, 2024:
    posted to #30436, please resolve it here
  28. in src/test/uint256_tests.cpp:120 in 801404719d outdated
    116@@ -115,6 +117,8 @@ BOOST_AUTO_TEST_CASE( basics ) // constructors, equality, inequality
    117     BOOST_CHECK(uint160S("0x"+MaxS.ToString()) == MaxS);
    118     BOOST_CHECK(uint160S(R1S.ToString()) == R1S);
    119     BOOST_CHECK(uint160S("   0x"+R1S.ToString()+"   ") == R1S);
    120+    BOOST_CHECK(uint160S("   0x"+R1S.ToString()+"-trash;%^&   ") == R1S);
    


    paplorinc commented at 4:54 pm on July 20, 2024:

    let’s add a test where TryParseHash would behave differently, e.g. "0123456789 ABCDEF" where TryParseHex continues, SetHexFragile stops:

    0TryParseHex: 0123456789abcdef
    1SetHexFragile: efcdab8967452301000000000000000000000000000000000000000000000000
    

    maflcko commented at 6:55 am on July 22, 2024:

    TryParseHex

    This pull request doesn’t modify (or mention) this helper, so I think it is unrelated. Also, the test already exist in:

    0src/test/util_tests.cpp:    result = ParseHex("12 34 56 78");
    1src/test/util_tests.cpp:    result = ParseHex(" 89 34 56 78");
    2src/test/util_tests.cpp:    result = ParseHex("     Ff        aA    ");
    3src/test/util_tests.cpp:    result = ParseHex("");
    4src/test/util_tests.cpp:    BOOST_CHECK_EQUAL(ParseHex("AAF F").size(), 0);
    5src/test/util_tests.cpp:    BOOST_CHECK_EQUAL(ParseHex(with_embedded_null).size(), 0);
    6src/test/util_tests.cpp:    BOOST_CHECK_EQUAL(ParseHex("1234 invalid 1234").size(), 0);
    7src/test/util_tests.cpp:    BOOST_CHECK_EQUAL(ParseHex("12 3").size(), 0);
    

    paplorinc commented at 11:02 am on July 22, 2024:
    posted to #30436, please resolve it here
  29. paplorinc changes_requested
  30. paplorinc commented at 5:02 pm on July 20, 2024: contributor
    Nice cleanup, please see my questions and observations
  31. in src/util/transaction_identifier.h:78 in 091f6aa17d outdated
    66@@ -66,9 +67,10 @@ using Txid = transaction_identifier<false>;
    67 /** Wtxid commits to all transaction fields including the witness. */
    68 using Wtxid = transaction_identifier<true>;
    69 
    70+/** DEPRECATED due to missing length-check and hex-check, please use the safer SetHex, or FromUint256 */
    71 inline Txid TxidFromString(std::string_view str)
    72 {
    73-    return Txid::FromUint256(uint256S(str.data()));
    74+    return Txid::FromUint256(uint256S(str));
    


    hodlinator commented at 12:22 pm on July 21, 2024:

    What is your reasoning for deprecating rather than removing and correcting call-sites to not use TxidFromString - is that part of why this PR is a draft? I can’t see many other deprecated things in the codebase beyond RPCs.

    If you keep it, TxidFromString should probably be renamed TxidFromStringFragile to follow SetHexFragile, but I understand you want to avoid touching src/qt/ source files as much as possible. Maybe one could at least change the implementation to make it clearer that it is “fragile” there?

    0    uint256 wrapped;
    1    wrapped.SetHexFragile(str);
    2    return Txid::FromUint256(wrapped);
    

    maflcko commented at 7:05 am on July 22, 2024:

    What is your reasoning for deprecating rather than removing and correcting call-sites

    I don’t think fixing all bugs in all places in a single pull request makes sense. Often, context specific fixes are needed, so separate pull requests can be used.

    is that part of why this PR is a draft?

    It is draft, because it includes a merge commit of other commits.

    If you keep it, TxidFromString should probably be renamed TxidFromStringFragile to follow SetHexFragile

    Sure, but this will be done in a follow-up.


    paplorinc commented at 9:42 am on July 23, 2024:
    Makes sense to do the deprecation removal in a separate PR
  32. hodlinator commented at 1:42 pm on July 21, 2024: contributor
    @paplorinc the first 3 commits here come from pending PR #30436, so comments on those changes are probably better to make there.
  33. in src/rest.cpp:622 in 091f6aa17d outdated
    618@@ -617,7 +619,7 @@ static bool rest_deploymentinfo(const std::any& context, HTTPRequest* req, const
    619 
    620         if (!hash_str.empty()) {
    621             uint256 hash;
    622-            if (!ParseHashStr(hash_str, hash)) {
    623+            if (!hash.SetHex(hash_str)) {
    


    hodlinator commented at 7:21 pm on July 21, 2024:
    Noticed ParseHashV(hash_str, ...) is needlessly called a few lines below instead of using the already verified hash value.

    maflcko commented at 7:23 am on July 23, 2024:
    Nice, removed!
  34. hodlinator commented at 7:46 pm on July 21, 2024: contributor
    Good to see the added functional test conditions for truncated txids!
  35. maflcko force-pushed on Jul 23, 2024
  36. maflcko force-pushed on Jul 23, 2024
  37. maflcko force-pushed on Jul 23, 2024
  38. maflcko commented at 2:35 pm on July 23, 2024: member
    Pushed and replied to all feedback, except for #30482 (review), which I’ll do later
  39. maflcko force-pushed on Jul 23, 2024
  40. DrahtBot added the label CI failed on Jul 23, 2024
  41. DrahtBot commented at 2:38 pm on July 23, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/27809611813

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  42. in src/rest.cpp:221 in ba06593793 outdated
    216@@ -217,9 +217,10 @@ static bool rest_headers(const std::any& context,
    217         return RESTERR(req, HTTP_BAD_REQUEST, strprintf("Header count is invalid or out of acceptable range (1-%u): %s", MAX_REST_HEADERS_RESULTS, raw_count));
    218     }
    219 
    220-    uint256 hash;
    221-    if (!ParseHashStr(hashStr, hash))
    222+    auto hash{uint256::FromHex(hashStr)};
    223+    if (!hash) {
    


    paplorinc commented at 2:51 pm on July 23, 2024:
    is this exactly the same as before?

    maflcko commented at 6:47 am on July 24, 2024:
    Why not?

    paplorinc commented at 9:38 am on July 24, 2024:
    I was asking because ParseHashStr returned a bool before and FromHex returns an optional, was wondering if it had some weirdness that I didn’t know about (like how JavaScript’s truthy has), e.g. take the internal value into consideration in some cases or whatever. I checked it now (was reviewing the other one from my phone), it seems to behave in the same way indeed.
  43. in src/test/blockfilter_tests.cpp:167 in ba06593793 outdated
    163@@ -165,11 +164,9 @@ BOOST_AUTO_TEST_CASE(blockfilters_json_test)
    164             tx_undo.vprevout.emplace_back(txout, 0, false);
    165         }
    166 
    167-        uint256 prev_filter_header_basic;
    168-        BOOST_CHECK(ParseHashStr(test[pos++].get_str(), prev_filter_header_basic));
    169+        uint256 prev_filter_header_basic{*Assert(uint256::FromHex(test[pos++].get_str()))};
    


    paplorinc commented at 2:53 pm on July 23, 2024:
    hmmm, what’s the point of the Assert inside?

    maflcko commented at 6:48 am on July 24, 2024:
    To avoid UB on a nullopt deref. But happy to remove it, if you think there is a benefit in removing it.

    paplorinc commented at 9:50 am on July 24, 2024:

    But happy to remove it, if you think

    ok, but seems a bit passive-aggressive to resolve all such comments in that case.


    Wouldn’t this achieve the same in a more idiomatic way?

    0        auto prev_filter_header_basic{uint256::FromHex(test[pos++].get_str()).value()};
    

    This will fail for

    0static std::optional<uint256> FromHex(std::string_view str) { return std::nullopt; }
    

    with

    std::bad_optional_access: bad_optional_access

    which is more specific than

    signal: SIGABRT (application abort requested)


    maflcko commented at 10:41 am on July 24, 2024:

    ok, but seems a bit passive-aggressive to resolve all such comments in that case.

    I am sorry you feel that way, but I don’t see value in review comments or questions that can be answered by the person asking the question. (Assert is pretty well documented, see git grep Assert doc/developer-notes.md)

    Regardless, I did answer the question and then resolved the conversation, which seems reasonable for a question.

    which is more specific than

    I think both are fine. As long as the test fails in some way, an error is visible and finding it should be trivial in either case.

    So I consider it a style-nit.

  44. in src/uint256.h:62 in ba06593793 outdated
    58@@ -58,8 +59,8 @@ class base_blob
    59     friend constexpr bool operator<(const base_blob& a, const base_blob& b) { return a.Compare(b) < 0; }
    60 
    61     std::string GetHex() const;
    62-    void SetHex(const char* psz);
    63-    void SetHex(const std::string& str);
    64+    /** Unlike FromHex this accepts any invalid input, thus fragile and deprecated */
    


    paplorinc commented at 2:54 pm on July 23, 2024:
    not sure I get the SetHexFragile/FromHex naming duality here :/

    maflcko commented at 6:50 am on July 24, 2024:
    The duality will be removed in a follow-up, when the deprecated on is removed
  45. in src/uint256.h:104 in ba06593793 outdated
     99+template <class uintN_t>
    100+std::optional<uintN_t> FromHex(std::string_view str)
    101+{
    102+    if (uintN_t::size() * 2 != str.size() || !IsHex(str)) return std::nullopt;
    103+    uintN_t rv;
    104+    rv.SetHexFragile(str);
    


    paplorinc commented at 2:55 pm on July 23, 2024:
    after we get rid of the deprecation, these checks should ideally be done while iterating over the values to avoid multiple iterations (e.g. how Base58 parsing is done)

    maflcko commented at 6:50 am on July 24, 2024:
    Maybe. Seems best to leave for the follow-up that does it.

    paplorinc commented at 8:03 am on July 24, 2024:
    Yes, definitely different PR
  46. DrahtBot removed the label CI failed on Jul 23, 2024
  47. DrahtBot added the label Needs rebase on Jul 23, 2024
  48. maflcko marked this as ready for review on Jul 24, 2024
  49. test: refactor: Replace SetHex with uint256 constructor directly
    This avoids a hex-decoding and makes the next commit smaller.
    fafe4b8051
  50. scripted-diff: Rename SetHex to SetHexDeprecated
    SetHex is fragile, because it accepts any non-hex input or any length of
    input, without error feedback. This can lead to issues when the input is
    truncated or otherwise corrupted.
    
    Document the problem by renaming the method.
    
    In the future, the fragile method should be removed from the public
    interface.
    
    -BEGIN VERIFY SCRIPT-
     sed -i 's/SetHex/SetHexDeprecated/g' $( git grep -l SetHex ./src )
    -END VERIFY SCRIPT-
    fa103db2bb
  51. maflcko force-pushed on Jul 24, 2024
  52. maflcko force-pushed on Jul 24, 2024
  53. in src/bitcoin-tx.cpp:288 in aaaa66d348 outdated
    284@@ -285,7 +285,7 @@ static void MutateTxAddInput(CMutableTransaction& tx, const std::string& strInpu
    285     }
    286 
    287     // append to transaction input list
    288-    CTxIn txin(Txid::FromUint256(txid), vout, CScript(), nSequenceIn);
    289+    CTxIn txin(*txid, vout, CScript(), nSequenceIn);
    


    hodlinator commented at 8:04 am on July 24, 2024:

    nit: Missed opportunity to convert to curly braces? :)

    Same below for COutPoint.


    maflcko commented at 8:27 am on July 24, 2024:
    Heh, may do if I retouch for other reasons.

    maflcko commented at 8:45 am on July 24, 2024:

    No, see:

     0bitcoin-tx.cpp:288:23: error: non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
     1  288 |     CTxIn txin{*txid, vout, CScript(), nSequenceIn};
     2      |                       ^~~~
     3bitcoin-tx.cpp:288:23: note: insert an explicit cast to silence this issue
     4  288 |     CTxIn txin{*txid, vout, CScript(), nSequenceIn};
     5      |                       ^~~~
     6      |                       static_cast<uint32_t>( )
     7bitcoin-tx.cpp:637:34: error: non-constant-expression cannot be narrowed from type 'int' to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
     8  637 |             COutPoint out{*txid, nOut};
     9      |                                  ^~~~
    10bitcoin-tx.cpp:637:34: note: insert an explicit cast to silence this issue
    11  637 |             COutPoint out{*txid, nOut};
    12      |                                  ^~~~
    13      |                                  static_cast<uint32_t>( )
    142 errors generated.
    15make[2]: *** [Makefile:16088: bitcoin_tx-bitcoin-tx.o] Error 1
    16make[2]: *** Waiting for unfinished jobs....
    

    hodlinator commented at 7:19 pm on July 24, 2024:

    Attempted cleaning that up, but didn’t uncover anything useful to the end users, unlike #30444, so probably not worth following up on:

     0diff --git a/src/bitcoin-tx.cpp b/src/bitcoin-tx.cpp
     1index 89c03c1647..dab8b81128 100644
     2--- a/src/bitcoin-tx.cpp
     3+++ b/src/bitcoin-tx.cpp
     4@@ -274,8 +274,8 @@ static void MutateTxAddInput(CMutableTransaction& tx, const std::string& strInpu
     5 
     6     // extract and validate vout
     7     const std::string& strVout = vStrInputParts[1];
     8-    int64_t vout;
     9-    if (!ParseInt64(strVout, &vout) || vout < 0 || vout > static_cast<int64_t>(maxVout))
    10+    const auto vout{ToIntegral<uint32_t>(strVout)};
    11+    if (!vout || *vout > maxVout)
    12         throw std::runtime_error("invalid TX input vout '" + strVout + "'");
    13 
    14     // extract the optional sequence number
    15@@ -285,7 +285,7 @@ static void MutateTxAddInput(CMutableTransaction& tx, const std::string& strInpu
    16     }
    17 
    18     // append to transaction input list
    19-    CTxIn txin(*txid, vout, CScript(), nSequenceIn);
    20+    CTxIn txin{*txid, *vout, CScript(), nSequenceIn};
    21     tx.vin.push_back(txin);
    22 }
    23 
    24@@ -634,7 +634,7 @@ static void MutateTxSign(CMutableTransaction& tx, const std::string& flagStr)
    25             if (nOut < 0)
    26                 throw std::runtime_error("vout cannot be negative");
    27 
    28-            COutPoint out(*txid, nOut);
    29+            COutPoint out{*txid, static_cast<uint32_t>(nOut)};
    30             std::vector<unsigned char> pkData(ParseHexUV(prevOut["scriptPubKey"], "scriptPubKey"));
    31             CScript scriptPubKey(pkData.begin(), pkData.end());
    

    maflcko commented at 5:40 am on July 25, 2024:
    0+            COutPoint out{*txid, static_cast<uint32_t>(nOut)};
    

    Thanks for looking again. For reference, getInt has a built in check, so you could use getInt<uint32_t> and avoid this cast. That’d be a behavior change for out-of-range values, but the values are out of range anyway, so should be fine.

  54. in src/test/uint256_tests.cpp:176 in aaaa66d348 outdated
    175+    BOOST_CHECK(TmpL == R2L);
    176+    TmpL = *uint256::FromHex(ZeroL.ToString());
    177+    BOOST_CHECK(TmpL == uint256());
    178 
    179-    TmpL.SetHex(R1L.ToString());
    180+    TmpL = *uint256::FromHex(R1L.ToString());
    


    hodlinator commented at 8:12 am on July 24, 2024:

    Not sure whether we should only do FromHex testing or do it in addition to SetHexDeprecated testing.

    Sure, we want to remove SetHexDeprecated but it is unclear when or how easily it could be done, do you have an exploratory branch doing the removal?


    maflcko commented at 8:28 am on July 24, 2024:
    It is still tested in this test case (see above), and it other test cases (indirectly). Am I missing something, or is there other missing test coverage?

    hodlinator commented at 9:25 am on July 24, 2024:
    There’s only 1 remaining explicit call to SetHexDeprecated in this test case, before there were 7 calls to SetHex. But I guess almost the same tests are done indirectly in the basics test case above where SetHexDeprecated is being called via uint<N>S.
  55. in src/uint256.h:109 in aaaa66d348 outdated
    104+    if (uintN_t::size() * 2 != str.size() || !IsHex(str)) return std::nullopt;
    105+    uintN_t rv;
    106+    rv.SetHexDeprecated(str);
    107+    return rv;
    108+}
    109+} // namespace detail
    


    hodlinator commented at 8:22 am on July 24, 2024:
    Curious: Why introduce this namespace instead of making a protected method in base_blob as I did in my exploration (404c115ea0f53785a640818c2058ef0591c8c6ac)?

    maflcko commented at 8:33 am on July 24, 2024:

    I don’t think it matters much either way, because it is an internal implementation detail. It doesn’t need access to private or protected members, so standalone seems better. Otherwise, there could be confusion when calling base_blob<160>::FromHex<uint256>(hex), as to what values are used. Also, if the function is changed to access the private or protected members, it may be moved to the cpp file, or otherwise “optimized”, so spending too much time here now is probably not worth it.

    Happy to change, if there is a reason.


    hodlinator commented at 8:44 am on July 24, 2024:
    Hm.. yeah, to me the namespace itself is a cognitive speed-bump but I see your point about keeping to the public members.

    maflcko commented at 8:50 am on July 24, 2024:
    Ok, resolving for now. Happy to change if there is a convincing reason.
  56. hodlinator commented at 8:25 am on July 24, 2024: contributor
    Concept ACK aaaa66d348f874b8acd3a9f6208dd90dc322b16b
  57. maflcko force-pushed on Jul 24, 2024
  58. maflcko commented at 8:48 am on July 24, 2024: member

    Addressed all feedback.

    Can be tested by running the new regression test in test/functional/interface_rest.py from this pull on current master to see it fail.

  59. in src/uint256.h:129 in fa045bc327 outdated
    125@@ -105,6 +126,7 @@ class uint160 : public base_blob<160> {
    126  */
    127 class uint256 : public base_blob<256> {
    128 public:
    129+    static std::optional<uint256> FromHex(std::string_view str) { return detail::FromHex<uint256>(str); }
    


    hodlinator commented at 8:54 am on July 24, 2024:
    Why not have all FromHex methods be [[nodiscard]]? Seems bad to throw away work.

    maflcko commented at 9:00 am on July 24, 2024:

    The attribute should only be added to functions where ignoring the return value will lead to an error later on in the code. See #27774 (comment)

    Otherwise, if it was added to all functions that return something, it would instead be better to add it to no function and enforce the behavior via a language plugin, for example via clang-tidy.


    hodlinator commented at 9:13 am on July 24, 2024:
    In principle all pure functions like FromHex (that at most mutate input-parameters and don’t have asserts/throw) should be [[nodiscard]]. One could at least apply that rule to new code?

    paplorinc commented at 9:21 am on July 24, 2024:
    I guess it’s easy for the compiler to eliminate the call if it’s indeed pure and the result isn’t used - but even better if there’s a warning as well

    maflcko commented at 9:41 am on July 24, 2024:

    In principle all pure functions like FromHex (that at most mutate input-parameters and don’t have asserts/throw) should be [[nodiscard]]. One could at least apply that rule to new code?

    If there is a convincing reason to do this, the rule should be added to the dev notes. However, absent a convincing reason and absent the style rule being in the dev notes, I will leave this as-is for now.

  60. DrahtBot removed the label Needs rebase on Jul 24, 2024
  61. in src/test/uint256_tests.cpp:174 in fa045bc327 outdated
    172-    TmpL.SetHex(R2L.ToString());   BOOST_CHECK_EQUAL(TmpL, R2L);
    173-    TmpL.SetHex(ZeroL.ToString()); BOOST_CHECK_EQUAL(TmpL, uint256());
    174+    TmpL = *uint256::FromHex(R2L.ToString());
    175+    BOOST_CHECK(TmpL == R2L);
    176+    TmpL = *uint256::FromHex(ZeroL.ToString());
    177+    BOOST_CHECK(TmpL == uint256());
    


    hodlinator commented at 9:18 am on July 24, 2024:
    Should use BOOST_CHECK_EQUAL_COLLECTIONS().

    maflcko commented at 9:46 am on July 24, 2024:
    Why? This blob of code is untouched in this pull request, so if there is a reason to change it, it should be done in a separate pull request.

    maflcko commented at 9:52 am on July 24, 2024:
    Ok, previously it was BOOST_CHECK_EQUAL(TmpL, uint256());. I may change it to that, if I have to re-touch for other reasons. Otherwise, it can be done in one of the follow-ups.

    hodlinator commented at 10:08 am on July 24, 2024:
    Yeah, sorry, BOOST_CHECK_EQUAL(). Seems to be a mis-merge on your part when rebasing. Considering I took the time to add a whole extra commit to convert away from BOOST_CHECK in this file recently, I would be quicker to ACK if you fixed the merge.

    paplorinc commented at 10:12 am on July 24, 2024:
    +1, gives way better errors on failure

    maflcko commented at 10:24 am on July 24, 2024:
    done
  62. in src/test/uint256_tests.cpp:160 in faefc599a0 outdated
    156@@ -157,7 +157,7 @@ BOOST_AUTO_TEST_CASE( comparison ) // <= >= < >
    157                    uint256S("1000000000000000000000000000000000000000000000000000000000000002"));
    158 }
    159 
    160-BOOST_AUTO_TEST_CASE( methods ) // GetHex SetHexDeprecated begin() end() size() GetLow64 GetSerializeSize, Serialize, Unserialize
    161+BOOST_AUTO_TEST_CASE(methods) // GetHex SetHexDeprecated FromHex begin() end() size() GetLow64 GetSerializeSize, Serialize, Unserialize
    


    paplorinc commented at 9:19 am on July 24, 2024:

    nit: we’re not using SetHexDeprecated here, right?

    0BOOST_AUTO_TEST_CASE(methods) // GetHex FromHex begin() end() size() GetLow64 GetSerializeSize, Serialize, Unserialize
    

    maflcko commented at 10:31 am on July 24, 2024:
    It is
  63. in src/uint256.h:98 in faefc599a0 outdated
    90@@ -88,12 +91,30 @@ class base_blob
    91     }
    92 };
    93 
    94+namespace detail {
    95+/**
    96+ * Writes the hex string into a new uintN_t object and only returns a value
    97+ * iff all of the checks pass:
    98+ *   - All characters are hex
    99+ *   - Input length is uintN_t::size()*2
    


    paplorinc commented at 9:21 am on July 24, 2024:
    nit: the conditions in the code are actually reversed

    maflcko commented at 3:48 pm on July 24, 2024:
    Reordered, because I changed the comment anyway in another push.
  64. in src/rest.cpp:802 in faa4f91fe4 outdated
    791@@ -792,13 +792,14 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
    792             if (txid_out.size() != 2) {
    793                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    794             }
    795+            auto txid{Txid::FromHex(txid_out.at(0))};
    796             auto output{ToIntegral<uint32_t>(txid_out.at(1))};
    797 
    798-            if (!output || !IsHex(txid_out.at(0))) {
    799+            if (!txid || !output) {
    800                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    


    paplorinc commented at 9:25 am on July 24, 2024:
    as hitned before, this seems like a slight behavior change: previously if txid_out.at(1) was invalid, txid_out.at(0) wasn’t checked. But I guess it’s just extra work at most, no side-effects should follow from running IsHex, right?

    maflcko commented at 11:05 am on July 24, 2024:

    no side-effects should follow from running IsHex, right?

    Yes, IsHex is pure.

  65. hodlinator changes_requested
  66. hodlinator commented at 9:30 am on July 24, 2024: contributor
    Without seeing an exploratory branch where SetHexDeprecated is removed, I think it might be better to call it SetHexDiscouraged after all, in case it turns out to be really problematic to remove it for some unforeseen reason.
  67. in src/test/blockfilter_tests.cpp:168 in fa045bc327 outdated
    163@@ -165,11 +164,9 @@ BOOST_AUTO_TEST_CASE(blockfilters_json_test)
    164             tx_undo.vprevout.emplace_back(txout, 0, false);
    165         }
    166 
    167-        uint256 prev_filter_header_basic;
    168-        BOOST_CHECK(ParseHashStr(test[pos++].get_str(), prev_filter_header_basic));
    169+        uint256 prev_filter_header_basic{*Assert(uint256::FromHex(test[pos++].get_str()))};
    170         std::vector<unsigned char> filter_basic = ParseHex(test[pos++].get_str());
    


    paplorinc commented at 9:47 am on July 24, 2024:
    aren’t we mixing big and little endian here? Is this by design? As mentioned in the other PR, I think these unintuitive differences should definitely be documented…

    maflcko commented at 10:42 am on July 24, 2024:
    I am not changing this line of code, or the behavior of this test, so a separate pull request seems more appropriate.
  68. maflcko commented at 9:49 am on July 24, 2024: member

    Without seeing an exploratory branch where SetHexDeprecated is removed, I think it might be better to call it SetHexDiscouraged after all, in case it turns out to be really problematic to remove it for some unforeseen reason.

    Thank you for the suggestion. However, outside of src/qt, and internally, the function is used in a single place uint256S, so the name shouldn’t matter much. There is already a lengthy bike-shedding thread about the naming in #30482 (review), with you concluding that the current name is fine. So I don’t really have the motivation and time to go back and forth on this style nit, unless there is a convincing reason. Even if it was “problematic to remove it for some unforeseen reason”, a 9 line scripted-diff can be applied then.

    I appreciate the feedback, but when it comes to trivial style nits, I don’t consider it worth it to spend days on it myself.

  69. hodlinator commented at 10:09 am on July 24, 2024: contributor

    I appreciate the feedback, but when it comes to trivial style nits, I don’t consider it worth it to spend days on it myself.

    Fair.

  70. maflcko force-pushed on Jul 24, 2024
  71. maflcko force-pushed on Jul 24, 2024
  72. in src/test/uint256_tests.cpp:176 in fa479be4d8 outdated
    174+    BOOST_CHECK_EQUAL(TmpL, R2L);
    175+    TmpL = *uint256::FromHex(ZeroL.ToString());
    176+    BOOST_CHECK_EQUAL(TmpL, uint256());
    177 
    178-    TmpL.SetHexDeprecated(R1L.ToString());
    179+    TmpL = *uint256::FromHex(R1L.ToString());
    


    stickies-v commented at 10:30 am on July 24, 2024:

    Would be more robust to use .value() here. Suggestion that also skips unnecessary assignment to TmpL:

     0diff --git a/src/test/uint256_tests.cpp b/src/test/uint256_tests.cpp
     1index 52a8dbc200..430fc14892 100644
     2--- a/src/test/uint256_tests.cpp
     3+++ b/src/test/uint256_tests.cpp
     4@@ -168,12 +168,10 @@ BOOST_AUTO_TEST_CASE(methods) // GetHex SetHexDeprecated FromHex begin() end() s
     5     // Verify previous values don't persist when setting to truncated string.
     6     TmpL.SetHexDeprecated("21");
     7     BOOST_CHECK_EQUAL(TmpL.ToString(), "0000000000000000000000000000000000000000000000000000000000000021");
     8-    TmpL = *uint256::FromHex(R2L.ToString());
     9-    BOOST_CHECK_EQUAL(TmpL, R2L);
    10-    TmpL = *uint256::FromHex(ZeroL.ToString());
    11-    BOOST_CHECK_EQUAL(TmpL, uint256());
    12+    BOOST_CHECK_EQUAL(uint256::FromHex(R2L.ToString()).value(), R2L);
    13+    BOOST_CHECK_EQUAL(uint256::FromHex(ZeroL.ToString()).value(), uint256());
    14 
    15-    TmpL = *uint256::FromHex(R1L.ToString());
    16+    TmpL = uint256::FromHex(R1L.ToString()).value();
    17     BOOST_CHECK_EQUAL_COLLECTIONS(R1L.begin(), R1L.end(), R1Array, R1Array + R1L.size());
    18     BOOST_CHECK_EQUAL_COLLECTIONS(TmpL.begin(), TmpL.end(), R1Array, R1Array + TmpL.size());
    19     BOOST_CHECK_EQUAL_COLLECTIONS(R2L.begin(), R2L.end(), R2Array, R2Array + R2L.size());
    

    maflcko commented at 3:45 pm on July 24, 2024:
    Ok, done in both places
  73. in src/util/transaction_identifier.h:75 in eeeeb0474d outdated
    71@@ -66,6 +72,7 @@ using Txid = transaction_identifier<false>;
    72 /** Wtxid commits to all transaction fields including the witness. */
    73 using Wtxid = transaction_identifier<true>;
    74 
    75+/** DEPRECATED due to missing length-check and hex-check, please use the safer SetHex, or FromUint256 */
    


    stickies-v commented at 10:37 am on July 24, 2024:
    in eeeeb0474d439c8d7e5ea9a680119245a455b5a0: should now be “please use the safer FromHex” since recent force push

    maflcko commented at 3:44 pm on July 24, 2024:
    Good catch. Fixed the typo!
  74. in src/uint256.h:64 in fa479be4d8 outdated
    60@@ -59,6 +61,7 @@ class base_blob
    61 
    62     // Hex string representations are little-endian.
    63     std::string GetHex() const;
    64+    /** Unlike FromHex this accepts any invalid input, thus it is fragile and deprecated */
    


    stickies-v commented at 11:19 am on July 24, 2024:
    nit: should we add this docstring to uint256S too?

    maflcko commented at 3:44 pm on July 24, 2024:
    Yes, done.
  75. in src/rest.cpp:801 in fa93c830d8 outdated
    794@@ -792,13 +795,14 @@ static bool rest_getutxos(const std::any& context, HTTPRequest* req, const std::
    795             if (txid_out.size() != 2) {
    796                 return RESTERR(req, HTTP_BAD_REQUEST, "Parse error");
    797             }
    798+            auto txid{Txid::FromHex(txid_out.at(0))};
    799             auto output{ToIntegral<uint32_t>(txid_out.at(1))};
    800 
    801-            if (!output || !IsHex(txid_out.at(0))) {
    802+            if (!txid || !output) {
    


    stickies-v commented at 3:07 pm on July 24, 2024:

    Since this is a potentially breaking interface change for downstream applications, adding release notes might be appropriate?

    edit: perhaps interface change is not the right description, since hex txids really always have required 64 hex chars by definition. Not sure where we draw the line here. I’d be okay with not having release notes for this too, but doesn’t hurt to have it either. Up to you.

     0diff --git a/doc/release-notes-30482.md b/doc/release-notes-30482.md
     1new file mode 100644
     2index 0000000000..6eb8af3f69
     3--- /dev/null
     4+++ b/doc/release-notes-30482.md
     5@@ -0,0 +1,6 @@
     6+Updated REST APIs
     7+-----------------
     8+- Parameter validation for `/rest/getutxos` has been improved by
     9+  rejecting truncated or overly large txids by raising an
    10+  HTTP_BAD_REQUEST "Parse error". Previously, these malformed txids
    11+  would be silently ignored.
    

    maflcko commented at 3:43 pm on July 24, 2024:
    Sure, done for both pulls.
  76. stickies-v approved
  77. stickies-v commented at 3:08 pm on July 24, 2024: contributor

    ACK fa93c830d8674d01731868c1224190c3cd02d411

    Left a couple of suggestions but nothing blocking.

    Returning an early “Parse error” for malformed txids instead of silently ignoring them is a better user interface, and the underlying hex parse refactoring is a very welcome improvement.

  78. DrahtBot requested review from hodlinator on Jul 24, 2024
  79. refactor: Implement strict uint256::FromHex()
    This is a safe replacement of the previous SetHex, which now returns an
    optional to indicate success or failure.
    
    The code is similar to the ParseHashStr helper, which will be removed in
    a later commit.
    fad2991ba0
  80. refactor: Expose FromHex in transaction_identifier
    This is needed for the next commit.
    fab6ddbee6
  81. rest: Reject truncated hex txid early in getutxos parsing fa90777245
  82. refactor: Replace ParseHashStr with FromHex
    No need to have two functions with different names that achieve the
    exact same thing.
    fa7b57e5f5
  83. doc: Add release notes for two pull requests fac0c3d4bf
  84. maflcko force-pushed on Jul 24, 2024
  85. stickies-v commented at 4:10 pm on July 24, 2024: contributor
    re-ACK fac0c3d4bfc97b94f0594f7606650921feef2c8a - only doc and test updates to address review comments, thanks!
  86. hodlinator approved
  87. hodlinator commented at 5:23 pm on July 24, 2024: contributor

    ACK fac0c3d4bfc97b94f0594f7606650921feef2c8a

    make check passed and test_runner.py passed.

    Happy you went with FromHex()!

  88. fanquake merged this on Jul 25, 2024
  89. fanquake closed this on Jul 25, 2024

  90. maflcko deleted the branch on Jul 25, 2024
  91. in src/util/transaction_identifier.h:76 in fac0c3d4bf
    71@@ -66,6 +72,7 @@ using Txid = transaction_identifier<false>;
    72 /** Wtxid commits to all transaction fields including the witness. */
    73 using Wtxid = transaction_identifier<true>;
    74 
    75+/** DEPRECATED due to missing length-check and hex-check, please use the safer FromHex, or FromUint256 */
    76 inline Txid TxidFromString(std::string_view str)
    


    maflcko commented at 4:56 pm on July 25, 2024:

    If someone is looking for follow-up work, a test-only change could be to replace TxidFromString(a) with Txid::FromHex(a).value() in src/test.

    0git grep TxidFromString ./src/test/
    1git grep WtxidFromString
    

    stickies-v commented at 8:11 pm on July 25, 2024:
    I’ve started working on a branch that removes (W)TxidFromString entirely, it seems the qt callsites would work well with FromHex too.

    stickies-v commented at 4:18 pm on July 26, 2024:
    Done in #30532. Also started working on a similar branch to remove uint256S.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me