doc: Clarify comments about endianness after #30526 #31596

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/docblob changing 4 files +26 โˆ’13
  1. ryanofsky commented at 11:39 pm on January 2, 2025: contributor

    This is a documentation-only change following up on suggestions made in the #30526 review.

    Motivation for this change is that I was recently reviewing #31583, which reminded me how confusing the arithmetic blob code was and made me want to write better comments.

  2. DrahtBot commented at 11:39 pm on January 2, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31596.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK TheCharlatan, Sjors, BrandonOdiwuor, achow101
    Concept ACK l0rinc

    If your review is incorrectly listed, please react with ๐Ÿ‘Ž to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Docs on Jan 2, 2025
  4. Sjors commented at 8:57 am on January 3, 2025: member
    ACK 13464c0c44645e0ed88d9d72c3ad697dca800e10
  5. in src/uint256.h:88 in 13464c0c44 outdated
    86+     * last), the GetHex() output will match the way the number would normally
    87+     * be written in base-16 (with most significant digits first and least
    88+     * significant digits last).
    89+     *
    90+     * This means, for example, that ArithToUint256(num).GetHex() can be used to
    91+     * used to display an arith_uint256 num value as a number, because
    


    l0rinc commented at 9:38 am on January 3, 2025:
    0     * This means, for example, that ArithToUint256(num).GetHex() can be used to
    1     * display an arith_uint256 num value as a number, because
    

    ryanofsky commented at 2:07 pm on January 3, 2025:

    re: #31596 (review)

    Thanks, took suggestion

  6. in src/arith_uint256.h:29 in 13464c0c44 outdated
    25@@ -26,6 +26,7 @@ class base_uint
    26 protected:
    27     static_assert(BITS / 32 > 0 && BITS % 32 == 0, "Template parameter BITS must be a positive multiple of 32.");
    28     static constexpr int WIDTH = BITS / 32;
    29+    /** Number represented in least-significant-first base 2^32 digits. */
    


    l0rinc commented at 10:02 am on January 3, 2025:

    I had to read this several times (didn’t understand what a “least-significant-first base” is at first), consider the following:

    0    /** Integer with 32-bit digits, least-significant first. */
    

    ryanofsky commented at 1:51 pm on January 3, 2025:

    re: #31596 (review)

    Thanks, agree that is clearer, took the suggestion but kept “represented”

  7. in src/policy/packages.h:94 in 13464c0c44 outdated
    88@@ -89,8 +89,9 @@ bool IsChildWithParents(const Package& package);
    89  */
    90 bool IsChildWithParentsTree(const Package& package);
    91 
    92-/** Get the hash of these transactions' wtxids, concatenated in lexicographical order (treating the
    93- * wtxids as little endian encoded uint256, smallest to largest). */
    94+/** Get the hash of these transactions' wtxids, with wtxids treated as little
    95+ * endian numbers and sorted in increasing numeric order (i.e. with byte-reversed
    96+ * wtxid hashes sorted lexicographically.) */
    


    l0rinc commented at 10:16 am on January 3, 2025:

    According to Wiki little endian is preferred with a dash: https://en.wikipedia.org/wiki/Endianness

    But the part I find confusing is rather expressing 3 concepts in a single sentence - can we break it up a bit?

    0/** Get the hash of the wtxids of these transactions, treating each wtxid hash 
    1 * as a little-endian number and sorting them in ascending numeric order.
    2 * This is equivalent to reversing the bytes of each wtxid hash and sorting them lexicographically.
    3 */
    

    ryanofsky commented at 1:59 pm on January 3, 2025:

    re: #31596 (review)

    Thanks I basically just took your first sentence. I dropped the second sentence, because I think it’s not essential, and the phrasing is potentially misleading because the the wtxid bytes are only reversed when they are sorted, not when they hashed after being sorted.

    Should be ok not to mention here that you can sort little endian numbers by reversing the bytes.

  8. in src/uint256.h:90 in 13464c0c44 outdated
    88+     * significant digits last).
    89+     *
    90+     * This means, for example, that ArithToUint256(num).GetHex() can be used to
    91+     * used to display an arith_uint256 num value as a number, because
    92+     * ArithToUint256() converts the number to a blob in little-endian format,
    93+     * so the arith_uint256 class doesn't need to have its own number parsing
    


    l0rinc commented at 10:21 am on January 3, 2025:
    Could we be more compact here? Personally I’m not a fan of long documentation - though the reason usually is that if we need to explain something heavily it’s a sign of poorly written code - not something we can really change here.

    Sjors commented at 10:53 am on January 3, 2025:
    I think it’s a useful comment.

    remcoros commented at 5:05 pm on January 3, 2025:
    ‘can be used to used to display’ -> ‘can be used to display’

    l0rinc commented at 5:15 pm on January 3, 2025:
    This was already fixed

    remcoros commented at 5:19 pm on January 3, 2025:
    ๐Ÿ‘ my bad, github didn’t show it as outdated and was looking at the convo tab.
  9. in src/rpc/mining.cpp:636 in 13464c0c44 outdated
    632@@ -633,8 +633,8 @@ static RPCHelpMan getblocktemplate()
    633                     {RPCResult::Type::OBJ, "", "",
    634                     {
    635                         {RPCResult::Type::STR_HEX, "data", "transaction data encoded in hexadecimal (byte-for-byte)"},
    636-                        {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"},
    637-                        {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"},
    638+                        {RPCResult::Type::STR_HEX, "txid", "txid hash (excluding witness data) with hash bytes reversed, encoded as hex"},
    


    l0rinc commented at 10:30 am on January 3, 2025:

    documenting txid as “it’s the txid hash” is a bit redundant and confusing. In https://github.com/bitcoin/bitcoin/blob/master/src/rpc/mempool.cpp#L135 (and a few other places) the description is "The transaction hash in hex". Could we unify the rest as well to something simple like

    0                        {RPCResult::Type::STR_HEX, "txid", "transaction id hash (without witness) shown in byte-reversed hex"},
    

    ryanofsky commented at 2:07 pm on January 3, 2025:

    re: #31596 (review)

    Thanks, I like “shown in byte-reversed hex” as a shorter alternative, and tried to incorporate your other ideas while not sounding ambiguous. If reviewers are happy with the new phrasing I’ll add new commit changing other RPCs to use it. The other RPCs do seem ok as-is because they don’t mention “little-endian hexadecimal”, but it’s good to be consistent.


    ryanofsky commented at 2:08 am on January 7, 2025:

    re: #31596 (review)

    Could we unify the rest as well

    To follow up, I looked at occurrences of "txid", "wtxid", and "hash" (quoted strings) in the RPC code and found dozens of fields that could be updated to be consistent. It seems more challenging than I expected to update all the documentation, and now I’m also wondering if might actually be harmful to mention reversed bytes in some cases but not others, because this might give the wrong impression that bytes are not always reversed for every hash field.

    I wonder if maybe a better approach to provide more consistency could be to introduce new RPC types like STR_TXID, and STR_WTXID and use them instead of the more generic STR_HEX type, so specific documentation about the meanings of these hashes could be incorporated into the type documentation and not repeated in field descriptions.

  10. l0rinc commented at 10:35 am on January 3, 2025: contributor
    Thanks for improving these! Since we canโ€™t usually change the behavior, itโ€™s crucial to share tribal knowledge. I have a few suggestions for reducing duplication, simplifying sentence structure, and applying these changes to similar places. However, I understand if you prefer to limit the scope
  11. doc: Clarify comments about endianness after #30526
    This is a documentation-only change following up on suggestions made in the
    #30526 review.
    
    Motivation for this change is that I was recently reviewing #31583, which
    reminded me how confusing the arithmetic blob code was and made me want to
    write better comments.
    3e0a992a3f
  12. ryanofsky force-pushed on Jan 3, 2025
  13. ryanofsky commented at 2:26 pm on January 3, 2025: contributor

    Updated 13464c0c44645e0ed88d9d72c3ad697dca800e10 -> 3e0a992a3f0f2b15b7be5049dc4f3134b4b0bc40 (pr/docblob.1 -> pr/docblob.2, compare) with suggestions

    Thanks for the reviews!

  14. DrahtBot added the label CI failed on Jan 3, 2025
  15. in src/policy/packages.h:93 in 3e0a992a3f
    88@@ -89,8 +89,9 @@ bool IsChildWithParents(const Package& package);
    89  */
    90 bool IsChildWithParentsTree(const Package& package);
    91 
    92-/** Get the hash of these transactions' wtxids, concatenated in lexicographical order (treating the
    93- * wtxids as little endian encoded uint256, smallest to largest). */
    94+/** Get the hash of the concatenated wtxids of transactions, with wtxids
    95+ * treated as a little-endian numbers and sorted in ascending numeric order.
    


    l0rinc commented at 4:57 pm on January 3, 2025:
    0 * treated as little-endian numbers and sorted in ascending numeric order.
    

    ryanofsky commented at 2:03 am on January 7, 2025:

    re: #31596 (review)

    Thanks, fixed in a followup branch.

  16. in src/rpc/mining.cpp:637 in 3e0a992a3f
    632@@ -633,8 +633,8 @@ static RPCHelpMan getblocktemplate()
    633                     {RPCResult::Type::OBJ, "", "",
    634                     {
    635                         {RPCResult::Type::STR_HEX, "data", "transaction data encoded in hexadecimal (byte-for-byte)"},
    636-                        {RPCResult::Type::STR_HEX, "txid", "transaction id encoded in little-endian hexadecimal"},
    637-                        {RPCResult::Type::STR_HEX, "hash", "hash encoded in little-endian hexadecimal (including witness data)"},
    638+                        {RPCResult::Type::STR_HEX, "txid", "transaction hash excluding witness data, shown in byte-reversed hex"},
    639+                        {RPCResult::Type::STR_HEX, "hash", "transaction hash including witness data, shown in byte-reversed hex"},
    


    TheCharlatan commented at 9:22 am on January 6, 2025:
    Nit: Call it hexadecimal instead of hex like done in the "data" field for better consistency?

    ryanofsky commented at 2:07 am on January 7, 2025:

    re: #31596 (review)

    Nit: Call it hexadecimal instead of hex like done in the "data" field for better consistency?

    I would probably prefer to go the other way since the term “hex” occurs more frequently than “hexadecimal” in RPC code and documentation (115 vs 11 times). I’m hoping we might be able to improve consistency by moving this information from descriptions of RPC fields to descriptions of RPC types and left another comment about this.

  17. TheCharlatan approved
  18. TheCharlatan commented at 9:34 am on January 6, 2025: contributor
    ACK 3e0a992a3f0f2b15b7be5049dc4f3134b4b0bc40
  19. DrahtBot requested review from Sjors on Jan 6, 2025
  20. Sjors approved
  21. Sjors commented at 9:40 am on January 6, 2025: member
    ACK 3e0a992a3f0f2b15b7be5049dc4f3134b4b0bc40
  22. DrahtBot removed the label CI failed on Jan 6, 2025
  23. BrandonOdiwuor approved
  24. BrandonOdiwuor commented at 4:23 pm on January 6, 2025: contributor
    LGTM ACK 3e0a992a3f0f2b15b7be5049dc4f3134b4b0bc40
  25. achow101 commented at 9:49 pm on January 6, 2025: member
    ACK 3e0a992a3f0f2b15b7be5049dc4f3134b4b0bc40
  26. l0rinc commented at 9:51 pm on January 6, 2025: contributor
    There’s still a typo there, could we fix it before merging? Otherwise I’d ACK as well.
  27. achow101 merged this on Jan 6, 2025
  28. achow101 closed this on Jan 6, 2025

  29. ryanofsky referenced this in commit 775681a490 on Jan 7, 2025
  30. ryanofsky commented at 2:19 am on January 7, 2025: contributor
    Thanks for the reviews. Since there are more followups that could be made here, I created a branch to keep track of them. I don’t want to create too much churn with documentation PRs so I won’t open another one right away, but I can implement any suggestions made here, open a pr if requested, or wait for a related PR to be opened to attach these changes to.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me