Replace uint256/uint160 by opaque blobs where possible #5478

pull laanwj wants to merge 15 commits into bitcoin:master from laanwj:2014_12_the_blob2 changing 103 files +1406 −1167
  1. laanwj commented at 3:44 pm on December 15, 2014: member

    This pull replaces almost all uses of uint256 and all uses of uint160 to use opaque byte blobs blob256 and blob160 with only the following operations:

    • Default initialization to 0, or from a vector of bytes
    • Assignment from other blobXXXs
    • IsNull() compare to special all-zeros value
    • SetNull() clear to special all-zeros value: Bitcoin needs IsNull() / SetNull() because we often use the all-zeroes value as a marker for errors and empty values.
    • < for sorting in maps.
    • != == test for (in)equality
    • GetHex/SetHex/ToString because they’re just so useful
    • begin()/end() raw access to the data
    • size() returns a fixed size
    • GetSerializeSize() Serialize Unserialize serialization just reads and writes the bytes
    • GetCheapHash() this is similar to GetLow64() and returns part of the value as uint64_t, for cheap hashing when the contents are assumed distributed uniformy random.

    uint256 (used for proof-of-work mainly), on the other hand, loses all functionality like raw bytes access and serialization. Its memory should not be accessed directly. This is necessary for #888 (big endian support). uint160 is completely removed as Bitcoin Core does no 160-bit integer arithmetic.

    Overall steps (see commits)

    Even though the diff is huge, I’ve tried to follow a logical and easy to review process:

    1. Introduce base_uint::SetNull and base_uint::IsNull() as well as other methods that will later be on base_blob, to prepare for migration

    2. Replace x=0 with .SetNull(), x==0 with IsNull(), x!=0 with !IsNull(). Replace uses of uint256(0) with uint256().

    3. Introduce blob256 and blob160 as well as conversion functions (only needed for blob256, we don’t ever compute with uint160).

    4. Replace GetLow64() with GetCheapHash()

    5. Rename uint256 and uint160 to blob256 and blob160 except where big integers are really necessary. For reviewing convenience I separated this out into

      A) pure renames uintXXX to blobXXX, can easily be verified (in reverse) with find -name *.h -print0 | xargs -0 sed -i ’s/blob256/uint256/g' find -name *.cpp -print0 | xargs -0 sed -i ’s/blob256/uint256/g' find -name *.h -print0 | xargs -0 sed -i ’s/blob160/uint160/g' find -name *.cpp -print0 | xargs -0 sed -i ’s/blob160/uint160/g'

      B) string conversions uint256(“string”) to blob256S(“string”)

      C) Added #includes and predeclared classes

      D) Necessary conversions between uint256 and blob256 Focus reviewing here

    6. Remove now-unused methods from base_uint and blob160/blob256, eg GetHash, also remove unused uint160.

  2. Temporarily add SetNull/IsNull/GetCheapHash to base_uint
    Eases step-by-step migration to blob.
    722cdf988a
  3. Replace direct use of 0 with SetNull and IsNull
    Replace x=0 with .SetNull(),
    x==0 with IsNull(), x!=0 with !IsNull().
    Replace uses of uint256(0) with uint256().
    fa06531ec1
  4. Replace GetLow64 with GetCheapHash 9e6b7622a4
  5. Add blob256.cpp/h to build 354700c19f
  6. Add UintToBlob256 and BlobToUint256
    Convert between blobs and uints, mostly for proof of work checks.
    ee72dde317
  7. Replace uint256(1) with static constant
    SignatureHash and its test function SignatureHashOld
    return uint256(1) as a special error signaling value.
    Return a local static constant with the same value instead.
    9b8432822d
  8. protect base_uint begin() and end()
    Clients outside the class have no business poking at the internals.
    1369788889
  9. blob256: make initialization from string explicit
    Avoid dangerous cases where 0 is interpreted as std::string(0).
    Keyword `explicit` does not help here.
    4008b7a683
  10. A: pure renames uintXXX to blobXXX d9569ed027
  11. B: string conversions cc28e23c33
  12. C: Includes and predeclared classes 9c57d5d5f9
  13. D: Necessary conversions between uint256 and blob256 65f2ff8404
  14. laanwj force-pushed on Dec 15, 2014
  15. paveljanik commented at 8:46 pm on December 15, 2014: contributor

    This appears to be unused now:

    0src/test/uint256_tests.cpp:const double R1Sdouble = 0.7096329412477836074; 
    
  16. laanwj commented at 5:16 am on December 16, 2014: member
    @paveljanik Thanks, I’ll remove it. All *S variables in uint256_tests.cpp are for testing uint160, which went away.
  17. Remove uint160
    No uint160 arithmetic is used at all. Also remove the tests.
    2a0f5812bf
  18. Remove now-unused methods from uint256 and base_uint cfe94532e7
  19. laanwj commented at 7:45 am on December 16, 2014: member
    Re: people complaining about rebasing their pulls, if the large diff in ‘A: pure renames’ is problematic, we could cheat by changing uint256 and uint160 to be blob types and introduce a new type for actual 256 bit integer arithmetic. But as clear type names are important I don’t really like this.
  20. Add tests for blob256 8f1563a3a5
  21. laanwj force-pushed on Dec 16, 2014
  22. laanwj added the label Improvement on Dec 16, 2014
  23. in src/blob256.h: in 8f1563a3a5
    52+        memset(data, 0, sizeof(data));
    53+    }
    54+
    55+    friend inline bool operator==(const base_blob& a, const base_blob& b) { return memcmp(a.data, b.data, sizeof(a.data)) == 0; }
    56+    friend inline bool operator!=(const base_blob& a, const base_blob& b) { return memcmp(a.data, b.data, sizeof(a.data)) != 0; }
    57+    friend inline bool operator<(const base_blob& a, const base_blob& b) {
    


    sipa commented at 11:24 am on December 16, 2014:
    Would it break anything if we defined the ordering of blob256 as platform-dependent? That would allow using memcmp for this operator too.

    sipa commented at 11:28 am on December 16, 2014:

    In fact, I think that would allow implementing blob* as wrappers around byte arrays, and leave all integer conversion to uint*.

    EDIT: Sorry, they already are byte-arrays; I should read more before commenting.

    EDIT2: In fact, I think the implementation below is already identical to just “memcmp(a.data, b.data, sizeof(a.data)) < 0”.


    laanwj commented at 11:45 am on December 16, 2014:
    Yes, it’s identical to memcmp, good catch. Will use that.
  24. in src/blob256.h: in 8f1563a3a5
    26+            data[i] = 0;
    27+    }
    28+
    29+    base_blob(const base_blob& b)
    30+    {
    31+        for (int i = 0; i < WIDTH; i++)
    


    sipa commented at 11:35 am on December 16, 2014:

    memcpy?

    Or just the default copy constructor, really.


    laanwj commented at 11:58 am on December 16, 2014:
    Good point. The default POD copy constructor (as well as assignment operator?) will do the right thing here.

    sipa commented at 12:13 pm on December 16, 2014:
    Yup, the assignment operator too. Though perhaps those are needed to make the inherited blob256/uint256 work; unsure.
  25. in src/blob256.h: in 8f1563a3a5
    21+    uint8_t data[WIDTH];
    22+public:
    23+    base_blob()
    24+    {
    25+        for (int i = 0; i < WIDTH; i++)
    26+            data[i] = 0;
    


    sipa commented at 11:36 am on December 16, 2014:
    memset?

    laanwj commented at 11:45 am on December 16, 2014:
    ACK
  26. in src/blob256.h: in 8f1563a3a5
    115+     * used when the contents are considered uniformly random. It is not appropriate
    116+     * when the value can easily be influenced from outside as e.g. a network adversary could
    117+     * provide values to trigger worst-case behavior.
    118+     * @note The result of this function is not stable between little and big endian.
    119+     */
    120+    uint64_t GetCheapHash() const
    


    sipa commented at 11:37 am on December 16, 2014:
    This can probably be local to blob256.h?

    laanwj commented at 11:44 am on December 16, 2014:
    How that? There are four uses of this function in addrman.cpp and one in main.h.

    sipa commented at 11:48 am on December 16, 2014:
    Sorry, that was confused. I mean making it specific to the blob256 type, as opposed to base_blob.

    laanwj commented at 12:31 pm on December 16, 2014:
    SGTM
  27. sipa commented at 12:44 pm on December 16, 2014: member

    I would actually like the PR as a whole to just introduce arith_uint256 or something (for the version with arithmetic semantics) and leave uint256/uint160 in place (for the version without). That will result in a much smaller patchset, and require much less rebasings while this is being reviewed.

    Perhaps later there can be mass rename that is trivial to review and merge.

  28. laanwj commented at 1:25 pm on December 16, 2014: member
    Ok, reluctantly agreed… As I say above already I hate the idea of using uint160/uint256 for what are not actually integers and introduce a yes_this_is_really_an_int256 for real uint256 arithmetic, but yes the diff will be much smaller.
  29. laanwj commented at 2:05 pm on December 16, 2014: member

    Closing, will reopen after reorganization.

    Continued in #5490

  30. laanwj closed this on Dec 16, 2014

  31. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-04-04 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me