Remove double serialization; use software encoder for fee estimation #21966

pull sipa wants to merge 6 commits into bitcoin:master from sipa:202105_softfloat changing 11 files +258 −160
  1. sipa commented at 1:38 am on May 17, 2021: member

    Based on #21981.

    This adds a software-based platform-independent float/double encoder/decoder (platform independent in the sense that it only uses arithmetic and library calls, but never inspects the binary representation). This should strengthen our guarantee that encoded float/double values are portable across platforms. It then removes the functionality to serialize doubles from serialize.h, and replaces its only (non-test) use for fee estimation data serialization with the software encoder.

    At least on x86/ARM, the only difference should be how certain NaN values are encoded/decoded (but not whether they are NaN or not).

    It comes with tests that verify on is_iec559 platforms (which are the only ones we support, at least for now) that the serialized bytes exactly match the binary representation of floats in memory (for non-NaN).

  2. sipa force-pushed on May 17, 2021
  3. sipa force-pushed on May 17, 2021
  4. DrahtBot added the label Build system on May 17, 2021
  5. DrahtBot added the label Utils/log/libs on May 17, 2021
  6. sipa removed the label Build system on May 17, 2021
  7. sipa force-pushed on May 17, 2021
  8. DrahtBot commented at 3:43 am on May 17, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21981 (Remove unused float serialization by MarcoFalke)
    • #21969 (refactor: Switch serialize to uint8_t (Bundle 1/2) by MarcoFalke)
    • #10443 (Add fee_est tool for debugging fee estimation code by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  9. practicalswift commented at 9:40 am on May 17, 2021: contributor
    Strong Concept ACK
  10. laanwj commented at 10:23 am on May 17, 2021: member

    While I think this is intriguing technically, I’m ~0 on this conceptually

    I think using floating point in places where 100% precision or portability across platforms is important is mistaken in the first place. This gives the wrong impression.

    It raises deeper questions for me: do we really need floating point in Bitcoin at all? Narrower: do we really need to serialize/deserialize it? (clearly it’s already not used for anything consensus critical)

  11. MarcoFalke commented at 10:28 am on May 17, 2021: member
    Haven’t checked, but I presume it is only used for fees.dat?
  12. laanwj commented at 10:42 am on May 17, 2021: member
    If so, I’d prefer to find an alternative way of serializing those values specifically (as fixed-point or numerator / denominator pairs of integers) and dropping the general float/double (de)serialization.
  13. practicalswift commented at 2:08 pm on May 17, 2021: contributor

    Narrower: do we really need to serialize/deserialize it?

    As MarcoFalke discovered ser_float_to_uint32 and ser_uint32_to_float are currently unused (#21981), and AFAICT TxConfirmStats::Write and TxConfirmStats::Read are the only remaining users of ser_double_to_uint64 and ser_uint64_to_double.

    Looks promising! :)

  14. sipa commented at 6:10 pm on May 17, 2021: member

    @laanwj That"s fair. I agree that avoiding serialization of floating point values directly is much more desirable.

    My thinking here is that it effectively gives us a well-specified serialization with testable properties without needing to break compatibility with existing files.

    How about rebasing this PR on top of #21981, and making it remove serialization support for double too in serialization.h, and instead make the feedata writing/reading code invoke EncodeDouble/DecodeDouble directly?

  15. laanwj commented at 7:39 am on May 18, 2021: member

    How about rebasing this PR on top of #21981, and making it remove serialization support for double too in serialization.h, and instead make the feedata writing/reading code invoke EncodeDouble/DecodeDouble directly?

    I was about to comment this! Let’s make this code private to the single case where it is used? It keeps the current format but also prevents future serialization of these types :smile:

  16. sipa force-pushed on May 18, 2021
  17. sipa renamed this:
    Software float encoding
    Remove double serialization; use software encoder for fee estimation
    on May 18, 2021
  18. sipa force-pushed on May 18, 2021
  19. sipa commented at 7:55 pm on May 18, 2021: member
    @laanwj Done.
  20. practicalswift commented at 10:06 pm on May 18, 2021: contributor

    cr ACK 892522deba52f35fe5b08e2d7aebeb80600f9cdc: patch looks correct :)

    Very happy to see ser_double_to_uint64/ser_uint64_to_double go and src/compat/assumptions.h shrink :)

  21. laanwj commented at 8:23 am on May 19, 2021: member

    Thanks! Happy to see this now.

    Code review ACK 892522deba52f35fe5b08e2d7aebeb80600f9cdc Code review re-ACK 66545da2008cd9e806e41b74522ded259cd64f86

  22. in src/test/serfloat_tests.cpp:31 in ae87a8aa4f outdated
    23+    } else {
    24+        // Everything else is.
    25+        BOOST_CHECK(!std::isnan(f2));
    26+        uint64_t i2 = EncodeDouble(f2);
    27+        BOOST_CHECK_EQUAL(f, f2);
    28+        BOOST_CHECK_EQUAL(i, i2);
    


    MarcoFalke commented at 6:15 am on May 20, 2021:

    would it make sense to also assert equality with the “legacy” serialization helpers in both this unit test and the fuzz test?

    Otherwise we might run into a corrupted fees.dat after an upgrade. A problem worse than the problem this pull is trying to solve (making fees.dat architecture independent).


    MarcoFalke commented at 6:20 am on May 20, 2021:
    Oh sorry, I see that is what the “should produce exactly the same as the in-memory representation for non-NaN” test is doing.

    MarcoFalke commented at 6:23 am on May 20, 2021:
    Though, it looks like the fuzz test isn’t fuzzing the in-memory representation, only the round-trip?

    sipa commented at 11:15 pm on May 24, 2021:
    Added back.

    MarcoFalke commented at 10:22 am on June 7, 2021:
    Ah thanks, though the fuzzed_data_provider.ConsumeFloatingPoint only consumes “nice” floats, not odd ones. So the fuzzer is only testing the happy path.

    MarcoFalke commented at 11:56 am on June 7, 2021:
    Fixed in #22180
  23. in src/test/serfloat_tests.cpp:41 in ae87a8aa4f outdated
    36+    BOOST_CHECK_EQUAL(TestDouble(0.0), 0);
    37+    BOOST_CHECK_EQUAL(TestDouble(-0.0), 0x8000000000000000);
    38+    BOOST_CHECK_EQUAL(TestDouble(std::numeric_limits<double>::infinity()), 0x7ff0000000000000);
    39+    BOOST_CHECK_EQUAL(TestDouble(-std::numeric_limits<double>::infinity()), 0xfff0000000000000);
    40+
    41+    if (std::numeric_limits<float>::is_iec559) {
    


    MarcoFalke commented at 6:24 am on May 20, 2021:
    Any reason to check for float, but then test double?

    sipa commented at 11:15 pm on May 24, 2021:
    The technical term for this would be an “oops”. Fixed.
  24. Remove unused float serialization e40224d0c7
  25. Add platform-independent float encoder/decoder 2be4cd94f4
  26. Add unit tests for serfloat module bda33f98e2
  27. Convert existing float encoding tests afd964d70b
  28. Convert uses of double-serialization to {En,De}codeDouble fff1cae43a
  29. Remove support for double serialization 66545da200
  30. sipa force-pushed on May 24, 2021
  31. in src/test/fuzz/float.cpp:24 in 66545da200
    37-        stream << f;
    38-        float f_deserialized;
    39-        stream >> f_deserialized;
    40-        assert(f == f_deserialized);
    41+        uint64_t encoded = EncodeDouble(d);
    42+        if constexpr (std::numeric_limits<double>::is_iec559) {
    


    practicalswift commented at 7:08 pm on May 25, 2021:

    Note to other reviewers: We assume this to hold true in assumptions.h:

    0static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double assumed");
    
  32. in src/test/serfloat_tests.cpp:50 in 66545da200
    45+    BOOST_CHECK_EQUAL(TestDouble(2.0), 0x4000000000000000ULL);
    46+    BOOST_CHECK_EQUAL(TestDouble(4.0), 0x4010000000000000ULL);
    47+    BOOST_CHECK_EQUAL(TestDouble(785.066650390625), 0x4088888880000000ULL);
    48+
    49+    // Roundtrip test on IEC559-compatible systems
    50+    if (std::numeric_limits<double>::is_iec559) {
    


    practicalswift commented at 7:08 pm on May 25, 2021:

    Note to other reviewers: We assume this to hold true in assumptions.h:

    0static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double assumed");
    

    sipa commented at 7:11 pm on May 25, 2021:
    I should remove that assumption now, it’s no longer needed.

    practicalswift commented at 7:30 pm on May 25, 2021:
    I think we need that assumption as long as we’re doing floating-point division by zero? I still think we do that in ConnectBlock, CreateTransaction and EstimateMedianVal :)

    sipa commented at 7:35 pm on May 25, 2021:
    Ah, good point.

    laanwj commented at 7:52 am on May 26, 2021:

    Maybe a weaker assumption could do there. In any case, it’s out of scope for this PR. Happy to leave it as it is for the foreseeable future unless anyone has good reason to work on porting bitcoind to non-IEC559 platforms.

    (in which case I heartily suggest: please get rid of all floating point code. it shouldn’t be necessary in financial-ish code)

  33. practicalswift commented at 7:09 pm on May 25, 2021: contributor
    cr re-ACK 66545da2008cd9e806e41b74522ded259cd64f86
  34. laanwj merged this on May 26, 2021
  35. laanwj closed this on May 26, 2021

  36. sidhujag referenced this in commit efa10235fa on May 27, 2021
  37. MarcoFalke commented at 11:36 am on June 7, 2021: member
    review ACK
  38. kittywhiskers referenced this in commit 15c67f4fef on Jun 16, 2021
  39. kittywhiskers referenced this in commit c8cd8bfcaa on Jun 16, 2021
  40. kittywhiskers referenced this in commit cf8d5a958b on Jun 16, 2021
  41. kittywhiskers referenced this in commit 43f1909376 on Jun 16, 2021
  42. kittywhiskers referenced this in commit f42dcbbfb1 on Jun 24, 2021
  43. kittywhiskers referenced this in commit d129516f22 on Jun 25, 2021
  44. kittywhiskers referenced this in commit e8d09709c9 on Jun 25, 2021
  45. kittywhiskers referenced this in commit eecf91cd44 on Jun 26, 2021
  46. kittywhiskers referenced this in commit 441a37f26b on Jun 26, 2021
  47. kittywhiskers referenced this in commit 7d319145ab on Jun 27, 2021
  48. kittywhiskers referenced this in commit f946c68f83 on Jun 27, 2021
  49. UdjinM6 referenced this in commit 7d664c7c53 on Jun 27, 2021
  50. gades referenced this in commit 4915046193 on May 1, 2022
  51. gwillen referenced this in commit 95ce7de90f on Jun 1, 2022
  52. DrahtBot locked this on Aug 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-27 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me