Add parameter feature to serialization and use it for CAddress #19503

pull sipa wants to merge 6 commits into bitcoin:master from sipa:202007_ser_options changing 11 files +381 −46
  1. sipa commented at 4:40 am on July 13, 2020: member

    This introduces a concept of “serialization parameters”. Every serializer can define its own parameter type, and a value of that type must be provided when serializing/deserializing, e.g. using ss << WithParams(parameter, value), causing value to be serialized to ss, with parameter parameter. Note these parameters are automatically passed down the stack, so for example a vector can be serialized with a parameter, which will be available to all the elements’ serializers.

    In this PR, the feature is only used for CAddress serialization, but I plan to also convert SERIALIZE_TRANSACTION_NO_WITNESS in a follow-up. Eventually I hope to get rid of all GetVersion/GetType and replace it with this approach, which is type-safe (code won’t compile if the correct parameter isn’t passed somewhere), and avoids collisions between flags.

  2. fanquake added the label P2P on Jul 13, 2020
  3. sipa requested review from ryanofsky on Jul 13, 2020
  4. DrahtBot commented at 5:51 am on July 13, 2020: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19065 (tests: Add fuzzing harness for CAddrMan. Fill some fuzzing coverage gaps. by practicalswift)
    • #14053 (Add address-based index (attempt 4?) by marcinja)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. practicalswift commented at 8:17 am on July 13, 2020: contributor

    Concept ACK

    Getting rid of GetVersion/GetType and replacing them with a compile-time checked approach would be very nice.

  6. DrahtBot added the label Needs rebase on Jul 13, 2020
  7. in src/protocol.h:362 in 2149a5f528 outdated
    353@@ -354,28 +354,43 @@ static inline bool MayHaveUsefulAddressDB(ServiceFlags services)
    354     return (services & NODE_NETWORK) || (services & NODE_NETWORK_LIMITED);
    355 }
    356 
    357+enum class AddrSerialization {
    358+    DISK,
    359+
    360+    // The only time we serialize a CAddress object without nTime is in
    361+    // the initial VERSION messages which contain two CAddress records.
    362+    // At that point, the serialization version is INIT_PROTO_VERSION.
    


    jnewbery commented at 10:35 am on July 13, 2020:
    This sentence can be removed now. It’s meaningless without a serialization version.

    sipa commented at 8:13 pm on July 13, 2020:
    Arguably there still is a serialization version, but the link between that and CAddress serialization is purely historical. Removed by incorporating @MarcoFalke’s changes below.
  8. jnewbery commented at 10:37 am on July 13, 2020: member
    Concept ACK. Can be rebased on master now that #19486 is merged to drop the first two commits.
  9. in src/protocol.h:387 in 908c6feb36 outdated
    389             // nTime.
    390-            READWRITE(obj.nTime);
    391+            if (s.GetVersion() >= INIT_PROTO_VERSION) {
    392+                READWRITE(obj.nTime);
    393+            }
    394         }
    


    MarcoFalke commented at 4:50 pm on July 13, 2020:

    in commit “distentangle …”:

    This is not compatible with the live network. When testing I get:

    02020-07-13T16:36:11Z [opencon] trying connection 188.42.33.44:18333 lastseen=1702.0hrs
    12020-07-13T16:36:11Z [opencon] Added connection peer=7
    22020-07-13T16:36:11Z [opencon] sending version (111 bytes) peer=7
    32020-07-13T16:36:11Z [opencon] send version message: version 70015, blocks=1746853, us=[::]:0, peer=7
    42020-07-13T16:36:11Z [msghand] received: reject (31 bytes) peer=7
    52020-07-13T16:36:11Z [msghand] Misbehaving: 188.42.33.44:18333 peer=7 (0 -> 1)
    62020-07-13T16:36:11Z [net] socket closed for peer=7
    72020-07-13T16:36:11Z [net] disconnecting peer=7
    82020-07-13T16:36:11Z [net] Cleared nodestate for peer=7
    

    sipa commented at 8:14 pm on July 13, 2020:
    There was a bug (>= INIT_PROTO_VERSION instead of != INIT_PROTO_VERSION), fixed.
  10. in src/protocol.h:386 in 2149a5f528 outdated
    399-            if (s.GetVersion() >= INIT_PROTO_VERSION) {
    400-                READWRITE(obj.nTime);
    401-            }
    402+        } else if (fmt == AddrSerialization::NETWORK_WITHTIME) {
    403+            READWRITE(obj.nTime);
    404         }
    


    MarcoFalke commented at 5:17 pm on July 13, 2020:

    in commit “user serialization …”:

    Still not compatible. Now I get:

    02020-07-13T17:01:39Z [opencon] trying connection 95.183.48.89:18333 lastseen=2468.3hrs
    12020-07-13T17:01:39Z [opencon] Added connection peer=2
    22020-07-13T17:01:39Z [opencon] sending version (103 bytes) peer=2
    32020-07-13T17:01:39Z [opencon] send version message: version 70015, blocks=1746969, us=[::]:0, peer=2
    42020-07-13T17:01:39Z [msghand] received: version (102 bytes) peer=2
    52020-07-13T17:01:39Z [msghand] ProcessMessages(version, 102 bytes): Exception 'CDataStream::read(): end of data: iostream error' (NSt8ios_base7failureB5cxx11E) caught
    62020-07-13T17:01:39Z [msghand] received: verack (0 bytes) peer=2
    72020-07-13T17:01:39Z [msghand] Misbehaving: 95.183.48.89:18333 peer=2 (0 -> 1)
    

    sipa commented at 8:14 pm on July 13, 2020:
    There was an unrelated bug to the previous one in this commit, with the same effect: addrFrom in the VERSION message was parsed with NETWORK_WITHTIME.
  11. in src/protocol.h:366 in 2149a5f528 outdated
    361+    // the initial VERSION messages which contain two CAddress records.
    362+    // At that point, the serialization version is INIT_PROTO_VERSION.
    363+    // After the version handshake, all ADDR messages are serialized with
    364+    // nTime.
    365+    NETWORK_NOTIME,
    366+    NETWORK_WITHTIME
    


    MarcoFalke commented at 5:22 pm on July 13, 2020:

    in the last commit:

    Could add a trailing semicolon? I’ve prepared a suggested diff for you that also switches to a switch-case so that the compiler can check all cases are covered, feel free to take or reject:

     0diff --git a/src/protocol.h b/src/protocol.h
     1index 500cf346e9..0f5786a1a9 100644
     2--- a/src/protocol.h
     3+++ b/src/protocol.h
     4@@ -356,14 +356,8 @@ static inline bool MayHaveUsefulAddressDB(ServiceFlags services)
     5 
     6 enum class AddrSerialization {
     7     DISK,
     8-
     9-    // The only time we serialize a CAddress object without nTime is in
    10-    // the initial VERSION messages which contain two CAddress records.
    11-    // At that point, the serialization version is INIT_PROTO_VERSION.
    12-    // After the version handshake, all ADDR messages are serialized with
    13-    // nTime.
    14     NETWORK_NOTIME,
    15-    NETWORK_WITHTIME
    16+    NETWORK_WITHTIME,
    17 };
    18 
    19 /** A CService with information about it as peer */
    20@@ -383,13 +377,23 @@ public:
    21     SERIALIZE_METHODS_PARAMS(CAddress, obj, AddrSerialization, fmt)
    22     {
    23         SER_READ(obj, obj.nTime = TIME_INIT);
    24-        if (fmt == AddrSerialization::DISK) {
    25+        switch (fmt) {
    26+        case AddrSerialization::DISK: {
    27             uint32_t disk_version = DISK_VERSION;
    28             READWRITE(disk_version);
    29             READWRITE(obj.nTime);
    30-        } else if (fmt == AddrSerialization::NETWORK_WITHTIME) {
    31+            break;
    32+        }
    33+        case AddrSerialization::NETWORK_WITHTIME: {
    34             READWRITE(obj.nTime);
    35+            break;
    36+        }
    37+        case AddrSerialization::NETWORK_NOTIME: {
    38+            // The only time we serialize a CAddress object without nTime is in
    39+            // the initial VERSION messages which contain two CAddress records.
    40+            break;
    41         }
    42+        } // no default case, so the compiler can warn about missing cases
    43         READWRITE(Using<CustomUintFormatter<8>>(obj.nServices), AsBase<CService>(obj));
    44     }
    45 
    

    sipa commented at 8:14 pm on July 13, 2020:
    Included.
  12. MarcoFalke commented at 5:22 pm on July 13, 2020: member

    Concept ACK, no idea why it is not compatible with the live network

    8d1b1df474db73d3010324f728cb8b7851036b4f

  13. in src/serialize.h:186 in 2149a5f528 outdated
    164@@ -165,11 +165,10 @@ enum
    165 };
    166 
    167 //! Convert the reference base type to X, without changing constness or reference type.
    168-template<typename X> X& ReadWriteAsHelper(X& x) { return x; }
    169-template<typename X> const X& ReadWriteAsHelper(const X& x) { return x; }
    170+template<typename X> X& AsBase(X& x) { return x; }
    


    vasild commented at 7:12 pm on July 13, 2020:
    Why not use static_cast instead of AsBase?

    sipa commented at 8:36 pm on July 13, 2020:
    It needs to maintain constness. During serialization, the object may be const T, while during deserialization it cannot be const. This function is effectively a static cast, but one that passes through constness.

    sipa commented at 5:30 pm on July 14, 2020:
    Added more comments to clarify this.

    vasild commented at 7:27 am on July 15, 2020:
    I checked that replacing s/AsBase/static_cast/ all over the place compiles and the few tests I ran pass.

    sipa commented at 4:28 pm on July 15, 2020:
    Yes, that’s expected - but it may lead to UB. static_cast may cast away the constness of a reference, in a place where it is illegal to do so. So AsBase is just a more restrictive, safer, version of static_cast.
  14. in src/serialize.h:1132 in 2149a5f528 outdated
    1128@@ -1114,4 +1129,71 @@ size_t GetSerializeSizeMany(int nVersion, const T&... t)
    1129     return sc.size();
    1130 }
    1131 
    1132+/** Wrapper that overrides thet GetParams() function of a stream. */
    


    vasild commented at 7:23 pm on July 13, 2020:
    nit: s/thet/the/

    sipa commented at 5:30 pm on July 14, 2020:
    Fixed.
  15. dongcarl commented at 7:56 pm on July 13, 2020: contributor

    Concept ACK as it will allow for a previously hacky solution to be neater. See #19477 for more details

    Perhaps a worthwhile followup: Completing the example documentations of our serialization “helpers” (seems like there’s already a few examples, but not all of them are documented)

  16. sipa force-pushed on Jul 13, 2020
  17. sipa commented at 8:18 pm on July 13, 2020: member
    Rebased after #19486 merge. Addressed comments.
  18. DrahtBot removed the label Needs rebase on Jul 13, 2020
  19. MarcoFalke commented at 7:28 am on July 14, 2020: member
    Looks like the fuzzers don’t compile on some of the commits
  20. naumenkogs commented at 7:55 am on July 14, 2020: member
    Concept ACK.
  21. sipa force-pushed on Jul 14, 2020
  22. sipa force-pushed on Jul 14, 2020
  23. sipa commented at 5:32 pm on July 14, 2020: member
    @dongcarl Agree - I’ve added more comments to explain the feature for now, but a more comprehensive overview of all the things the serialization framework supports would be a nice further improvement. @MarcoFalke Hopefully fixed. I also added all variants of deserialization to the fuzzers.
  24. sipa force-pushed on Jul 14, 2020
  25. sipa force-pushed on Jul 14, 2020
  26. vasild commented at 9:35 am on July 15, 2020: contributor

    Code coverage report for this PR from unit + functional tests (lines not modified by this PR are dimmed and files not modified by this PR are omitted from the report):

  27. jnewbery commented at 9:48 am on July 15, 2020: member

    just net_processing.cpp:3942 is “interesting” (it is modified by this PR and not covered by tests).

    That line is pretty much a no-op and could be removed. It’s ensuring that we never send more than 1000 CAddress records in a single ADDR message. However, the ADDR message is populated from vAddrToSend, which itself is limited to 1000 entries (see the PushAddress() function). So we could just remove that entire if block (and potentially replace it with an assert(vAddr.size() <= 1000) after the for loop).

  28. in src/serialize.h:219 in 3d3b095434 outdated
    220- * has the advantage that the constness of the object becomes a template parameter, and
    221- * thus allows a single implementation that sees the object as const for serializing
    222- * and non-const for deserializing, without casts.
    223+/** Variant of FORMATTER_METHODS that supports a declared parameter type.
    224+ *
    225+ * With a formatter has a declared parameter type, it must be invoked directly or
    


    jnewbery commented at 10:11 am on July 15, 2020:
    s/With/When ?

    sipa commented at 11:26 pm on July 16, 2020:
    “If” is even better, I think? Done.
  29. in src/protocol.h:380 in 3d3b095434 outdated
    375     explicit CAddress(CService ipIn, ServiceFlags nServicesIn) : CService{ipIn}, nServices{nServicesIn} {};
    376 
    377-    SERIALIZE_METHODS(CAddress, obj)
    378+    SERIALIZE_METHODS_PARAMS(CAddress, obj, AddrSerialization, fmt)
    379     {
    380         SER_READ(obj, obj.nTime = TIME_INIT);
    


    jnewbery commented at 1:27 pm on July 15, 2020:
    I think this line can be removed. nTime is already initialized on construction so this is a no-op.

    MarcoFalke commented at 4:41 pm on July 15, 2020:
    This has been added here on purpose: #19020 (review)


    vasild commented at 6:31 pm on July 16, 2020:

    Could it happen that we unserialize into the same object twice? E.g.

    0s >> addr;
    1... do something with addr, possibly change nTime ...
    2s >> addr;
    3// if we don't explicitly set nTime during unserialization,
    4// then here we may end up with some leftover nTime?
    

    sipa commented at 6:34 pm on July 16, 2020:
    Yes, that’s the reason to prefer fully initializing all fields when deserializing (or at least, all fields that are subject to serialization in the first place).

    sipa commented at 10:59 pm on July 16, 2020:

    @jnewbery I think in general we should aim to support deserializing into an already existing (and not just default-initialized) objects, as it’s hard to enforce the opposite.

    CAddrInfo is a bit of a special case, as its memory-only fields are “managed” by CAddrMan more than being a standalone object. In the other cases, I think it makes sense to make deserialization reset those fields.


    jnewbery commented at 8:42 am on July 17, 2020:
    Thanks for the additional background. My immediate reaction is that it seems dangerous to re-deserialize into an existing object and that it could lead to subtle, confusing bugs. However, we’ve definitely wondered off-topic for this PR, so I’m going to mark this conversation as resolved. If the expectation really is that the deserialization methods overwrite all fields, a future PR could update the documentation for SERIALIZE_METHODS to explicitly state that.

    ryanofsky commented at 10:40 pm on July 22, 2020:

    re: #19503 (review)

    Thanks for the additional background. My immediate reaction is that it seems dangerous to re-deserialize into an existing object and that it could lead to subtle, confusing bugs.

    FWIW, this is my opinion too. I think it would be nice to figure out some SERIALIZE_METHODS alternative for new code that adds a serialize method and a deserialize_type constructor, instead of serialize and unserialize methods.

  30. in src/protocol.h:366 in 3d3b095434 outdated
    366     static constexpr uint32_t TIME_INIT{100000000};
    367 
    368+    /** The disk serialization for CAddress includes a version number.
    369+     *  Traditionally the number CLIENT_VERSION was used, but it has since
    370+     *  been disentangled from it. */
    371+    static constexpr uint32_t DISK_VERSION{210000};
    


    jnewbery commented at 2:11 pm on July 15, 2020:
    Any reason for this magic number other than blocks-per-epoch? Wouldn’t it be better to set this at 0 initially so we have maximum flexibility for this field?

    MarcoFalke commented at 4:42 pm on July 15, 2020:
    It is the CLIENT_VERSION of the next major release. The number should be equal to or greater than the largest CLIENT_VERSION of any previously released version.
  31. in src/protocol.h:380 in 3d3b095434 outdated
    379     {
    380         SER_READ(obj, obj.nTime = TIME_INIT);
    381-        int nVersion = s.GetVersion();
    382-        if (s.GetType() & SER_DISK) {
    383-            READWRITE(nVersion);
    384+        switch (fmt) {
    


    jnewbery commented at 2:19 pm on July 15, 2020:

    I actually preferred this as a series of if/else if blocks. A switch statement seems appropriate when the different cases are disjoint, which might not always be the case here. There could be some future param that adds a field to the end of the serialization, which would need to be included in this switch statement, but would also require an if statement at the end.

    Both are fine, but making this a switch seems needlessly verbose.


    sipa commented at 11:26 pm on July 16, 2020:
    I agree. I’ve reverted it to the if/then/else style.
  32. jnewbery commented at 3:29 pm on July 15, 2020: member
    concept ACK and code review ACK of the changes in protocol.h. The syntax/interface for WithParams() and SERIALIZE_METHODS_PARAMS seems reasonable, but I find it very difficult to understand the template code in serialize.h so I wasn’t able to give it a thorough review.
  33. vasild commented at 3:30 pm on July 15, 2020: contributor

    In order to check that this works as intended and to understand it better I wrote some unit tests that exercise it.

    Some observations, correct me if I am wrong:

    • I think it is impossible to use two separate methods Serialize() and Unserialize() with this mechanism and one has to use:
    0SERIALIZE_METHODS_PARAMS(...)
    1{
    2    if (ser_action.ForRead()) {
    3        ... all unserialize goes here ...
    4    } else {
    5        ... all serialize goes here ...
    6    }
    7}
    
    • If there are two classes Base and Derived (the second one derived from the first), both using this mechanism, then for the derived class one has to create a formatter config that specifies how to format both Base and Derived:
     0Derived derived;
     1BaseAndDerivedFormat fmt;
     2
     3fmt.m_base_format = ...
     4fmt.m_derived_format = ...
     5
     6stream << WithParams(fmt, derived);
     7
     8// and then in `Derived`:
     9
    10SERIALIZE_METHODS_PARAMS(Derived, obj, DerivedAndBaseFormat, fmt)
    11{
    12    READWRITE(WithParams(fmt.m_base_format, AsBase<Base>(obj)));
    13
    14    ... ser/unser this Derived object according to fmt.m_derived_format ...
    15}
    
      0#include <util/strencodings.h>
      1
      2#include <algorithm>
      3#include <ios>
      4#include <string>
      5
      6enum class BaseFormat {
      7    DEC,
      8    HEX,
      9};
     10
     11/// (Un)serialize a number as a string either from/to decimal or hexadecimal.
     12class Base
     13{
     14public:
     15    uint8_t m_base_data;
     16
     17    Base() : m_base_data(17) {}
     18    explicit Base(uint8_t data) : m_base_data(data) {}
     19
     20    SERIALIZE_METHODS_PARAMS(Base, obj, BaseFormat, fmt)
     21    {
     22        if (ser_action.ForRead()) {
     23            std::string str;
     24            s >> str;
     25            const uint8_t data = (uint8_t)std::stoul(str, 0, fmt == BaseFormat::DEC ? 10 : 16);
     26            SER_READ(obj, obj.m_base_data = data);
     27        } else {
     28            s << strprintf(fmt == BaseFormat::DEC ? "%02hhu" : "%02hhX", obj.m_base_data);
     29        }
     30    }
     31};
     32
     33class DerivedAndBaseFormat
     34{
     35public:
     36    BaseFormat m_base_format;
     37
     38    enum class DerivedFormat {
     39        LOWER,
     40        UPPER,
     41    } m_derived_format;
     42};
     43
     44class Derived : public Base
     45{
     46public:
     47    std::string m_derived_data;
     48
     49    SERIALIZE_METHODS_PARAMS(Derived, obj, DerivedAndBaseFormat, fmt)
     50    {
     51        READWRITE(WithParams(fmt.m_base_format, AsBase<Base>(obj)));
     52
     53        if (ser_action.ForRead()) {
     54            std::string str;
     55            s >> str;
     56            SER_READ(obj, obj.m_derived_data = str);
     57        } else {
     58            s << (fmt.m_derived_format == DerivedAndBaseFormat::DerivedFormat::LOWER ?
     59                      ToLower(obj.m_derived_data) :
     60                      ToUpper(obj.m_derived_data));
     61        }
     62    }
     63};
     64
     65BOOST_AUTO_TEST_CASE(with_params_base)
     66{
     67    Base b{15};
     68
     69    CDataStream stream(SER_DISK, PROTOCOL_VERSION);
     70
     71    stream << WithParams(BaseFormat::DEC, b);
     72    BOOST_CHECK_EQUAL(stream.str(), "\2" "15");
     73
     74    b.m_base_data = 0;
     75    stream >> WithParams(BaseFormat::DEC, b);
     76    BOOST_CHECK_EQUAL(b.m_base_data, 15);
     77
     78    stream.clear();
     79
     80    stream << WithParams(BaseFormat::HEX, b);
     81    BOOST_CHECK_EQUAL(stream.str(), "\2" "0F");
     82
     83    b.m_base_data = 0;
     84    stream >> WithParams(BaseFormat::HEX, b);
     85    BOOST_CHECK_EQUAL(b.m_base_data, 0x0F);
     86}
     87
     88BOOST_AUTO_TEST_CASE(with_params_vector_of_base)
     89{
     90    std::vector<Base> v{Base{15}, Base{255}};
     91
     92    CDataStream stream(SER_DISK, PROTOCOL_VERSION);
     93
     94    stream << WithParams(BaseFormat::DEC, v);
     95    BOOST_CHECK_EQUAL(stream.str(), "\2\2" "15" "\3" "255");
     96
     97    v[0].m_base_data = 0;
     98    v[1].m_base_data = 0;
     99    stream >> WithParams(BaseFormat::DEC, v);
    100    BOOST_CHECK_EQUAL(v[0].m_base_data, 15);
    101    BOOST_CHECK_EQUAL(v[1].m_base_data, 255);
    102
    103    stream.clear();
    104
    105    stream << WithParams(BaseFormat::HEX, v);
    106    BOOST_CHECK_EQUAL(stream.str(), "\2\2" "0F" "\2" "FF");
    107
    108    v[0].m_base_data = 0;
    109    v[1].m_base_data = 0;
    110    stream >> WithParams(BaseFormat::HEX, v);
    111    BOOST_CHECK_EQUAL(v[0].m_base_data, 0x0F);
    112    BOOST_CHECK_EQUAL(v[1].m_base_data, 0xFF);
    113}
    114
    115BOOST_AUTO_TEST_CASE(with_params_derived)
    116{
    117    Derived d;
    118    d.m_base_data = 15;
    119    d.m_derived_data = "xY";
    120
    121    DerivedAndBaseFormat fmt;
    122
    123    CDataStream stream(SER_DISK, PROTOCOL_VERSION);
    124
    125    fmt.m_base_format = BaseFormat::DEC;
    126    fmt.m_derived_format = DerivedAndBaseFormat::DerivedFormat::LOWER;
    127    stream << WithParams(fmt, d);
    128
    129    fmt.m_base_format = BaseFormat::HEX;
    130    fmt.m_derived_format = DerivedAndBaseFormat::DerivedFormat::UPPER;
    131    stream << WithParams(fmt, d);
    132
    133    BOOST_CHECK_EQUAL(stream.str(), "\2" "15" "\2" "xy" "\2" "0F" "\2" "XY");
    134}
    
  34. in src/serialize.h:1215 in 3d3b095434 outdated
    1206+    T m_object;
    1207+
    1208+public:
    1209+    explicit ParamsWrapper(const Params& params, T obj) : m_params(params), m_object(obj) {}
    1210+
    1211+    //! Serialize to another ParamsStream: optimize by skipping it.
    


    jnewbery commented at 3:38 pm on July 15, 2020:
    This may be because I don’t have a good understanding of serialize.h in general so perhaps you can just ignore this, but I don’t understand what’s being skipped here.

    sipa commented at 4:47 pm on July 15, 2020:
    It’s an attempt at avoiding crazy template expansion in the compiler when multiple levels of serializers each use WithParams (the Stream object that would hit the bottom would be ParamsStream<ParamsInner, ParamsStream<ParamsMid, ParamsStream<ParamsOuter, BaseStream»>, and all its methods need instantiating). This optimization will just result in ParamsStream<ParamsInner, BaseStream>.
  35. in src/protocol.h:391 in 3d3b095434 outdated
    392-            (nVersion != INIT_PROTO_VERSION && !(s.GetType() & SER_GETHASH))) {
    393+        case AddrSerialization::NETWORK_WITHTIME:
    394+            READWRITE(obj.nTime);
    395+            break;
    396+        case AddrSerialization::NETWORK_NOTIME:
    397             // The only time we serialize a CAddress object without nTime is in
    


    jnewbery commented at 3:44 pm on July 15, 2020:
    Thinking about it a bit more, I actually think this whole comment can be removed. It made sense when the serialization was based on a protocol version that we don’t support anywhere else (and was added to document something that caused confusion for reviewers: #14033 (comment)), but when the serialization is based on a parameter that we pass in explicitly, it’s easy enough to grep for that parameter to see where it’s used.

    sipa commented at 11:25 pm on July 16, 2020:
    Done.
  36. sipa commented at 4:34 pm on July 15, 2020: member

    @vasild:

    • It should be possible to use separate Serialize and Unserialize member functions. Do you have a specific example that does not work?

    • This should not work in general:

    0SERIALIZE_METHODS_PARAMS(...)
    1{
    2    if (ser_action.ForRead()) {
    3        ... all unserialize goes here ...
    4    } else {
    5        ... all serialize goes here ...
    6    }
    7}
    

    because *this will be const during serialization, which means that the “ForRead” branch would not compile in the Serialize version. You can use SER_READ or SER_WRITE though, to specify statements that are only executed during reading or writing (in a type-safe way).

    • For Base/Derived classes, an alternative is having DerivedParams inherit from BaseParams (it can’t be an enum in that case, though), or otherwise have a conversion operator from one to the other. In that case you don’t need WithParams to transmute the parameters.
  37. MarcoFalke commented at 4:45 pm on July 15, 2020: member
    Will review when travis is green
  38. sipa commented at 4:58 pm on July 15, 2020: member
    @MarcoFalke I’ve been pushing fixes, and GitHub’s UI doesn’t seem to be picking them up…
  39. sipa commented at 4:59 pm on July 15, 2020: member
  40. MarcoFalke commented at 5:04 pm on July 15, 2020: member
    No idea what’s going on. Since this has no reviews, you could try rebasing once more on master and force push? :man_shrugging:
  41. sipa force-pushed on Jul 15, 2020
  42. sipa force-pushed on Jul 15, 2020
  43. sipa commented at 5:26 pm on July 15, 2020: member
    Ah it was just behind; it picked them up now.
  44. in src/serialize.h:1252 in f69c79636e outdated
    1248@@ -1249,3 +1249,4 @@ template<typename Params, typename T>
    1249 static inline ParamsWrapper<Params, T&> WithParams(const Params& params, T&& t) { return ParamsWrapper<Params, T&>(params, t); }
    1250 
    1251 #endif // BITCOIN_SERIALIZE_H
    1252+
    


    MarcoFalke commented at 6:02 pm on July 15, 2020:
    Can be removed now

    sipa commented at 0:51 am on July 16, 2020:
    Done.
  45. MarcoFalke commented at 7:47 pm on July 15, 2020: member
    Travis still failing in the fuzzer. Also, needs rebase after #19360 (review)
  46. MarcoFalke added the label Needs rebase on Jul 15, 2020
  47. sipa force-pushed on Jul 15, 2020
  48. DrahtBot removed the label Needs rebase on Jul 15, 2020
  49. jnewbery commented at 9:22 pm on July 15, 2020: member
    I think this requires a change to netaddress.h now that #19360 has been merged: https://github.com/bitcoin/bitcoin/pull/19360/files#diff-76d15e11a95be7a4aee1eb89de6098caR165
  50. sipa force-pushed on Jul 15, 2020
  51. DrahtBot added the label Needs rebase on Jul 16, 2020
  52. DrahtBot commented at 7:31 am on July 16, 2020: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  53. vasild commented at 1:06 pm on July 16, 2020: contributor

    Alright, so this:

     0    SERIALIZE_METHODS_PARAMS(Base, obj, BaseFormat, fmt)
     1    {
     2        if (ser_action.ForRead()) {
     3            std::string str;
     4            s >> str;
     5            const int base = fmt == BaseFormat::DEC ? 10 : 16;
     6            const uint8_t data = (uint8_t)std::stoul(str, 0, base);
     7            SER_READ(obj, obj.m_base_data = data);
     8        } else {
     9            s << strprintf(fmt == BaseFormat::DEC ? "%02hhu" : "%02hhX", obj.m_base_data);
    10        }
    11    }
    

    is equivalent to this:

     0    template <typename Stream>
     1    void Serialize(Stream& s) const
     2    {
     3        s << strprintf(s.GetParams() == BaseFormat::DEC ? "%02hhu" : "%02hhX", m_base_data);
     4    }
     5
     6    template <typename Stream>
     7    void Unserialize(Stream& s)
     8    {
     9        std::string str;
    10        s >> str;
    11        const int base = s.GetParams() == BaseFormat::DEC ? 10 : 16;
    12        m_base_data = (uint8_t)std::stoul(str, 0, base);
    13    }
    

    I pushed the unit tests that exercise this feature at https://github.com/vasild/bitcoin/commit/8d89c1d43, feel free to pick (some of) them into this PR if you find them interesting, or if they may serve as an example of how to use this.

  54. in src/serialize.h:242 in fdf66d3258 outdated
    243+ *
    244+ * Note that WithParams(parameter, obj) can be invoked anywhere in the call
    245+ * stack; it is passed down recursively into all serialization code, until
    246+ * another WithParams overrides it.
    247+ *
    248+ * Compilation will fail in any context where serialization is invoke but
    


    vasild commented at 1:17 pm on July 16, 2020:
    s/is invoke/is invoked/

    sipa commented at 11:21 pm on July 16, 2020:
    Done.
  55. in src/net_processing.cpp:480 in fdf66d3258 outdated
    476@@ -477,7 +477,7 @@ static void PushNodeVersion(CNode& pnode, CConnman* connman, int64_t nTime)
    477     CAddress addrYou = (addr.IsRoutable() && !IsProxy(addr) ? addr : CAddress(CService(), addr.nServices));
    478     CAddress addrMe = CAddress(CService(), nLocalNodeServices);
    479 
    480-    connman->PushMessage(&pnode, CNetMsgMaker(INIT_PROTO_VERSION).Make(NetMsgType::VERSION, PROTOCOL_VERSION, (uint64_t)nLocalNodeServices, nTime, addrYou, addrMe,
    481+    connman->PushMessage(&pnode, CNetMsgMaker(INIT_PROTO_VERSION).Make(NetMsgType::VERSION, PROTOCOL_VERSION, (uint64_t)nLocalNodeServices, nTime, WithParams(AddrSerialization::NETWORK_NOTIME, addrYou), WithParams(AddrSerialization::NETWORK_NOTIME, addrMe),
    


    vasild commented at 1:33 pm on July 16, 2020:

    nit: this line is 257 columns and then it continues on the next line for another 103 columns. Could it be made more readable somehow?

    It does not need to pass INIT_PROTO_VERSION anymore to skip serializing the time, but I guess it cannot be simplified.


    sipa commented at 11:21 pm on July 16, 2020:

    Yeah, it was ridiculously long. I’ve split it up.

    Indeed, INIT_PROTO_VERSION is no longer needed here, but I think it’s better to just get rid of all of these version numbers once everything has been updated to not using them anymore.

  56. in src/protocol.h:357 in fdf66d3258 outdated
    353@@ -354,34 +354,45 @@ static inline bool MayHaveUsefulAddressDB(ServiceFlags services)
    354     return (services & NODE_NETWORK) || (services & NODE_NETWORK_LIMITED);
    355 }
    356 
    357+enum class AddrSerialization {
    


    vasild commented at 6:07 pm on July 16, 2020:
    nit, feel free to ignore: since this is also used in unserialization, the name could be misleading. AddrFormat?

    sipa commented at 11:20 pm on July 16, 2020:
    I’ve changed it to CAddress::Format.
  57. vasild approved
  58. vasild commented at 6:08 pm on July 16, 2020: contributor

    ACK fdf66d325

    Just some nits below and another nit at #19503 (review)

  59. Replace READWRITEAS macro with AsBase wrapping function 031d6db27c
  60. Support for serialization parameters ee70e2dabb
  61. Disentangle disk address version from client version 12a5a3c032
  62. Use serialization parameters for CAddress serialization 21446c5b5d
  63. Add deserialization fuzzers for address variants 4406d9abce
  64. sipa force-pushed on Jul 16, 2020
  65. sipa commented at 11:25 pm on July 16, 2020: member
    Rebased, added a few more comments, renamed AddrSerialization to CAddress::Format, and included @vasild’s unit test.
  66. sipa force-pushed on Jul 16, 2020
  67. test: add tests that exercise WithParams() 483ed45741
  68. sipa force-pushed on Jul 17, 2020
  69. fanquake removed the label Needs rebase on Jul 17, 2020
  70. vasild approved
  71. vasild commented at 1:00 pm on July 17, 2020: contributor

    ACK 483ed4574

    Thanks for adding input validation to the hex/dec examples in the unit tests!

  72. jonatack commented at 4:05 pm on July 22, 2020: contributor
    ACK 483ed45 code review over several sessions this week, verified the build at each commit, ran the 3 address deserialization fuzz tests as a sanity check. Thanks for the unit test coverage – a good addition and demonstrates use of the feature. Edit: am running a node with this branch.
  73. in src/test/serialize_tests.cpp:404 in 483ed45741
    399+            s >> str;
    400+            assert(str.size() == 2 || (fmt == BaseFormat::DEC && str.size() == 3));
    401+            uint32_t data;
    402+            bool ok;
    403+            if (fmt == BaseFormat::DEC) {
    404+                ok = ParseUint32(str, &data);
    


    ryanofsky commented at 11:02 pm on July 22, 2020:

    In commit “test: add tests that exercise WithParams()” (483ed457411d164ad71115dbfe1d2b4ff8b2e5f3)

    Think this needs to capitalize Int: ParseUInt32

  74. ryanofsky approved
  75. ryanofsky commented at 11:04 pm on July 22, 2020: contributor

    Code review ACK 483ed457411d164ad71115dbfe1d2b4ff8b2e5f3.

    This seems all right, but I think it’s significantly more complex than it needs to because it is making serialization with params something that can be done in conjunction with using a formatter, instead of just a special case of formatting. Simplification along lines of 106d8039b61ff4b771dc996e3ab9156019ba4470 branch) seems to work (only ran tests locally).

  76. sipa commented at 11:45 pm on July 22, 2020: member

    @ryanofsky That approach works and is a lot simpler, but I think it misses the “automatic recursion” property I was hoping for, that you could e.g. serialize a CBlock with specified CTransaction parameters, and it would transparently pass those parameters down. Or a block could define its own type that can be converted to the CTransaction parameters type, but not need to anything beyond that.

    Whether that’s a desirable depends how much you treat a higher-level structure as openly containing the lower-level structure, but I think in general most of our complex serializable data structures are like that.

  77. MarcoFalke commented at 1:07 pm on August 4, 2020: member

    re-ACK 483ed457411d164ad71115dbfe1d2b4ff8b2e5f3, though I did not review the serialize.h changes 🚤

    Changes since last review:

    • Rebase
    • Added comments/doc
    • Fix “off-by-one” issue/typo bug in commit “12a5a3c032 Disentangle disk address version from client version”
    • Properly deserialize addrFrom without time (Bugfix in commit “21446c5b5d Use serialization parameters for CAddress serialization”)
    • Fix and extend fuzzers
    • Add tests from vasild

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3re-ACK 483ed457411d164ad71115dbfe1d2b4ff8b2e5f3, though I did not review the serialize.h changes 🚤
     4
     5Changes since last review:
     6* Rebase
     7* Added comments/doc
     8* Fix "off-by-one" issue/typo bug in commit "12a5a3c032 Disentangle disk address version from client version"
     9* Properly deserialize addrFrom without time (Bugfix in commit "21446c5b5d Use serialization parameters for CAddress serialization")
    10* Fix and extend fuzzers
    11* Add tests from vasild
    12-----BEGIN PGP SIGNATURE-----
    13
    14iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
    15pUiZTQwAoa37uqI1WTtHnkBHX/T8VKa36Ba6GzwdKgCSLVl7uMr0qw5jL/UrHuG+
    16tHm4W6GH6fuxYSblm+r7ujigrAkD4/3gPsbOG5mHaYTDOtiL9/ZTHT4HT/nbyZBL
    17KmaVaLHihfZy8NS05IG8y4sB97ZWXmiAak4An6NdfmjtB5ahWVH5HuLxLERIM7nb
    18yLjW/5GVbnnMvhirK/nyMcAAXSlTY5twmjT2LYWQ0TnJnNF0djY0VMkvNF8RrvgZ
    19k3zVnE0J+MW5mv2KcKRCYouGXc+WK777s5gc7bVOfP0J0So9i20LP1pFGqdR+FhV
    200WF0aO27EJVePxev36zdaF4CuEFadP99NN5fKnY6rUwpd7+3hIRWZOH2958bbaSH
    213oE+M5tLKd+qsAIZ+Lz7W3XU/oK9DE3v8YRasuLzgAFXoiZVkYnISWp/N+FZv+/6
    22rhAFSyF45HGFkEAa40KdzPV/aT+LWi9UnZe+wa1RtOoiM25jDMTbyLBJiJNz3EoT
    23vjuegm6r
    24=lpi2
    25-----END PGP SIGNATURE-----
    

    Timestamp of file with hash c6299c40a36d62d1a32997018080b37d4feeef88382e17059f749fec43043e26 -

  78. MarcoFalke commented at 1:08 pm on August 4, 2020: member

    Note to myself, if this gets merged: It would be good to remove the now unused protocol version includes. I.e.

     0commit 7ffff5b3237fa418637d30ddb59534ab47d3f610 (HEAD)
     1Author: MarcoFalke <falke.marco@gmail.com>
     2Date:   Tue Aug 4 14:58:25 2020 +0200
     3
     4    Remove unused protocol version header from protocol.h
     5
     6diff --git a/src/bench/rpc_blockchain.cpp b/src/bench/rpc_blockchain.cpp
     7index 4b45264a3c..09bfcb60a5 100644
     8--- a/src/bench/rpc_blockchain.cpp
     9+++ b/src/bench/rpc_blockchain.cpp
    10@@ -8,6 +8,7 @@
    11 #include <rpc/blockchain.h>
    12 #include <streams.h>
    13 #include <validation.h>
    14+#include <version.h> // For PROTOCOL_VERSION
    15 
    16 #include <univalue.h>
    17 
    18diff --git a/src/protocol.h b/src/protocol.h
    19index a564508f89..58669e20b5 100644
    20--- a/src/protocol.h
    21+++ b/src/protocol.h
    22@@ -14,7 +14,6 @@
    23 #include <primitives/transaction.h>
    24 #include <serialize.h>
    25 #include <uint256.h>
    26-#include <version.h>
    27 
    28 #include <stdint.h>
    29 #include <string>
    

    and

     0commit 4550d37e268e7ffe34b3f5ea11d6b3f231d9065a
     1Author: MarcoFalke <falke.marco@gmail.com>
     2Date:   Tue Aug 4 12:56:55 2020 +0200
     3
     4    Remove unused protocol version header from undo.h
     5
     6diff --git a/src/undo.h b/src/undo.h
     7index a98f046735..1fb9ac0688 100644
     8--- a/src/undo.h
     9+++ b/src/undo.h
    10@@ -11,7 +11,6 @@
    11 #include <consensus/consensus.h>
    12 #include <primitives/transaction.h>
    13 #include <serialize.h>
    14-#include <version.h>
    15 
    16 /** Formatter for undo information for a CTxIn
    17  *
    
  79. DrahtBot added the label Needs rebase on Aug 17, 2020
  80. DrahtBot commented at 11:55 am on August 17, 2020: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  81. in src/protocol.h:369 in 483ed45741
    377+    };
    378+
    379+    SERIALIZE_METHODS_PARAMS(CAddress, obj, Format, fmt)
    380     {
    381         SER_READ(obj, obj.nTime = TIME_INIT);
    382-        int nVersion = s.GetVersion();
    


    MarcoFalke commented at 3:57 pm on August 27, 2020:
    One include can now be removed, see #19503 (comment)
  82. MarcoFalke commented at 3:58 pm on August 27, 2020: member
    Still needs rebase (trivial)
  83. jonatack commented at 10:03 am on September 11, 2020: contributor
    Needs rebase.
  84. vasild commented at 8:30 am on October 22, 2020: contributor

    This would resolve #19477.

    Also, ADDRV2_FORMAT could be replaced with the technique introduced in this PR.

  85. MarcoFalke commented at 2:21 pm on November 25, 2020: member
    Could also use LIFETIMEBOUND as per #19387 (comment)?
  86. DrahtBot commented at 11:21 am on December 15, 2021: contributor
    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  87. MarcoFalke commented at 12:19 pm on December 15, 2021: member
    @sipa Let us know if someone should pick this up. I might take a stab at it.
  88. sipa commented at 3:59 pm on December 19, 2021: member
    @MarcoFalke Not going to work on this for the time being.
  89. sipa closed this on Dec 19, 2021

  90. sipa added the label Up for grabs on Dec 19, 2021
  91. fanquake commented at 8:46 am on August 8, 2022: member
    Removing up for grabs as this has mostly been picked up. i.e #25284.
  92. fanquake removed the label Up for grabs on Aug 8, 2022
  93. bitcoin locked this on Aug 8, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me