Make Transport independent of CNetMessage and CSerializedNetMsg #30209

issue Sjors openend this issue on May 31, 2024
  1. Sjors commented at 8:26 am on May 31, 2024: member

    Currently class Transport uses CNetMessage in one place:

    https://github.com/bitcoin/bitcoin/blob/62f7f59ff495fbcbfc10c25e97bb0dc032647abf/src/net.h#L278-L285

    And CSerializedNetMsg here:

    https://github.com/bitcoin/bitcoin/blob/62f7f59ff495fbcbfc10c25e97bb0dc032647abf/src/net.h#L287-L295

    This creates a problem in my Stratum v2 Template Provider implementation in #29432 which introduces class Sv2Transport : Transport. Because the sv2 noise protocol (#29346) is very similar to BIP324 I simply copied V2Transport and modified the state machine a little bit. The test and fuzz code was also very reusable.

    But the above two methods don’t fit because of their use of CNetMessage and CSerializedNetMsg. Stratum v2 messages (Sv2NetMsg in #29432) don’t inherit from CNetMessage and I don’t think they should.

    So I’d like to refactor Transport to just return (decrypted) bytes and have the caller interpret them (as CNetMessage or Sv2NetMsg).

    Additionally I’m not sure what to do with the std::chrono::microseconds time, bool& reject_message arguments. So far I don’t use them in Sv2Transport.

    • reject_message: in v1 the transport sets this if the checksum fails. In v2 it’s used for the v1 fallback, but also for unrecognized msg types. This can be done by the caller after my proposed refactor, so the argument can go away.
    • time: this is passed into GetReceivedMessage by the caller and used to set msg.m_time = time;, so can also go away.

    From IRC I understand @sipa had some ideas on how to tackle this. I can take a stab at it myself, but if someone else does it I’m happy to review.

  2. willcl-ark added the label P2P on May 31, 2024
  3. Sjors commented at 2:39 pm on May 31, 2024: member
    Easier said than done… very rough draft in #30213
  4. sipa commented at 3:49 pm on May 31, 2024: member

    Two ideas to address this.

    Make the command type implicitly part of the message being sent.

    So instead of taking a CSerializedNetMsg in SetMessageToSend, the interface would simply take an std::vector<std::byte> for example, which in some way stores the message type. For example, by convention in V1Transaction and V2Transport, the command type could be the first 12 bytes. So from the perspective of net_processing, sending e.g. a ping message would look like sending 20-byte message of which the first 12 are “ping\x00\x00…” and the last 8 are the nonce. Sending a message shorter than 12 bytes, or whose first 12 bytes do not look like a command string, would cause failure at send time.

    Similarly, GetReceivedMessage would return an std::vector<std::byte>, whose first 12 bytes are the command string, as again, for all intents and purposes, the higher-level user of V1Transport and V2Transport would think of that as what is being sent/received.

    The Sv2Transport class would simply not have any such expectations, and send the provided message without any special treatment of the first 12 bytes of the message.

    I think this is the simplest way of accomplishing some reusability across V2Transport and Sv2Transport, but it’s somewhat ugly as the convention of “first 12 bytes” is sort of implicitly defined, and some non-trivial part of V2Transport just wouldn’t have an equivalent in Sv2Transport.

    Split command handling into another layer

    Here I’m imagining two layers:

    • A RawTransport interface with implementations V2RawTransport and Sv2Transport, with a fully byte-like interface like the Transport in the previous idea, but also no notion of 12-byte prefixes, neither in interface nor implementation. For V2RawTransport, the message would just be the 1-to-13 byte encoding of the command + payload, exactly matching BIP324 excluding the application layer specification. There would not be a V1RawTransport, as there just isn’t an equivalent of this layer for V1.
    • A P2PMessageTransport interface with implementations V1Transport and V2Transport, with an interface like today’s Transport, using CNetMessage and command strings and all.. V2Transport would contain a V2RawTransport for doing the heavy lifting, and only adding the command message encoding on top of it. The sv2 protocol stack would not have an equivalent here, as it’s not a p2p message transport.

    I think this would be a cleaner separation, and V2RawTransport and Sv2Transport would be very similar. A downside is that it may be nontrivial to do this with clean interfaces without introducing additional copying/allocations. In addition, the separation means the two layers need to be tested separately (probably duplicating some test code for the V2 case) - this could be seen as an advantage too, but it means extra work.

  5. sipa commented at 3:52 pm on May 31, 2024: member
    I’m happy to try taking a stab at both. The second idea would take more time, I expect.
  6. Sjors commented at 4:45 pm on May 31, 2024: member
    Perhaps a third approach is to shoehorn Sv2...Msg into CNetMessage. I haven’t tried that yet. They’re currently all separate structs. In terms of elegance it’s probably no worse than option (1) above.
  7. sipa commented at 4:58 pm on May 31, 2024: member
    @Sjors I guess that works! It could just ignore the command name, and always decode that to “”?
  8. Sjors commented at 6:07 pm on May 31, 2024: member

    By “command” you mean m_type?

     0class CNetMessage
     1{
     2public:
     3    DataStream m_recv;                   //!< received message data
     4    std::chrono::microseconds m_time{0}; //!< time of message receipt
     5    uint32_t m_message_size{0};          //!< size of the payload
     6    uint32_t m_raw_message_size{0};      //!< used wire size of the message (including header/checksum)
     7    std::string m_type;
     8
     9    ...
    10};
    

    Sv2Transport could set m_type to the empty string. The Template Provider can construct an Sv2NetMsg from a serialized Sv2NetHeader and std::vector<std::uint8_t> payload. Since the header is fixed length, m_recv can have both header and payload.

    The Sv2NetHeader would get deserialised twice in this scheme, first by Sv2Transport to get the length and then again to construct Sv2NetMsg, but that’s trivial overhead.

    I won’t be able to work on this until next Friday or so, but will give it a try and close this if it works.

  9. bitcoin deleted a comment on Jun 6, 2024
  10. Sjors commented at 4:12 pm on June 20, 2024: member
    WIP in #30315
  11. vasild commented at 3:56 am on June 21, 2024: contributor
    Concept ACK

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me