[ZMQ] append a message sequence number to every ZMQ notification #7762

pull jonasschnelli wants to merge 2 commits into bitcoin:master from jonasschnelli:2016/03/zmq_seq changing 6 files +74 −13
  1. jonasschnelli commented at 12:39 pm on March 29, 2016: contributor

    Currently, ZMQ listeners cannot detect if they have missed a message. This PR adds a sequence number to each message (each message type has its own counter).

    The sequence number is the last part of the multipart zmq message to not break the API, though, we could consider breaking the API in favor of moving the sequence number to the very beginning.

    Todo:

    • update release notes
  2. jonasschnelli added the label RPC on Mar 29, 2016
  3. jonasschnelli force-pushed on Mar 29, 2016
  4. jonasschnelli force-pushed on Mar 29, 2016
  5. laanwj commented at 3:11 pm on March 29, 2016: member
    Concept ACK, I like adding the sequence nr as an extra part. Also ok with using 32 bit sequence numbers.
  6. promag commented at 1:11 am on March 30, 2016: member

    I thought a subscriber can’t miss a message.

    Edit:

    But it can: http://stackoverflow.com/a/15821036. So what can a subscriber do if it detects a missing message?

  7. jonasschnelli commented at 6:03 am on March 30, 2016: contributor
    @promag: I’m not sure if I would design an listening application that fully rely on getting all notifications. With this PR, you could at least detect if you got all (also check if you got the first) notifications and if not, you could poll/sync your data over RPC.
  8. laanwj commented at 8:08 am on March 30, 2016: member

    I thought a subscriber can’t miss a message.

    The way we are using ZMQ, the sender will never block. This is sensible as we don’t want to hang the application due to a slow client. But this means at some point one of the send queues will fill up and new messages will be discarded.

    But it can: http://stackoverflow.com/a/15821036. So what can a subscriber do if it detects a missing message?

    That’s up to the application. Either stop with a fatal error, or re-start/re-sync. An event application usually does some synchronization at start-up, then starts processing incremental changes. After getting out of sync it can repeat the process. The important thing is being able to detect it and thus act on it.

  9. laanwj commented at 7:49 am on April 14, 2016: member
    This needs mention in doc/release-notes.md as well as doc/zmq.md (there, don’t forget to specify the size and endianness of the sequence number), I think it is ready for merge otherwise.
  10. in qa/rpc-tests/zmq_test.py: in 4120cebd8a outdated
    47@@ -47,11 +48,18 @@ def run_test(self):
    48         print "listen..."
    49         msg = self.zmqSubSocket.recv_multipart()
    50         topic = str(msg[0])
    51+        assert_equal(topic, "hashtx")
    52         body = msg[1]
    53+        nseq = msg[2]
    54+        msgSequence = struct.unpack('<I', str(msg[-1]))[-1]
    


    MarcoFalke commented at 9:40 am on April 15, 2016:
    What is the reason for using str() here?
  11. jonasschnelli force-pushed on Apr 15, 2016
  12. [ZMQ] refactor message string de821d56e1
  13. jonasschnelli force-pushed on Apr 15, 2016
  14. jonasschnelli force-pushed on Apr 15, 2016
  15. jonasschnelli commented at 1:10 pm on April 15, 2016: contributor
    Fixed @MarcoFalke nit. Mentioned the change in doc/release-notes.md as well as in doc/zmq.md
  16. in qa/rpc-tests/zmq_test.py: in fefecf8b76 outdated
    47@@ -47,11 +48,17 @@ def run_test(self):
    48         print "listen..."
    49         msg = self.zmqSubSocket.recv_multipart()
    50         topic = msg[0]
    51+        assert_equal(topic, "hashtx")
    


    MarcoFalke commented at 1:25 pm on April 15, 2016:

    Nit: Add the b prefix, which is a noop in py2?

    0b"hashtx"
    

    MarcoFalke commented at 2:07 pm on April 15, 2016:
    Thanks for fixing!
  17. in contrib/zmq/zmq_sub.py: in fefecf8b76 outdated
    19@@ -19,18 +20,21 @@
    20         msg = zmqSubSocket.recv_multipart()
    21         topic = str(msg[0])
    22         body = msg[1]
    23-
    24+        sequence = "Unknown";
    25+        if len(msg[-1]) == 4:
    26+          msgSequence = struct.unpack('<I', str(msg[-1]))[-1]
    


    MarcoFalke commented at 1:26 pm on April 15, 2016:
    Nit: Any reason to use str()?
  18. jonasschnelli force-pushed on Apr 15, 2016
  19. jonasschnelli commented at 2:01 pm on April 15, 2016: contributor
    @MarcoFalke: fixed nits.
  20. in src/zmq/zmqpublishnotifier.cpp: in 50a825a423 outdated
    14+static const char *MSG_RAWBLOCK  = "rawblock";
    15+static const char *MSG_RAWTX     = "rawtx";
    16+
    17 // Internal function to send multipart message
    18-static int zmq_send_multipart(void *sock, const void* data, size_t size, ...)
    19+static int zmq_send_multipart_keepalive(void *sock, const void* data, size_t size, ...)
    


    laanwj commented at 8:36 am on April 19, 2016:
    Why the rename to _keepalive?

    jonasschnelli commented at 8:42 am on April 19, 2016:
    I renamed it to _keepalive because it does not “close” the multipart message. All messaged are sent with ZMQ_SNDMORE and I wanted to increase the awareness of that by renaming the function.
  21. laanwj commented at 1:29 pm on April 19, 2016: member
    Here’s a slight simplification of the code - gets rid of some duplication, as well as the _keepalive change: https://github.com/laanwj/bitcoin/commit/fb843df8bb1848a34b3bb7275c415a0f3143114b
  22. [ZMQ] append a message sequence number to every ZMQ notification 0b25a9fb42
  23. jonasschnelli force-pushed on Apr 19, 2016
  24. jonasschnelli commented at 1:34 pm on April 19, 2016: contributor
    @laanwj: Thanks. Much better. Stolen fb843df8bb1848a34b3bb7275c415a0f3143114b, squashed and force pushed.
  25. laanwj merged this on Apr 19, 2016
  26. laanwj closed this on Apr 19, 2016

  27. laanwj referenced this in commit a1eb344ba8 on Apr 19, 2016
  28. dcousens commented at 4:11 am on April 21, 2016: contributor

    The way we are using ZMQ, the sender will never block. This is sensible as we don’t want to hang the application due to a slow client.

    Could we make that parameterizable? In certain cases I’d rather have message reliability than having the node to be guaranteed to be running in real time.

  29. sipa commented at 4:39 am on April 21, 2016: member
    I don’t believe ZMQ ever guarantees reliability.
  30. laanwj commented at 12:28 pm on April 21, 2016: member

    In certain cases I’d rather have message reliability than having the node to be guaranteed to be running in real time.

    Well notifications can be lost, through zmq or otherwise, for example if bitcoind needs to be restarted, or a myriad of other circumstances not under your control. At least you can detect it now:

    • sequence number goes to 0 (and wasn’t at 0xffffffff) -> bitcoind restarted
    • sequence number skips a beat -> send buffer was full

    Your application needs an (application dependent) way to resync anyhow. This is better than pretending to guarantee something.

  31. zkbot referenced this in commit 36df5a92f8 on Feb 9, 2017
  32. zkbot referenced this in commit dd8b38316f on Feb 9, 2017
  33. zkbot referenced this in commit 253c610783 on Feb 9, 2017
  34. codablock referenced this in commit da06de97dc on Sep 16, 2017
  35. codablock referenced this in commit a84e0f0d64 on Sep 19, 2017
  36. codablock referenced this in commit c3e590968d on Dec 20, 2017
  37. CryptoCentric referenced this in commit 0d1ff35869 on Feb 15, 2019
  38. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-06 19:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me