[contrib] message-capture-parser: fix out of bounds error for empty vectors #25367

pull theStack wants to merge 1 commits into bitcoin:master from theStack:202206-contrib-fix_message_capture_out_of_bounds changing 1 files +2 −1
  1. theStack commented at 0:21 am on June 14, 2022: member

    The script message-capture-parser.py currently throws an “out of bounds” error if a message containing an empty integer vector element is tried to converted to JSON (e.g. by the BIP157 message cfcheckpt with empty FilterHeaders vector):

    0Traceback (most recent call last):
    1  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 217, in <module>
    2    main()
    3  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 202, in main
    4    process_file(str(capture), messages, "recv" in capture.stem, progress_bar)
    5  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 162, in process_file
    6    msg_dict["body"] = to_jsonable(msg)
    7  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 85, in to_jsonable
    8    elif slot in HASH_INT_VECTORS and isinstance(val[0], int):
    9IndexError: list index out of range
    

    Fix this by using the all(...) predicate rather to access the first element val[0] (which in the error case doesn’t exist).

  2. theStack commented at 0:23 am on June 14, 2022: member

    Can be reproduced with the following input file:

     0$ xxd -p msgs_sent.dat
     10b74e2d1e1de050076657273696f6e000000000072000000801101004904
     2000000000000812d7e620000000000000000000000000000000000000000
     3000000000000000000004904000000000000000000000000000000000000
     4000000000000508c4c90436c3d431c2f5361746f7368693a32332e39392e
     53028746573746e6f646530292f0000000001f293e2d1e1de050077747869
     66472656c61790000000000008796e2d1e1de050073656e64616464727632
     70000000000007397e2d1e1de050076657261636b00000000000000000000
     8c2a1e2d1e1de050073656e64686561646572730000000000e1a3e2d1e1de
     9050073656e64636d706374000000090000000002000000000000005fa4e2
    10d1e1de050073656e64636d70637400000009000000000100000000000000
    1117a7e2d1e1de050070696e670000000000000000080000006d37706c0c0f
    12cf9aedaae2d1e1de05006765746865616465727300004500000080110100
    130106226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf1
    1488910f000000000000000000000000000000000000000000000000000000
    150000000000e1b5e2d1e1de050066656566696c7465720000000800000035
    16f08b00000000000d12e3d1e1de0500706f6e670000000000000000080000
    170001000000000000003c1de4d1e1de05006366636865636b707400000022
    180000000006226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2
    19b73cf188910f00bc20e4d1e1de0500706f6e670000000000000000080000
    20000200000000000000
    

    (Save the hex-lines into a file and convert it into the dump via $ xxd -p -r hexdump.txt > msgs_sent.dat).

  3. DrahtBot added the label Scripts and tools on Jun 14, 2022
  4. message-capture-parser: fix out of bounds error for empty vectors 42bbbba7c8
  5. in contrib/message-capture/message-capture-parser.py:82 in 3e2831d5ee outdated
    78@@ -79,7 +79,7 @@ def to_jsonable(obj: Any) -> Any:
    79             val = getattr(obj, slot, None)
    80             if slot in HASH_INTS and isinstance(val, int):
    81                 ret[slot] = ser_uint256(val).hex()
    82-            elif slot in HASH_INT_VECTORS and isinstance(val[0], int):
    83+            elif slot in HASH_INT_VECTORS and all(isinstance(a, int) for a in val):
    


    laanwj commented at 5:13 am on June 14, 2022:
    What do we want to happen in the case where val[0] is an integer and (some of) the other elements are not? Should this be an explicit error, or is it fine to implicitly fall through to to_jsonable?

    theStack commented at 12:23 pm on June 14, 2022:

    Good question. On master, this scenario would lead to a (confusing) error right now, since calling ser_uint256(...) with a non-integer type fails:

     0diff --git a/contrib/message-capture/message-capture-parser.py b/contrib/message-capture/message-capture-parser.py
     1index 9988478f1b..a6bac539de 100755
     2--- a/contrib/message-capture/message-capture-parser.py
     3+++ b/contrib/message-capture/message-capture-parser.py
     4@@ -168,6 +168,12 @@ def process_file(path: str, messages: List[Any], recv: bool, progress_bar: Optio
     5
     6
     7 def main():
     8+    from test_framework.messages import msg_cfcheckpt
     9+    invalid_msg = msg_cfcheckpt()
    10+    invalid_msg.headers = [23, 42, "foobar"]
    11+    print(to_jsonable(invalid_msg))
    12+    return 1
    13+
    14     parser = argparse.ArgumentParser(
    15         description=__doc__,
    16         epilog="EXAMPLE \n\t{0} -o out.json <data-dir>/message_capture/**/*.dat".format(sys.argv[0]),
    
     0$ ./contrib/message-capture/message-capture-parser.py
     1Traceback (most recent call last):
     2  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 220, in <module>
     3    main()
     4  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 174, in main
     5    print(to_jsonable(invalid_msg))
     6  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 83, in to_jsonable
     7    ret[slot] = [ser_uint256(a).hex() for a in val]
     8  File "/home/honey/bitcoin/./contrib/message-capture/message-capture-parser.py", line 83, in <listcomp>
     9    ret[slot] = [ser_uint256(a).hex() for a in val]
    10  File "/home/honey/bitcoin/./contrib/message-capture/../../test/functional/test_framework/messages.py", line 117, in ser_uint256
    11    rs += struct.pack("<I", u & 0xFFFFFFFF)
    12TypeError: unsupported operand type(s) for &: 'bytes' and 'int'
    

    On the PR it falls through to to_jsonable:

    0$ ./contrib/message-capture/message-capture-parser.py
    1{'filter_type': None, 'stop_hash': None, 'headers': [23, 42, 'foobar']}
    

    Options that I can think of right now:

    1. keep the PR as it is (being more tolerant if some internal data types are not matching, but potentially also covering bugs)
    2. change the all predicate to any to closely mimic the behaviour on master
    3. explicitly assert that all array elements have to be integers for the HASH_INT_VECTORS types (seems right, but could have unintended consequences that I am not aware of)

    laanwj commented at 12:24 pm on June 14, 2022:
    Thanks for checking! In that case, I like adding an explicit assertion best. It makes it immediately clear what the problem is.

    theStack commented at 12:35 pm on June 14, 2022:
    Okay, done. 👌 Didn’t check in detail, but I think this assert can’t be triggered by a malformed message-capture input, it would rather only point out that something is wrong internally (e.g. our deserialization routines), which is good.
  6. theStack force-pushed on Jun 14, 2022
  7. laanwj commented at 12:38 pm on June 14, 2022: member
    Code review ACK 42bbbba7c83d1e2baad18b4c6f05bad1358eb117
  8. MarcoFalke renamed this:
    message-capture-parser: fix out of bounds error for empty vectors
    [contrib] message-capture-parser: fix out of bounds error for empty vectors
    on Jun 14, 2022
  9. MarcoFalke merged this on Jun 14, 2022
  10. MarcoFalke closed this on Jun 14, 2022

  11. theStack deleted the branch on Jun 14, 2022
  12. sidhujag referenced this in commit 6fac38db8c on Jun 15, 2022
  13. theStack referenced this in commit 644772b9ef on Sep 4, 2022
  14. MarcoFalke referenced this in commit 3c5fb9691b on Sep 9, 2022
  15. sidhujag referenced this in commit e01a908af4 on Sep 11, 2022
  16. DrahtBot locked this on Jun 14, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me