scripts: In linearize, search for next position of magic bytes rather than fail #16802

takinbo wants to merge 1 commit into bitcoin:master from takinbo:master changing 1 file +5 −2
  1. takinbo commented at 7:26 am on September 4, 2019: contributor

    When using the linearize-data.py contrib script to export block data, there are edge cases where the script fails with an Invalid magic: 00000000 error. This error occurs due to the presence of padding bytes that occasionally appear between consecutive blocks in the block data file.

    There’s an ongoing conversation about this in #14986. sipa also acknowledged in #5028 that it is a bug. Fortunately, this is not an issue in Bitcoin Core itself, which handles this type of situation gracefully, so no fix is required there.

    This PR improves how the script handles these “invalid magic bytes”. Rather than failing, the patched script searches for the next occurrence of the magic bytes and starts reading the block from there.

  2. laanwj commented at 9:03 am on September 4, 2019: member
    Concept ACK, I think adding the same robustness with regard to block positions to the linearize script makes sense.
  3. DrahtBot added the label Scripts and tools on Sep 4, 2019
  4. in contrib/linearize/linearize-data.py:219 in 866014f809 outdated
    @@ -213,8 +213,8 @@ def run(self):
     
                 inMagic = inhdr[:4]
                 if (inMagic != self.settings['netmagic']):
    -                print("Invalid magic: " + inMagic.hex())
    -                return
    +                self.inF.seek(-7, os.SEEK_CUR)
    


    promag commented at 10:59 pm on September 15, 2019:
    Care to add a comment here?

    takinbo commented at 11:45 pm on September 17, 2019:

    Certainly. When searching for the start of the next block within the block file, the script reads 8 bytes: the first four should be the magic bytes and the next four the length of the block. Rather than quitting if the magic bytes are not found, this patch rewinds the file cursor 7 bytes, so the next 8-byte read starts one byte past the start of the previous one. It repeats this until it finds the magic bytes or reaches the end of the file.

    There are rare occasions when there is a gap between blocks within the block file. This patch allows the script to compensate by finding the position to begin reading the next block while skipping the gap.
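
The rewind-and-retry search described above can be sketched in isolation. This is a minimal illustration and not the code from linearize-data.py: the function name `iter_blocks`, the helper `record`, and the in-memory sample file are all hypothetical, and the mainnet magic value is used only as an example.

```python
import io
import os
import struct

# Mainnet magic bytes, used here purely as an illustrative value.
MAGIC = bytes.fromhex("f9beb4d9")


def iter_blocks(f, magic=MAGIC):
    """Yield (offset, payload) for each record, skipping gap bytes between records.

    Hypothetical helper: each record is assumed to be
    magic (4 bytes) + little-endian length (4 bytes) + payload.
    """
    while True:
        hdr = f.read(8)
        if len(hdr) < 8:
            return  # end of file (or fewer than 8 bytes left)
        if hdr[:4] != magic:
            # Rewind 7 of the 8 bytes just read, so the next read
            # starts one byte after the start of this one.
            f.seek(-7, os.SEEK_CUR)
            continue
        (length,) = struct.unpack("<I", hdr[4:])
        offset = f.tell() - 8
        payload = f.read(length)
        if len(payload) < length:
            return  # truncated record
        yield offset, payload


def record(payload):
    """Build one sample record (hypothetical test data, not a real block)."""
    return MAGIC + struct.pack("<I", len(payload)) + payload


# Two records separated by a 5-byte gap of zero padding.
data = record(b"aaa") + b"\x00" * 5 + record(b"bbbb")
blocks = list(iter_blocks(io.BytesIO(data)))
```

Stepping forward one byte at a time is simple rather than fast; since gaps are rare and short, that trade-off seems reasonable for a contrib script.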


    promag commented at 0:19 am on September 18, 2019:
    I mean in the code, a brief comment.

    takinbo commented at 1:15 am on September 18, 2019:
    Right. Just added one now.
  5. promag commented at 10:59 pm on September 15, 2019: member
    Concept ACK, change LGTM.
  6. fanquake added the label Waiting for author on Sep 16, 2019
  7. fanquake removed the label Waiting for author on Sep 18, 2019
  8. luke-jr commented at 11:03 pm on September 20, 2019: member
    Concept ACK
  9. laanwj commented at 12:33 pm on September 30, 2019: member
    ACK after squash ACK 3284e6c09a84e9557ec72723ad636053d3ef7122
  10. laanwj added the label Waiting for author on Sep 30, 2019
  11. scripts: search for next position of magic bytes rather than fail
    document seek method for next position of magic bytes
    3284e6c09a
  12. takinbo force-pushed on Sep 30, 2019
  13. laanwj renamed this:
    scripts: search for next position of magic bytes rather than fail
    scripts: In linearize, search for next position of magic bytes rather than fail
    on Oct 2, 2019
  14. laanwj added the label Feature on Oct 2, 2019
  15. takinbo commented at 9:31 pm on October 7, 2019: contributor
    I have squashed the commits into one. Is there still anything I need to do on this PR?
  16. fanquake removed the label Waiting for author on Oct 7, 2019
  17. laanwj referenced this in commit df50fd194f on Oct 8, 2019
  18. laanwj merged this on Oct 8, 2019
  19. laanwj closed this on Oct 8, 2019

  20. sidhujag referenced this in commit 2624540adc on Oct 8, 2019
  21. decryp2kanon referenced this in commit 5782afac28 on May 18, 2020
  22. decryp2kanon referenced this in commit 6ab41d0142 on May 18, 2020
  23. decryp2kanon referenced this in commit bde07331d8 on May 18, 2020
  24. decryp2kanon referenced this in commit 7d062f2317 on Jun 14, 2020
  25. decryp2kanon referenced this in commit c4e813bf09 on Jun 14, 2020
  26. PastaPastaPasta referenced this in commit 2e6335c475 on Sep 11, 2021
  27. PastaPastaPasta referenced this in commit 158ca47c3c on Sep 11, 2021
  28. PastaPastaPasta referenced this in commit c928eae616 on Sep 12, 2021
  29. PastaPastaPasta referenced this in commit debed8d819 on Sep 12, 2021
  30. PastaPastaPasta referenced this in commit 0ab338a7bd on Sep 12, 2021
  31. PastaPastaPasta referenced this in commit e5c7de4e87 on Sep 14, 2021
  32. PastaPastaPasta referenced this in commit c9f2068cde on Sep 14, 2021
  33. PastaPastaPasta referenced this in commit bba59cabc3 on Sep 15, 2021
  34. DrahtBot locked this on Dec 16, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-06 16:12 UTC
