assumeutxo, rpc: Improve EOF error when reading snapshot metadata in loadtxoutset #28670

pull pablomartin4btc wants to merge 2 commits into bitcoin:master from pablomartin4btc:assumeutxo-improve-EOF-error-when-reading-snapshot-also-on-autofile changing 2 files +18 −3
  1. pablomartin4btc commented at 3:59 am on October 18, 2023: member

    Improve error messaging in loadtxoutset to provide a more user-friendly error message when the utxo snapshot file size is insufficient to contain the necessary metadata structure (40 bytes).

    • Error message output before the change (current master):
    0./src/bitcoin-cli -datadir=${AU_DATADIR} loadtxoutset ${AU_DATADIR}/utxo-111.dat
    1error code: -1
    2error message:
    3AutoFile::read: end of file: iostream error
    
    • Error message output after the change:
    0./src/bitcoin-cli -datadir=${AU_DATADIR} loadtxoutset ${AU_DATADIR}/utxo-111.dat
    1error code: -32603
    2error message:
    3Unable to load UTXO snapshot, couldn't read snapshot metadata from file /tmp/.test_utxo_3/utxo-111.dat.
    4The file may be corrupted or it was crafted in an incompatible format.
    

    Original strategy was to check the size of the file before de-serialisation and the current strategy was presented at that time as an alternative which has been preferred by reviewers and it looks simpler since streams source code files are no longer touched.

    This PR needs to be backported to 26.x

  2. DrahtBot commented at 3:59 am on October 18, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK furszy
    Approach ACK hernanmarino
    Stale ACK fjahr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29612 (rpc: Optimize serialization and enhance metadata of dumptxoutset output by fjahr)
    • #27432 (contrib: add tool to convert compact-serialized UTXO set to SQLite database by theStack)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. pablomartin4btc force-pushed on Oct 18, 2023
  4. in src/rpc/blockchain.cpp:2779 in acb4d35e22 outdated
    2748@@ -2749,6 +2749,11 @@ static RPCHelpMan loadtxoutset()
    2749     }
    2750 
    2751     SnapshotMetadata metadata;
    2752+    // Check if the file size is at least as large as the expected size of SnapshotMetadata
    2753+    if (afile.size() < sizeof(SnapshotMetadata)) {
    2754+        throw JSONRPCError(RPC_INTERNAL_ERROR, strprintf("Unable to load UTXO snapshot, "
    


    kevkevinpal commented at 0:29 am on October 20, 2023:
    nit: should we specify that the file size is too small? A user might think their file is too large with this error message

    kevkevinpal commented at 0:30 am on October 20, 2023:
    just saw your alternative, that is preferred for me

    pablomartin4btc commented at 8:03 pm on October 20, 2023:
    My intention with the code change alternative was to avoid touching both stream files, header and .cpp, but sure, I could take the error message from there and use it here. Regarding the error message output, I’m open to suggestions there, I could specify that the file’s too small or the exact minimum amount of bytes that are required, or even ask to check the documentation, which will need to be updated.
  5. DrahtBot added the label CI failed on Dec 17, 2023
  6. pablomartin4btc force-pushed on Jan 8, 2024
  7. pablomartin4btc force-pushed on Jan 8, 2024
  8. pablomartin4btc commented at 11:38 pm on January 8, 2024: member
    Rebased + fix correct usage of utf8string() instead of u8string().
  9. DrahtBot removed the label CI failed on Jan 8, 2024
  10. furszy commented at 2:33 pm on January 9, 2024: member

    Would be good to add test coverage for this error.

    Also, the “file size being too small” error message isn’t really descriptive and it does not provide further guidance to solve the problem. The file could be corrupt or could had been crafted erroneously. I would go simpler for now and change “file size being too small” for “Unable to load UTXO snapshot, the file %s cannot be parsed.”

    I see your alternative solution better. More comprehensive.

  11. DrahtBot added the label CI failed on Jan 17, 2024
  12. pablomartin4btc force-pushed on Jan 17, 2024
  13. pablomartin4btc force-pushed on Jan 17, 2024
  14. pablomartin4btc force-pushed on Jan 17, 2024
  15. pablomartin4btc commented at 6:21 pm on January 17, 2024: member

    Updates:

    • Switched the logic strategy to the previously presented alternative due to preference of reviewers @kevkevinpal & @furszy (thanks!);
    • Added test coverage
  16. pablomartin4btc force-pushed on Jan 17, 2024
  17. DrahtBot removed the label CI failed on Jan 17, 2024
  18. in src/rpc/blockchain.cpp:2756 in 2b6d44a821 outdated
    2752+        afile >> metadata;
    2753+    } catch (const std::exception& e) {
    2754+        // Handle any exception, which includes reaching the end of the file e.g. file size < sizeOf(metadata)
    2755+        throw JSONRPCError(RPC_INTERNAL_ERROR, strprintf("Unable to load UTXO snapshot, "
    2756+            "couldn't read snapshot metadata from file %s.\nThis error is usually "
    2757+            "due to the file size being too small.\n%s", path.utf8string(), e.what()));
    


    furszy commented at 8:10 pm on January 17, 2024:
    I don’t to see how this part of the message would be helpful to users. “file size being too small” does not tell me anything. Would be better to say that the file is corrupt and/or was crafted in an incompatible format. Also, no need to add the internal exception message to the user facing error.

    pablomartin4btc commented at 8:24 pm on January 17, 2024:
    Makes sense, thanks!
  19. pablomartin4btc force-pushed on Jan 17, 2024
  20. pablomartin4btc commented at 11:47 pm on January 17, 2024: member
    Addressed @furszy’s feedback.
  21. DrahtBot added the label CI failed on Jan 18, 2024
  22. DrahtBot commented at 0:30 am on January 18, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/20595100031

  23. pablomartin4btc force-pushed on Jan 18, 2024
  24. DrahtBot removed the label CI failed on Jan 18, 2024
  25. in src/rpc/blockchain.cpp:2752 in 50c7a769bb outdated
    2745@@ -2746,7 +2746,15 @@ static RPCHelpMan loadtxoutset()
    2746     }
    2747 
    2748     SnapshotMetadata metadata;
    2749-    afile >> metadata;
    2750+    try {
    2751+        // Read data from the file
    2752+        afile >> metadata;
    2753+    } catch (const std::exception& e) {
    


    hebasto commented at 4:39 pm on January 18, 2024:

    To fix the MSVC CI job, suggesting:

    0    } catch (const std::exception&) {
    

    pablomartin4btc commented at 5:21 pm on January 18, 2024:
    done, thanks!
  26. pablomartin4btc force-pushed on Jan 18, 2024
  27. pablomartin4btc commented at 5:23 pm on January 18, 2024: member
    Addressed @hebasto’s suggestion in order to fix the MSVC CI job.
  28. willcl-ark added the label Docs on Jan 24, 2024
  29. willcl-ark added the label RPC/REST/ZMQ on Jan 24, 2024
  30. hernanmarino approved
  31. hernanmarino commented at 10:23 pm on February 13, 2024: contributor
    Approach ACK
  32. fjahr commented at 11:33 pm on March 6, 2024: contributor
    Code review ACK 53fbf19e95a3d5083eaf033378bb8d0e87bdc6b2
  33. DrahtBot requested review from hernanmarino on Mar 6, 2024
  34. DrahtBot added the label CI failed on Mar 13, 2024
  35. DrahtBot commented at 8:12 pm on March 13, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/20626357772

  36. pablomartin4btc commented at 0:23 am on April 16, 2024: member

    Updates:

    • I think lint CI failure is unrelated with the code change, it was running successful before started failing with no code change, run ./test/lint/lint-python.py locally with no failure.
  37. furszy commented at 1:23 am on April 16, 2024: member
    The CI failure is due a merge conflict. Need to rebase the PR. Some of the code you are using here is not in master anymore.
  38. pablomartin4btc force-pushed on Apr 16, 2024
  39. pablomartin4btc force-pushed on Apr 16, 2024
  40. pablomartin4btc force-pushed on Apr 16, 2024
  41. pablomartin4btc commented at 2:58 am on April 16, 2024: member
    Rebased.
  42. DrahtBot removed the label CI failed on Apr 16, 2024
  43. in test/functional/feature_assumeutxo.py:117 in 521de52c75 outdated
    113@@ -114,6 +114,13 @@ def expected_error(log_msg="", rpc_details=""):
    114                 f.write(valid_snapshot_contents[(32 + 8 + offset + len(content)):])
    115             expected_error(log_msg=f"[snapshot] bad snapshot content hash: expected a4bf3407ccb2cc0145c49ebba8fa91199f8a3903daf0883875941497d2493c27, got {wrong_hash}")
    116 
    117+        self.log.info("  - snapshot file with invalid size, too small")
    


    fjahr commented at 7:32 am on April 16, 2024:
    nit: I think this info log could be improved if you retouch. Testing the file size is not really the goal, it’s about hitting the metadata parsing error that was added, right? So I would suggest to call it something like “Inablity to parse metadata fails early” or so.

    pablomartin4btc commented at 2:30 pm on April 16, 2024:
    Thanks! I agree, I’ll update it if I have to retouch it and if not, could be done in a follow-up.

    pablomartin4btc commented at 2:19 am on May 17, 2024:
    done!
  44. fjahr commented at 7:34 am on April 16, 2024: contributor
    re-ACK 521de52c751a4539333bb2f69a07c0f1b4f496de
  45. in src/rpc/blockchain.cpp:2779 in 40e45a7635 outdated
    2775+    try {
    2776+        // Read data from the file
    2777+        afile >> metadata;
    2778+    } catch (const std::exception&) {
    2779+        // Handle any exception, which includes reaching the end of the file e.g. file size < sizeOf(metadata)
    2780+        throw JSONRPCError(RPC_INTERNAL_ERROR, strprintf("Unable to load UTXO snapshot, "
    


    furszy commented at 7:30 pm on May 10, 2024:

    Shouldn’t use RPC_INTERNAL_ERROR. As per it description: “this error should only be used for genuine errors in bitcoind (for example datadir corruption)”

    Maybe use RPC_DESERIALIZATION_ERROR or RPC_MISC_ERROR.


    pablomartin4btc commented at 2:15 am on May 11, 2024:
    It makes sense, I’ll fix it, thanks!

    pablomartin4btc commented at 2:18 am on May 17, 2024:
    done!
  46. furszy commented at 7:31 pm on May 10, 2024: member
    utACK 521de52c751
  47. assumeutxo: improve EOF error when reading snapshot
    In the RPC command `loadtxoutset`, catch possible errors while
    reading snapshot metadata, most commonly due to the structure size
    being shorter than the expected one (40 bytes) or the snapshot file
    may be corrupted or crafted in an incompatible format.
    0a557c4178
  48. test, assumeutxo: add coverage for loadtxoutset EOF error
    Add a test to cover scenarios where an EOF error is raised during
    the reading of invalid snapshot metadata (40 bytes file size min)
    or a corrupted file with an invalid format.
    4a2df05e8c
  49. pablomartin4btc force-pushed on May 17, 2024
  50. pablomartin4btc commented at 2:44 am on May 17, 2024: member

    Updates:

    • Addressed feedback from @fjahr by amending the logging info description.
    • Addressed feedback from @furszy by using a better RPC error code (RPC_DESERIALIZATION_ERROR instead of RPC_INTERNAL_ERROR).
  51. in src/rpc/blockchain.cpp:2788 in 0a557c4178 outdated
    2784+        afile >> metadata;
    2785+    } catch (const std::exception&) {
    2786+        // Handle any exception, which includes reaching the end of the file e.g. file size < sizeOf(metadata)
    2787+        throw JSONRPCError(RPC_DESERIALIZATION_ERROR, strprintf("Unable to load UTXO snapshot, "
    2788+            "couldn't read snapshot metadata from file %s.\nThe file may be corrupted "
    2789+            "or it was crafted in an incompatible format.", path.utf8string()));
    


    furszy commented at 1:02 pm on May 20, 2024:
    nit: should use fs::PathToString(path)

    maflcko commented at 1:10 pm on May 20, 2024:

    PathToString

    That is only for internal use. However, the result here will be returned on RPC, so the dev-notes apply:

    0- Use `fs::path::u8string()`/`fs::path::utf8string()` and `fs::u8path()` functions when converting path
    1  to JSON strings, not `fs::PathToString` and `fs::PathFromString`
    2
    3  - *Rationale*: JSON strings are Unicode strings, not byte strings, and
    4    RFC8259 requires JSON to be encoded as UTF-8.
    
  52. furszy commented at 1:03 pm on May 20, 2024: member
    utACK 4a2df05e
  53. DrahtBot requested review from fjahr on May 20, 2024
  54. fjahr commented at 12:29 pm on May 21, 2024: contributor
    Hm, unfortunately, I think this PR was made redundant by #29612 recently. After the latest feedback there which I addressed just 3 weeks ago, we are enhancing the assumeutxo metadata there significantly and this means all of the lines changed here are touched there as well and I don’t see much of an overlap either that would let me use the code here.
  55. pablomartin4btc commented at 2:15 pm on May 21, 2024: member

    Hm, unfortunately, I think this PR was made redundant by #29612 recently. After the latest feedback there which I addressed just 3 weeks ago, we are enhancing the assumeutxo metadata there significantly and this means all of the lines changed here are touched there as well and I don’t see much of an overlap either that would let me use the code here.

    No prob, thanks for letting me know, I’ll review that PR.

  56. pablomartin4btc closed this on May 21, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me