doc: BaseIndex sync behavior with empty datadir #22485

pull jamesob wants to merge 1 commits into bitcoin:master from jamesob:2021-07-index-doc changing 2 files +8 −0
  1. jamesob commented at 4:04 pm on July 18, 2021: member
    Make a note about a potentially confusing behavior with BaseIndex::m_synced; if the user starts bitcoind with an empty datadir and an index enabled, BaseIndex will consider itself synced (as a degenerate case). This affects how indices are built during IBD (relying solely on BlockConnected signals vs. using ThreadSync()).
  2. DrahtBot added the label UTXO Db and Indexes on Jul 18, 2021
  3. Bocoi5011 commented at 6:25 pm on July 18, 2021: none
    .
  4. DrahtBot commented at 0:04 am on July 19, 2021: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #25193 (indexes: Read the locator’s top block during init, allow interaction with reindex-chainstate by mzumsande)
    • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. bitcoin deleted a comment on Jul 19, 2021
  6. Jaysonmald approved
  7. in src/index/base.cpp:75 in 2b48b03f9b outdated
    65@@ -66,7 +66,16 @@ bool BaseIndex::Init()
    66     } else {
    67         m_best_block_index = m_chainstate->m_blockman.FindForkInGlobalIndex(active_chain, locator);
    68     }
    69+
    70+    // Note: this will latch to true immediately if the user starts up with an empty
    71+    // datadir and an index enabled. If this is the case, indexation will happen solely
    72+    // via `BlockConnected` signals until, possibly, the next restart.
    


    MarcoFalke commented at 8:52 am on July 19, 2021:
    What do you mean with “until the next restart”? Once the index is in sync with the active chain tip, it will stay that way, regardless of restarts. The only way it could get out of sync is when you disable the index on a restart, advance the tip, then enable it again. Am I missing something?

    jamesob commented at 9:17 am on July 19, 2021:

    This is contingent on how fast the particular indexing process is; you can imagine that if a given index takes (for the sake of argument) a minute per block to build, if we get 1000 blocks connected and then the user shuts down immediately, the validationinterface queue may not clear* and we will start up out of sync after having been “synced” in the degenerate way. Does this make sense?

    *I know we wait for this queue to clear during a graceful shutdown, but we can’t assume this.


    ryanofsky commented at 7:27 pm on November 2, 2021:

    In commit “index: doc/log BaseIndex sync behavior with empty datadir” (2b48b03f9b6fd3126f957f328dd559fadceeaa7f)

    I also found this confusing and would suggest just saying “the index will get updated and stay in sync through BlockConnected notifications”


    jamesob commented at 2:15 pm on January 11, 2022:
    My issue with your suggested comment is that that isn’t a complete picture of what’s going on - as described above, indexation will only happen via BlockConnectected notifications until the next restart. If we have a hard shutoff that doesn’t clear the validation queue, the index will sync sequentially in ThreadSync(), which makes that comment somewhat misleading.

    ryanofsky commented at 7:49 pm on March 21, 2022:
    This is fine. I think I’m just agreeing with Marco that the comment seemed a little strange because I didn’t know what it might be implying. It seems obvious there will be no BlockConnected signals if the process is dead.
  8. tryphe commented at 10:14 am on July 30, 2021: contributor

    Concept ACK

    Any easy way to confirm this behavior, regarding BlockConnected signals vs. using ThreadSync()? Would be willing to test it out.

  9. jamesob commented at 1:54 pm on October 29, 2021: member
    Worth merging? Anything left to do here? Should I close?
  10. ryanofsky approved
  11. ryanofsky commented at 7:38 pm on November 2, 2021: contributor

    Code review ACK 2b48b03f9b6fd3126f957f328dd559fadceeaa7f. Concept ~0. I’d be curious to know more about what is confusing without this log message. It seems pretty straightforward that if there are no blocks, then there is nothing for the sync thread to do, and the index is in sync. Like if pieces_of_popcorn == 0 then popcorn_eaten == true. So I’m not sure why this is confusing or degenerate, etc.

    I guess I like the new comments, but I’m just unsure what the log message adds.

  12. jamesob closed this on Nov 30, 2021

  13. jamesob commented at 8:58 pm on January 10, 2022: member
    Reopening as I think this is still a useful clarification when trying to remind myself of how indexation works.
  14. jamesob reopened this on Jan 10, 2022

  15. MarcoFalke commented at 8:59 am on January 11, 2022: member
    It is recommended to address the feedback comments in some way before merge
  16. jamesob commented at 2:16 pm on January 11, 2022: member

    I guess I like the new comments, but I’m just unsure what the log message adds.

    It’s obviously not mandatory, but I found the whole indexation process a little confusing and so I figured the more hints the better?

  17. in src/index/base.cpp:78 in 2b48b03f9b outdated
    70+    // Note: this will latch to true immediately if the user starts up with an empty
    71+    // datadir and an index enabled. If this is the case, indexation will happen solely
    72+    // via `BlockConnected` signals until, possibly, the next restart.
    73     m_synced = m_best_block_index.load() == active_chain.Tip();
    74+    if (!m_best_block_index && !active_chain.Tip()) {
    75+        LogPrintf("%s: %s considered fully synced due to empty chain\n",
    


    mzumsande commented at 9:33 pm on March 20, 2022:
    I also think that there shouldn’t be a LogPrintf here: At this point in time, we only use this information for skipping the prune violation checks, we aren’t “officially” synced. For signaling to a user that an index is synced, we already have a LogPrintf in ThreadSync: It currently says “[indexname] is enabled” in case of an empty chain, and “[indexname] is enabled at height X” otherwise - we could add an “due to an empty chain” here. But two messages for the same thing would only be confusing in my opinion.

    ryanofsky commented at 8:00 pm on March 21, 2022:

    @mzumsande, maybe you could be specific about what is confusing? Do you only think this is redundant, or do you think there is something misleading or contradictory about these messages?

    If I were adding a log message here I’d probably say “Chain height is 0. Index is synced.” but I don’t know if that helps.

    Message in the PR seems basically ok to me. (I would just drop “fully” because no index will be fully synced until the last bitcoin block is produced)


    mzumsande commented at 8:17 pm on March 21, 2022:

    The log is

    0Init: txindex considered fully synced due to empty chain
    1txindex thread start
    2txindex is enabled
    3txindex thread exit
    

    (potentially x3 for each index). I’d find it a bit confusing why we’d need to start a thread to sync an index right after saying that the index is fully synced. So I think it could be simpler:

    0txindex thread start
    1txindex is enabled due to an empty chain
    2txindex thread exit
    

    ryanofsky commented at 9:29 pm on March 21, 2022:

    I’d find it a bit confusing why we’d need to start a thread to sync an index right after saying that the index is fully synced.

    I see. You don’t think the log message gives a misleading impression. You think the log message gives a correct impression that the code is doing something stupid, and it’s confusing why the stupid thing would be happening. Fair enough!


    fjahr commented at 11:07 pm on April 17, 2022:
    I agree with @mzumsande that this is unnecessary. It just doesn’t seem like the right place for this kind of information. The information that the code is something stupid seems fine as just a code comment while the logs are also viewed by non-developers who might be confused.

    jamesob commented at 7:31 pm on July 18, 2022:
    I’ve removed this log; now the entire change is doc-only.

    mzumsande commented at 2:24 pm on July 21, 2022:
    Doesn’t look like anything has changed, did you push the wrong commit?

    jamesob commented at 2:31 pm on July 21, 2022:
    Oops, thanks @mzumsande.
  18. mzumsande commented at 9:38 pm on March 20, 2022: contributor
    ACK to the comment, I think that the additional log message would be redundant (see below).
  19. ryanofsky approved
  20. ryanofsky commented at 8:04 pm on March 21, 2022: contributor
    Code review ACK 2b48b03f9b6fd3126f957f328dd559fadceeaa7f. This seems fine and generally better, despite confusion and nitpicking
  21. fjahr commented at 11:51 pm on April 17, 2022: contributor

    utACK 2b48b03f9b6fd3126f957f328dd559fadceeaa7f

    I think this can be merged as is but I would prefer it without the log message (see my comment).

    I think one more place where a note on this could be helpful for users is in the getindexinfo RPC, something like this: https://github.com/fjahr/bitcoin/commit/255f0943f7be76e534c9e66f82d1df7fee8f81bb. Feel free to add this in here but I can also open a follow-up.

  22. bitcoin deleted a comment on Apr 18, 2022
  23. jamesob force-pushed on Jul 18, 2022
  24. jamesob renamed this:
    index: doc/log BaseIndex sync behavior with empty datadir
    doc: BaseIndex sync behavior with empty datadir
    on Jul 18, 2022
  25. jamesob force-pushed on Jul 21, 2022
  26. doc: BaseIndex sync behavior with empty datadir
    Make a note about a potentially confusing behavior with `BaseIndex::m_synced`;
    if the user starts bitcoind with an empty datadir and an index enabled,
    BaseIndex will consider itself synced (as a degenerate case). This affects
    how indices are built during IBD (relying solely on BlockConnected signals vs.
    using ThreadSync()).
    11780f29e7
  27. jamesob force-pushed on Jul 21, 2022
  28. mzumsande commented at 5:45 pm on July 21, 2022: contributor
    ACK 11780f29e7c3f50fb7717fc98950ece9385d314b
  29. MarcoFalke merged this on Jul 21, 2022
  30. MarcoFalke closed this on Jul 21, 2022

  31. sidhujag referenced this in commit 55e04eb011 on Jul 22, 2022
  32. bitcoin locked this on Jul 21, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me