Progressbar based on time-based estimation of transactions. #2310

pull sipa wants to merge 2 commits into bitcoin:master from sipa:progressbar changing 7 files +119 −53
  1. sipa commented at 9:34 PM on February 14, 2013: member

    Improvement upon #2294.

  2. sipa commented at 9:37 PM on February 14, 2013: member

    (added an own pull so I can try fixing the pulltester build issue)

  3. BitcoinPullTester commented at 10:00 PM on February 14, 2013: none

    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/69be30d7e64eba0c936d9221b73b2bc35bebc7e2 for binaries and test log.

  4. qubez commented at 10:14 AM on February 15, 2013: none

    Interesting, I had a idea about five days ago of how to improve estimation (which can be network-bound or cpu-bound):

    -Include with each checkpoint the size of the blockchain at that checkpoint (which will loosely correspond with CPU work as well as download bandwidth) -Include with the last checkpoint a multiplier to estimate additional CPU expense per block to do full verification after last checkpoint (i.e. blocks will take 3x more time to verify with signature checking). This can also be a guess of future network growth: "how many more transactions per block will we see than between the last two checkpoints".

    then: -Obtain estimated (or actual) blockchain size at the current block, from interpolation between known sizes at checkpoints, -Estimate the total size/cpu_work of the current blockchain by extrapolating beyond the last checkpoint to the current height, -Progress bar percentage is current block's estimated size, divided by total estimated size.

    progressbarformula

    More advanced than I could write: curve fitting. The blockchain growth looks pretty darn exponential!

    Checkpoint blockchain sizes: block, size (0, 293) 011111, 2569495 033333, 7820351 074000, 28029745 105000, 69840955 134444, 341554543 168000, 1032555361 193000, 2491771562 210000, 4099216740 216116, 4855459871

  5. gavinandresen commented at 4:41 PM on February 15, 2013: contributor

    nit: compiling with clang 3.3 on my Mac: src/qt/bitcoingui.cpp:534:9: warning: unused variable 'totalSecs' [-Wunused-variable] int totalSecs = genesisBlockDate.secsTo(currentDate);

    And running against a pre-0.8 datadirectory gets a crash, passing a NULL *pblockindex to GuessVerificationProgress from BitcoinGUI::setNumBlocks

  6. sipa commented at 4:54 PM on February 15, 2013: member

    updated

  7. schildbach commented at 8:41 PM on February 15, 2013: contributor

    Some feedback:

    There is a space (or LF) missing before "Transactions after this will not yet be visible."

    I'd prefer proper plurals. Using workarounds like "day(s)" are looking unprofessional. Easy fix: Drop to the next lower unit just before the switch from 2 to 1 would occur.

  8. qubez commented at 8:46 PM on February 15, 2013: none

    I ran the win32 pull-tester build without issue, but was taken aback by the new message. "six hours behind" looks like it's going to take six hours to do something.

    I don't know if this is the way forward as this communicates even less information. I would rather have an estimate of how long it's going to run before it's caught up. This would be more useful information to end users, they want to know how long until they can use Bitcoin.

  9. luke-jr commented at 8:47 PM on February 15, 2013: member

    @schildbach Qt will automatically change "day(s)" to "day" or "days" as appropriate.

  10. schildbach commented at 8:59 PM on February 15, 2013: contributor

    @luke-jr Hu? I'm almost sure I saw "day(s)" in the actual running software. I'm running Ubuntu Linux 12.10 in case this matters.

  11. laanwj commented at 9:42 PM on February 15, 2013: member

    This isn't based on the last version of my commits. I fixed the totalSecs already, along with some other changes such as the LF before the transaction message. @schildbach: in Qt that is avoided by adding the correct plurals in the english translation file, no need to hack around in the source code. @qubez if you know better just submit a pull request to improve it, I'm pretty darn sick of all the complaining, whatever we change to the progress bar (or something else) there is at least someone that complains that the old situation was better. That's why I was reluctant to touch it, or do much development on bitcoin at all anymore for that matter.

  12. sipa commented at 11:04 PM on February 15, 2013: member

    @laanwj In that case, pull my commits into your pullreq and I'll close this?

  13. laanwj commented at 7:54 AM on February 16, 2013: member

    Yep good idea

  14. schildbach commented at 10:40 AM on February 16, 2013: contributor

    I just retested this on latest sipa/progressbar and I am definately seeing the "(s)" in day(s) and hour(s). I'm trying to do a screenshot.

  15. schildbach commented at 10:45 AM on February 16, 2013: contributor

    Screenshot from 2013-02-16 11:40:50

  16. Change progress bar from block-based to time-based
    This is less confusing to most people, and doesn't rely on estimates
    of the total number of blocks received from other nodes.
    24cde0b7d1
  17. Transactions-based verification progress 9f2467ad62
  18. sipa commented at 1:53 PM on February 16, 2013: member

    @laanwj Updated to your latest pull's code (I think...) @qubez If you're keen on making graphs, can you compare with the method implemented in this pullreq? It's fairly similar to what you propose, except it uses transactions and not blockchain size, as transactions are already computed (so don't require extra checkpoints data)... Or, as said, feel free to write a patch yourself. @laanwj @schildbach I still see week(s) in the current code. You say that is fixed by adding it to the translations - that isn't done yet then?

  19. Diapolo commented at 2:37 PM on February 16, 2013: none

    @schildbach As long as the english translation master file is not updated in the correct way, you will see this generic string. Most of the time I update these file or laanwy does. This is a non-issue, really :).

  20. laanwj commented at 3:35 PM on February 16, 2013: member

    @sipa yes this is the last version, and indeed the translations still have to be updated

  21. schildbach commented at 1:09 AM on February 17, 2013: contributor

    There is more. Sometimes it says phrases like "and 0 day(s)". Will the whole phrase also automagically get cut by the translations?

  22. laanwj commented at 7:55 AM on February 17, 2013: member

    I'm pretty sure you cannot get 0 day(s) as under 2 days it shows hours, but it's possible to get 0 hour(s) if secs < 90_60 *but_ count < nTotalBlocks. I suppose something less than an hour could be displayed as an hour.

  23. schildbach commented at 9:30 AM on February 17, 2013: contributor

    @laanwj emphasis on the "and". Probably it was "and 0 hours", I just wanted to describe the scheme. Could in future as well happen with "and 0 years" if you decide to add centuries.

  24. laanwj commented at 9:36 AM on February 17, 2013: member

    Fyi the "and" has been gone for a while.

  25. schildbach commented at 11:43 AM on February 17, 2013: contributor

    @laanwj Ok thanks for fixing.

    With the current version, there is another problem: The progress bar does not appear to progress well. It started with about 100 week(s) behind, and the bar was nearly empty. Now I'm at 29 week(s) behind, and the bar is filled by perhaps 15%. I'd expect to be filled either at about 70% or at about 85%, depending on if the progress is relative to genesis or to app start.

    Actually I'm not convinced that a progress bar is a suitable metaphor for blockchain download at all, because a) download never finishes and b) the blockchain is not a chain.

  26. laanwj commented at 12:05 PM on February 17, 2013: member

    The progress bar was relative to app start a few versions ago, and that confused people, as they thought the download restarted every time.

    The code in this pull request displays the absolute progress, in verification cycles computed from the number of transactions, from the genesis block to now. This is as precise as it gets: if it is 50% full then this means there is 50% download/verification time still to go.

    Anyway, we had that discussion many times already. Believe me, it's a fruitless waste of time, better spent on things like payment protocols/workflows. Let's keep this thread restricted to concrete issues/showstoppers with the current code.

  27. schildbach commented at 2:31 PM on February 17, 2013: contributor

    @laanwj Please read my report. My arguement is that about 15% fits neither calculation. Genesis was 210 weeks ago, in my test case I had 29 weeks to go. In terms of time, I had 86% done, yet it showed only about 15%. In terms of transaction I cannot say, but if you look at https://blockchain.info/charts/n-transactions its pretty clear visually that the integral before "29 weeks ago" is more than 15% compared to the integral after.

  28. sipa commented at 2:37 PM on February 17, 2013: member

    @schildbach That is because signature verification is only enabled after the last checkpoint, so transaction after that are given much higher weight in the calculation (15 times more). Feel free to suggest a better factor (or heuristic for determining it), but for a test I did, this resulted in a very accurate measurement of processing time.

  29. qubez commented at 7:18 PM on February 17, 2013: none

    Signature verification gets "turned on" only for the last 5000 blocks - the problem @schildbach is describing is that most of the download time is downloading blocks 0-216116, the progress of which is figured as 50% by 108058, but the actual percentage of blockchain size/transactions downloaded and processed by then is only about 2% - (see area under curve in my graphic above).

    That's the problem which my big unacknowledged post describes a way to fix. Such a fix needs to mess with checkpoints.cpp; you're messing with checkpoints.cpp...

    This pull is two things: 1 - bar wording changed ("blocks remaining" -> "days behind") 2 - post-final-checkpoint "correction factor"

    "# 2" being here has made it a place to discuss the other missing code: 3 - correction of progress bar percentage to reflect actual download progress (size MB) instead of block number.

    If you had just changed the wording with this pull, then "lets also fix # 3" likely wouldn't be bikeshedded here (I searched instead of filing a new issue.)

    I figure I could do # 3 with one very long weighted sum, but I'm at the "build env, installed git, now..." part of being able to code/submit.

  30. sipa commented at 8:41 PM on February 17, 2013: member

    @qubez Using block chain size or transaction count shouldn't matter much - they are very correlated.

    What this pull does is change the progressbar to reflect "verification time progress". Effectively, it counts/estimates transactions, weighing those after the last checkpoint higher, as those require higher much more CPU.

    The factor used to scale post-checkpoint transactions should reflect the time factor (download + process + sigcheck) / (download + process). The factor that is currently hardcoded (15.0) was determined by running on some system, and indeed not taking download time into account. Doing so would probably result in a lower number. Feel free to do benchmarks so the factor can be improved.

    If I understand you correctly, you suggest that your method (using block chain size, and hardcode sizes in the checkpoints) will better account for download time? I'm not sure about that - assuming transactions have a constant size (which, on average, is true) and the post-checkpoint time factor is reasonably, the estimation implemented here should be very accurate.

  31. qubez commented at 9:29 PM on February 17, 2013: none

    I have no problem believing that sipafactor(tm) is correct for your system and for a reindex or import of blocks already downloaded. However:

    1: a. verification up to last checkpoint is not well multithreaded, vs b. signature verification is well multithreaded

    result: end user reindex on 1 core vs 16 core will have a very different "time factor" correction.

    and

    2: end user initial blockchain download may have p2p network speed as a limiting factor, not CPU, in which case block number + correction factor after checkpoint will be completely incorrect - the progress bar should be the percentage of blockchain data estimated to have been downloaded (where 100% is an estimate of the current total size).

    combined with:

    3: The progress bar is only useful for initial blockchain download - any other time, the progress bar is 99% when you restart Bitcoin the next day. Therefore, the progress bar should be accurate for initial blockchain download.

    Storing the known size of the blockchain at several points makes for better estimation that the "factor" kicking in at one point near the end of the download:

    sipapercent

    There undoubtedly is room for some kind of factor, since we won't be storing and can't retrieve the exact size of the blockchain after the last checkpoint. Instead of simply using the same calculated block size from the last checkpoint interval, estimating data after the last checkpoint could have a factor like yours where we guess both additional CPU cost and future growth of block size.

  32. sipa commented at 10:16 PM on February 17, 2013: member

    @qubez I think you misunderstand this patch, given the graph above. I do not have a constant time per block (with a different speed before/after the checkpoint) - it's a constant time per transaction (with a different speed before/after the checkpoint), and transactions have a pretty constant size on average, so it's proportional to the size in bytes of the chain.

    Also, I do not claim the current implementation is correct for everyone. That's not possible with a constant slowdown factor after the checkpoint. For it to be exact, this factor would depend on whether or not your downloading (or reindexing from disk), the degree of parallellism you have for script validation, and a few other things probably.

    What I do claim, is that given a good factor (specifcally for the user's system - abstracting from how that factor is obtained), it is accurate in the general case. For example, if download time dominates everything and local processing/verification can be ignored, setting the factor=1 should result in more or less the same progress you'd calculate.

  33. gmaxwell commented at 10:25 PM on February 17, 2013: contributor

    @qubez Please don't post graphs which are pure conjecture. If you're going to go through the trouble of making a graph please make it on an actual measurement. What you're proposing is actually just a less accurate version of what Sipa already implemented his change uses the transaction count which is very closely related to size and more strongly correlated with the work than size is).

  34. BitcoinPullTester commented at 12:34 AM on February 18, 2013: none

    Automatic sanity-testing: WARNING, see http://jenkins.bluematt.me/pull-tester/9f2467ad6241ce6cf0897ed30c676598d59441a7 for binaries and test log.

    This pull decreases total test coverage, please add unit tests to test all new code and help us add test cases for existing code. Coverage report can be found at http://jenkins.bluematt.me/pull-tester/9f2467ad6241ce6cf0897ed30c676598d59441a7/bitcoin/src/total.coverage/

  35. gavinandresen referenced this in commit 9dca7190f9 on Feb 22, 2013
  36. gavinandresen merged this on Feb 22, 2013
  37. gavinandresen closed this on Feb 22, 2013

  38. sipa deleted the branch on May 3, 2013
  39. laudney referenced this in commit 71ccb6d9ae on Mar 19, 2014
  40. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 18:16 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me