doc: note the costs of fetching all pull requests #18382

pull vasild wants to merge 1 commits into bitcoin:master from vasild:note_fetch_all_cost changing 1 files +2 −2
  1. vasild commented at 1:29 PM on March 19, 2020: member

    Also mention that it is possible to fetch just one pull request.

  2. fanquake added the label Docs on Mar 19, 2020
  3. in doc/productivity.md:179 in 5d7a8176b6 outdated
     175 | @@ -176,7 +176,9 @@ When looking at other's pull requests, it may make sense to add the following se
     176 |          url = git@github.com:bitcoin/bitcoin.git
     177 |  ```
     178 |  
     179 | -This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
     180 | +This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. It will download ~800 MB (all PRs, including merged and closed ones) and will expand the disk usage from ~180 MB to ~1400 MB. Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
    


    kristapsk commented at 1:48 PM on March 19, 2020:

    Maybe it's better to not specify exact numbers and use something like "hundreds of megabytes" instead, as it will change in future?


    vasild commented at 7:48 AM on March 20, 2020:

    Done. Shortened to just "It will download and store on disk about 1 GB of data" - enough to avoid the surprise.

  4. MarcoFalke commented at 1:54 PM on March 19, 2020: member

    I tend to fetch only a single pull by just copying the URL in blue on the top right (e.g. for this pull https://github.com/vasild/bitcoin/tree/note_fetch_all_cost)

    Then, paste it into a terminal and replace the tree/ with a space:

    $ git fetch https://github.com/vasild/bitcoin/ note_fetch_all_cost
    
    remote: Enumerating objects: 14, done.
    remote: Counting objects: 100% (14/14), done.
    remote: Compressing objects: 100% (3/3), done.
    remote: Total 17 (delta 11), reused 13 (delta 11), pack-reused 3
    Unpacking objects: 100% (17/17), 10.37 KiB | 212.00 KiB/s, done.
    From https://github.com/vasild/bitcoin
     * branch                  note_fetch_all_cost -> FETCH_HEAD
    
  5. vasild commented at 2:28 PM on March 19, 2020: member

    Actually, isn't it an overkill to download all pull requests - opened, closed and merged (~1 GB)? Usually one wants to review or test one PR. @sipa you added this in https://github.com/bitcoin/bitcoin/commit/31444491f2#diff-e4de6d71249c78e229a61d6a03ac255aR422 almost 4 years ago when the volume of this was smaller. What about dropping this advice from the doc and replace it with instructions on fetching a single PR, as per the comment above and/or github help?

  6. MarcoFalke commented at 2:45 PM on March 19, 2020: member

    Some people might have a really large disk and review a lot of pull requests or want to archive everything that happens for other reasons, so it might be good to mention both: the "archival solution" and the quick fetch solution.

  7. in doc/productivity.md:181 in 5d7a8176b6 outdated
     175 | @@ -176,7 +176,9 @@ When looking at other's pull requests, it may make sense to add the following se
     176 |          url = git@github.com:bitcoin/bitcoin.git
     177 |  ```
     178 |  
     179 | -This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
     180 | +This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. It will download ~800 MB (all PRs, including merged and closed ones) and will expand the disk usage from ~180 MB to ~1400 MB. Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
     181 | +
     182 | +It is also possible to [fetch only a given pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/checking-out-pull-requests-locally#modifying-an-inactive-pull-request-locally).
    


    fjahr commented at 4:25 PM on March 19, 2020:

    I think referencing that link here is fragile. I would rather have the fetch example cli command.


    vasild commented at 7:49 AM on March 20, 2020:

    Agree, removed the link.

  8. fjahr commented at 4:29 PM on March 19, 2020: member

    Concept ACK

    I think this is good so people don't get surprised by the large amount of data. Agree with @kristapsk that there should not be any specific numbers though.

  9. vasild force-pushed on Mar 20, 2020
  10. theStack approved
  11. theStack commented at 9:54 PM on March 22, 2020: member

    ACK https://github.com/bitcoin/bitcoin/pull/18382/commits/5ebd8f6b84eeff3bc3da219aa7ce978d861db65d Thanks for adding those valuable informations/instructions :+1:

  12. promag commented at 12:35 AM on March 23, 2020: member

    FWIW I also tend to add a remote for each recurring contributor fork which also allows me to sneak peek what they are doing.

  13. laanwj commented at 5:44 PM on March 26, 2020: member

    This renders as:

    screenshot

    I don't think the "For example:"s should be in the code box.

  14. vasild force-pushed on Mar 26, 2020
  15. vasild commented at 7:35 PM on March 26, 2020: member

    Right, updated. It looks better now.

  16. sipa commented at 11:16 PM on March 26, 2020: member

    Doing some numbers myself:

    Cloning bitcoin/bitcoin downloads 138.8 MiB, resulting in a size on disk of .git of 146.8 MB. Starting with a fresh clone:

    • Fetching refs/pull/* downloads an additional 801.37 MiB, resulting in a size on disk of .git of 1405.4 MB.
    • Fetching refs/pull/*/head instead downloads an additional 797.09 MiB, resulting in a size on disk of .git of 1233.0 MB.

    I've generally never had a use for the refs/pull/*/merge branches (I prefer doing a merge with master myself if needed anyway, and that's rare), maybe it's reasonable to only suggest fetching the heads?

  17. vasild force-pushed on Mar 27, 2020
  18. vasild commented at 8:25 AM on March 27, 2020: member

    Those numbers more or less coincide with my observations. I agree with refs/pull/* vs refs/pull/*/head. Updated, render preview.

  19. laanwj commented at 1:55 PM on March 27, 2020: member

    Right, updated. It looks better now.

    Thanks, yes, looks much better now.

    ACK 61dd7e2ca7dcfe519023bc4882777d348d1f70fd

  20. laanwj added this to the milestone 0.20.0 on Mar 27, 2020
  21. in doc/productivity.md:202 in 61dd7e2ca7 outdated
     199 | +```
     200 | +
     201 | +for example:
     202 | +
     203 | +```
     204 | +git fetch https://github.com/glowang/bitcoin/ update_blocksonly_docs:docs_pr
    


    jonatack commented at 10:32 PM on March 27, 2020:

    Have you checked with this person about using their identity here?


    sipa commented at 10:38 PM on March 27, 2020:

    Ping @glowang.


    glowang commented at 10:56 PM on March 27, 2020:

    thank you for asking - this is fine with me!

  22. jonatack commented at 10:36 PM on March 27, 2020: member

    As there are many ways to do this, and a wide range of user computing configurations and needs, it never really made sense to me why these precise instructions (and now examples) would be provided. (I use a somewhat different gitconfig for this and know other contributors who do as well). Just a counterpoint.

  23. RandyMcMillan commented at 10:55 PM on March 27, 2020: contributor

    It may be useful to add a note about pulling from the most recent passing commit hash on Travis-ci.

  24. laanwj commented at 11:15 AM on March 28, 2020: member

    It may be useful to add a note about pulling from the most recent passing commit hash on Travis-ci.

    I don't think github allows fetching by commit hash. You can only fetch labels or branches. It would be awesomely convenient sometimes to retrieve an old version of a PR that has been force-pushed over, to compare, but I've never managed to do this.

  25. promag commented at 11:24 AM on March 28, 2020: member

    I don't think github allows fetching by commit hash. You can only fetch labels or branches.

    FWIW I also tend to add a remote for each recurring contributor fork which also allows me to sneak peek what they are doing. @laanwj I do git remote update laanwj && git reset --hard laanwj/<sha>

  26. theStack approved
  27. theStack commented at 12:35 PM on March 28, 2020: member

    re-ACK https://github.com/bitcoin/bitcoin/pull/18382/commits/61dd7e2ca7dcfe519023bc4882777d348d1f70fd Just checked that since my last ACK:

    • formatting has improved (the for example: string is now outside of the monospaced code block)
    • example .gitconfig section has been changed to only retreiving the heads
  28. theStack commented at 12:47 PM on March 28, 2020: member

    As there are many ways to do this, and a wide range of user computing configurations and needs, it never really made sense to me why these precise instructions (and now examples) would be provided. (I use a somewhat different gitconfig for this and know other contributors who do as well). Just a counterpoint.

    Personally speaking, the instructions on how to fetch all pull requests were very helpful when I started contributing/reviewing. It would be highly absurd to encourage new contributors to review and test code but then not including any instructions how to do so in terms of git(hub) workflow IMHO. I cannot follow the argumentation "there are many ways to do it, so better not even describe a single one of them".

  29. laanwj commented at 12:50 PM on March 28, 2020: member

    @laanwj I do git remote update laawnj && git reset --hard laawnj/<sha>

    You don't need the laanwj/ before the hash. But yes, that will work, but only for commits that are currently (or have been at some point when you've fetched) connected to a tag or branch. If you've missed the point they were part of a branch, you cannot fetch them anymore. You can however view those commits through the web interface.

    Personally speaking, the instructions on how to fetch all pull requests were very helpful when I started contributing/reviewing.

    Personally I think it's useful to have some tips for things people frequently want to do, but otherwise refer to existing documentation. I also understand @jonatack's point that we don't want to end up maintaining an alternative git documentation here.

  30. jonatack commented at 1:07 PM on March 28, 2020: member

    It would be highly absurd to encourage new contributors to review and test code but then not including any instructions how to do so in terms of git(hub) workflow IMHO. I cannot follow the argumentation "there are many ways to do it, so better not even describe a single one of them".

    I and others do describe these things more or less in detail in various articles to help onboard new people, a process I've been going through as well. Also, there are resources like https://bitcoincore.reviews to help onboard contributors -- and which I continue to learn from -- which everyone can contribute to and participate in.

  31. theStack commented at 3:32 PM on March 29, 2020: member

    It would be highly absurd to encourage new contributors to review and test code but then not including any instructions how to do so in terms of git(hub) workflow IMHO. I cannot follow the argumentation "there are many ways to do it, so better not even describe a single one of them".

    I and others do describe these things more or less in detail in various articles to help onboard new people, a process I've been going through as well. Also, there are resources like https://bitcoincore.reviews to help onboard contributors -- and which I continue to learn from -- which everyone can contribute to and participate in.

    Fair enough, there is indeed very good and deep introductory material out there (for me it was this article by Jimmy Song that got me started) and the PR Review Club is an absolutely awesome idea. I'd still without hesitation recommend to every newcomer to carefully read the included docs as early as possible (at the very least developer-notes.md and productivity.md), as common denominator of contributor-agreed best practices and tips, that are less likely to get outdated than articles maintained by single individuals. To sum up my view: as long as we have documents like productivity.md included in the repository, they should also include basics about how to fetch and checkout pull requests.

  32. in doc/productivity.md:179 in 61dd7e2ca7 outdated
     176 | +        fetch = +refs/pull/*/head:refs/remotes/upstream-pull/*
     177 |          url = git@github.com:bitcoin/bitcoin.git
     178 |  ```
     179 |  
     180 | -This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
     181 | +This will add an `upstream-pull` remote to your git repository, which can be fetched using `git fetch --all` or `git fetch upstream-pull`. It will download and store on disk about 1 GB of data (all PRs, including merged and closed ones). Afterwards, you can use `upstream-pull/NUMBER/head` in arguments to `git show`, `git checkout` and anywhere a commit id would be acceptable to see the changes from pull request NUMBER.
    


    jonatack commented at 3:51 PM on March 29, 2020:

    With respect to "about 1 GB of data", I'd suggest not stating an amount needs to be updated/maintained.


    vasild commented at 12:59 PM on March 31, 2020:

    Reworded to "quite a lot of data", its good enough.

  33. doc: note the costs of fetching all pull requests
    Also, update the example to skip downloading the merge commits when
    downloading all PRs.
    d695eb4c21
  34. vasild force-pushed on Mar 31, 2020
  35. vasild commented at 12:59 PM on March 31, 2020: member

    This PR caused more discussions than I anticipated plus some controversy.

    The initial reason I opened it is that I expected the upstream-pull remote to download only the currently opened PRs and the download to be less than 100-ish MB, so I wanted to "warn" others that it is more expensive than that, given that git does not display the total download size upfront.

    Updated the PR with just this basic info in one sentence (plus download only heads).

    Render preview

  36. MarcoFalke commented at 1:35 PM on April 2, 2020: member

    ACK d695eb4c2164fc8777dc014e1a30c014cf04982a

  37. laanwj removed this from the milestone 0.20.0 on Apr 2, 2020
  38. fanquake approved
  39. fanquake commented at 10:16 AM on April 3, 2020: member

    ACK d695eb4c2164fc8777dc014e1a30c014cf04982a

  40. fanquake merged this on Apr 3, 2020
  41. fanquake closed this on Apr 3, 2020

  42. sidhujag referenced this in commit 6393eef899 on Apr 3, 2020
  43. PastaPastaPasta referenced this in commit 9845d6b055 on Jun 27, 2021
  44. PastaPastaPasta referenced this in commit 241633be52 on Jun 28, 2021
  45. PastaPastaPasta referenced this in commit fd7e129bfa on Jun 29, 2021
  46. PastaPastaPasta referenced this in commit cfd0a7c4f2 on Jul 1, 2021
  47. PastaPastaPasta referenced this in commit c431185306 on Jul 1, 2021
  48. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 18:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me