script: Handle request exception and check svg image in getcoins.py #24180

pull sh15h4nk wants to merge 1 commits into bitcoin:master from sh15h4nk:master changing 1 files +18 −16
  1. sh15h4nk commented at 6:32 PM on January 27, 2022: contributor

    Improvement to getcoins.py:

    • handled request excecptions
    • added limit handler to SVG size
    • restricted format
    • added caption to captcha (From the url)

    Fixes https://github.com/bitcoin/bitcoin/issues/24119

  2. sh15h4nk commented at 6:34 PM on January 27, 2022: contributor

    hey @prayank23 please review the PR

  3. in contrib/signet/getcoins.py:142 in 11984ddd8e outdated
     141 | +        raise SystemExit(f"No captcha found URL {args.captcha}")
     142 |  
     143 |      # Convert SVG image to PPM, and load it
     144 |      try:
     145 | -        rv = subprocess.run([args.imagemagick, '-', '-depth', '8', 'ppm:-'], input=res.content, check=True, capture_output=True)
     146 | +        rv = subprocess.run([args.imagemagick, '-', '-depth', '8', 'ppm:-'], input=content, check=True, capture_output=True)
    


    laanwj commented at 6:42 PM on January 27, 2022:

    FWIW, if you want to allow only svgs you can also replace the first '-' with 'svg:-'. This makes imagemagick check the format and would avoid adding two dependencies (or at least one).


    sh15h4nk commented at 4:22 AM on January 30, 2022:

    yes, okay

  4. laanwj added the label Scripts and tools on Jan 27, 2022
  5. in contrib/signet/getcoins.py:132 in 11984ddd8e outdated
     130 | +        raise SystemExit(f"Unexpected error when contacting faucet: {e}")
     131 | +
     132 | +    # Size limitation
     133 | +    svg = xml.etree.ElementTree.fromstring(res.content)
     134 | +    if (svg.attrib.get('width') != '150' or svg.attrib.get('height') != '50'):
     135 | +        raise SystemExit("Captcha dimentions doesn't satisify")
    


    laanwj commented at 6:47 PM on January 27, 2022:

    This is not correct English, "Captcha size doesn't match expected 150x50" maybe


    sh15h4nk commented at 4:23 AM on January 30, 2022:

    Okay, my bad!

  6. laanwj commented at 6:50 PM on January 27, 2022: member

    Concept ACK. I think a dependency on etree xml is acceptable as it is commonly installed (or maybe even part of Python, i don't know for sure), but would prefer to avoid bleach.

  7. in contrib/signet/getcoins.py:131 in 11984ddd8e outdated
     129 | +    except requests.exceptions.RequestException as e:
     130 | +        raise SystemExit(f"Unexpected error when contacting faucet: {e}")
     131 | +
     132 | +    # Size limitation
     133 | +    svg = xml.etree.ElementTree.fromstring(res.content)
     134 | +    if (svg.attrib.get('width') != '150' or svg.attrib.get('height') != '50'):
    


    laanwj commented at 6:54 PM on January 27, 2022:

    Nit: no outer () needed for Python if


    sh15h4nk commented at 4:25 AM on January 30, 2022:

    okay

  8. ghost commented at 9:49 PM on January 27, 2022: none

    hey @prayank23 please review the PR

    Concept ACK

    Welcome as a new contributor 🥂

    Agree about dependency and other things suggested by @laanwj I won't be able to review or test for a few days. Took a break. Hopefully this PR gets merged in a week or I will review and test next week.

  9. sh15h4nk commented at 4:42 AM on January 30, 2022: contributor

    would prefer to avoid bleach

    In order to sanitize html content we need some or the other modules such as html-sanitizer, bs4, lxml.html, xml.html, etc. All these modules are not commonly installed. so I have choosen bleach because its has some good set of options. If you are familiar or heard of any other html content sanitizing module please mention, I will play with it for a while and update it. Thank you.

  10. sh15h4nk commented at 4:54 AM on January 30, 2022: contributor

    hey @laanwj, I have updated the code and resolved the issues, please have a look.

  11. luke-jr commented at 3:29 AM on January 31, 2022: member

    Just an anecdotal metric: My system has bleach available without any additional dependencies, but not html-sanitizer, bs4, or xml.html. On the other hand, lxml is already installed for KDE's Breeze icons, Inkscape, lv2 (an audio framework used by Audacity), and a few Gentoo development tools.

  12. sh15h4nk requested review from laanwj on Jan 31, 2022
  13. ghost commented at 5:39 PM on February 3, 2022: none
    1. You could have used a new branch in your fork repo for this pull request as mentioned in https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#contributor-workflow

    2. Still not sure about error message which was corrected based on #24180 (review). It looks better but size and dimension have different meaning.

    I think a dependency on etree xml is acceptable as it is commonly installed (or maybe even part of Python, i don't know for sure), but would prefer to avoid bleach.

    laanwj

    Just an anecdotal metric: My system has bleach available without any additional dependencies, but not html-sanitizer, bs4, or xml.html. On the other hand, lxml is already installed for KDE's Breeze icons, Inkscape, lv2 (an audio framework used by Audacity), and a few Gentoo development tools.

    luke-jr

    1. I had no bleach on Fedora :
    ModuleNotFoundError: No module named 'bleach'
    

    Not sure if there are better alternatives or we could sanitize without using it.

    1. L151: print(f"Captch from URL {args.captcha}") prints this which looks good:

    image

    1. L125 and L126 LGTM although others who understand python better could suggest if it still can be improved:
    
            res.raise_for_status()
        except requests.exceptions.RequestException as e:
        
    
    1. L130-132 code for limiting size looks okay. I did not test it yet.
    2. L133-138 code to sanitize and only allow svg looks okay. I have not tested it yet.
  14. ghost commented at 1:02 AM on February 5, 2022: none

    Sorry I could not test this PR today because of below error:

    ./getcoins.py 
    Unexpected error when contacting faucet: 502 Server Error: Bad Gateway for url: https://signetfaucet.com/captcha
    

    I have asked @kallewoof and hoping it gets fixed in next few days so that I can test everything.

  15. kallewoof commented at 1:52 AM on February 5, 2022: member

    @prayank23 It's back up. I shut it down for a few days as it was being swarmed with spammers.

  16. laanwj commented at 1:04 PM on February 6, 2022: member

    but not html-sanitizer, bs4, or xml.html

    xml.etree is available in the python standard library. Not any of the other submodules IIRC. And I don't really like to add any dependencies beyond the standard library. I intentially used ImageMagick instead of any python specific SVG library.

    I'm a bit skeptical about the need to "sanitize" the XML at all, to be honest. It feels like a bit of security theater that doesn't address a specific issue. I'm ok with removing that part and just keeping the size check + ensure SVG format.

    That said, iterating over an xml tree and keeping only certain elements and attributes doesn't seem rocket science, if we really want that it should be possible with just etree manipulation.

  17. in contrib/signet/getcoins.py:128 in ea643f13c0 outdated
     130 | +        raise SystemExit(f"Unexpected error when contacting faucet: {e}")
     131 | +
     132 | +    # Size limitation
     133 | +    svg = xml.etree.ElementTree.fromstring(res.content)
     134 | +    if svg.attrib.get('width') != '150' or svg.attrib.get('height') != '50':
     135 | +        raise SystemExit("Captcha size doesn't match expected dimensions 150x50")
    


    unknown commented at 2:28 AM on February 8, 2022:

    Tested this part of code by using different url for captcha in getcoins.py:

    DEFAULT_GLOBAL_CAPTCHA = 'https://prayank23.github.io/svg-test/size.svg'
    

    and this svg file:

    <svg xmlns="http://www.w3.org/2000/svg" height="300" width="100">
      <text x="0" y="15">Size test</text>
    </svg>
    

    Works as expected:

    $ ./getcoins.py 
    
    Captcha size doesn't match expected dimensions 150x50
    
  18. sh15h4nk commented at 11:59 AM on February 12, 2022: contributor

    Hey @laanwj and @prayank23, I have made all the changes to the script, waiting for merge. Thanks

  19. ghost commented at 3:36 AM on February 13, 2022: none

    Hey @laanwj and @prayank23, I have made all the changes to the script, waiting for merge. Thanks

    You will have to squash commits: https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits

    If you haven't done this before and want to avoid mistakes, trying squashing a few commits in test repository or other branch could be helpful. Or you could even use GitHub desktop application for it.

  20. sh15h4nk force-pushed on Feb 13, 2022
  21. sh15h4nk closed this on Feb 13, 2022

  22. sh15h4nk force-pushed on Feb 13, 2022
  23. improved getcoins.py 7c7dbe4398
  24. sh15h4nk reopened this on Feb 13, 2022

  25. laanwj renamed this:
    script: handled request exception and added image sanitization
    script: Handle request exception and check svg image in getcoins.py
    on Feb 14, 2022
  26. laanwj commented at 8:11 AM on February 14, 2022: member

    Tested ACK 7c7dbe43983ea8408ece8e483320ec55083d6c09

  27. laanwj merged this on Feb 14, 2022
  28. laanwj closed this on Feb 14, 2022

  29. in contrib/signet/getcoins.py:11 in 7c7dbe4398
       7 | @@ -8,6 +8,7 @@
       8 |  import requests
       9 |  import subprocess
      10 |  import sys
      11 | +import xml.etree.ElementTree
    


    unknown commented at 8:19 AM on February 14, 2022:
    import defusedxml.ElementTree
    

    Sorry for the late review. I was still going through everything in the code and testing things. Found that xml.etree.ElementTree.fromstring is known to be vulnerable to XML attacks. Maybe this can be changed in a follow up PR.


    laanwj commented at 9:03 AM on February 14, 2022:

    If there is a specific problem with xml.etree it needs to be addressed. But without introducing a third-party dependency, please. Adding defusedxml is not acceptable here.


    unknown commented at 11:52 PM on February 16, 2022:

    Details: https://deepsource.io/gh/prayank23/bitcoin/run/61a5e60f-9376-4fd4-8a65-3b76190a56af/python/BAN-B314

    I tried XML attacks with some test svg files added here: https://github.com/prayank23/svg-test but found nothing interesting. Maybe its not an issue for running the script in terminal and such attacks only work for websites.

    One of the error:

    limit on input amplification factor (from DTD and entities) breached: line 15, column 35
    
  30. sidhujag referenced this in commit 2a7883307d on Feb 14, 2022
  31. DrahtBot locked this on Feb 18, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-15 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me