utils and libraries: address check in update-translations.py #13375

pull abskj wants to merge 1 commits into bitcoin:master from abskj:master changing 1 files +8 −0
  1. abskj commented at 8:32 PM on June 2, 2018: none

    fix for issue #13363

  2. in contrib/devtools/update-translations.py:127 in 6a98117107 outdated
     121 | @@ -122,6 +122,12 @@ def escape_cdata(text):
     122 |      text = text.replace('"', '"')
     123 |      return text
     124 |  
     125 | +def check_for_address(s):
     126 | +    #checks each message for bitcoin addresses
     127 | +    address_rgx='(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,34}'
    


    MarcoFalke commented at 9:13 PM on June 2, 2018:

    nit: Can make an upper case constant with re.compile

  3. in contrib/devtools/update-translations.py:128 in 6a98117107 outdated
     121 | @@ -122,6 +122,12 @@ def escape_cdata(text):
     122 |      text = text.replace('"', '"')
     123 |      return text
     124 |  
     125 | +def check_for_address(s):
     126 | +    #checks each message for bitcoin addresses
     127 | +    address_rgx='(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,34}'
     128 | +    return bool(re.search(address_rgx,s))
    


    MarcoFalke commented at 9:14 PM on June 2, 2018:

    Probably fine to just inline (using the compiled re constant), since it is only used once

  4. Empact commented at 12:47 AM on June 3, 2018: member

    See also #13374

  5. fanquake added the label Scripts and tools on Jun 3, 2018
  6. in contrib/devtools/update-translations.py:164 in 8840b6bcd2 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +                    
    


    giancarloGiuffra commented at 12:04 PM on June 6, 2018:

    hi, delete the whitespace to make the build pass. The linter is failing.


    giancarloGiuffra commented at 12:08 PM on June 6, 2018:

    it's in line 164


    ccdle12 commented at 12:20 PM on June 6, 2018:

    Can also install clang-format and run https://github.com/bitcoin/bitcoin/tree/master/contrib/devtools#clang-format-diffpy to fix any whitespaces before pushing


    MarcoFalke commented at 12:23 PM on June 6, 2018:

    Note that clang format doesn't work on py files.


    abskj commented at 2:34 PM on June 6, 2018:

    done..thank you

  7. in contrib/devtools/update-translations.py:166 in b849e87264 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +
     165 | +                    #check if btc address is included in translation
     166 | +                    if bool(re.search(ADDRESS_RE,translation)) is True:
    


    promag commented at 2:58 PM on June 6, 2018:

    Remove is True? Why cast to bool?

  8. in contrib/devtools/update-translations.py:34 in b849e87264 outdated
      29 | @@ -30,6 +30,8 @@
      30 |  LOCALE_DIR = 'src/qt/locale'
      31 |  # Minimum number of messages for translation to be considered at all
      32 |  MIN_NUM_MESSAGES = 10
      33 | +#regex for address
      34 | +ADDRESS_RE=re.compile('(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,34}')
    


    promag commented at 2:59 PM on June 6, 2018:

    Space around =.

  9. in contrib/devtools/update-translations.py:33 in b849e87264 outdated
      29 | @@ -30,6 +30,8 @@
      30 |  LOCALE_DIR = 'src/qt/locale'
      31 |  # Minimum number of messages for translation to be considered at all
      32 |  MIN_NUM_MESSAGES = 10
      33 | +#regex for address
    


    promag commented at 3:00 PM on June 6, 2018:

    Nit, # Regular expression for addresses

  10. in contrib/devtools/update-translations.py:165 in b849e87264 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +
     165 | +                    #check if btc address is included in translation
    


    promag commented at 3:05 PM on June 6, 2018:

    Nit # Check ...

  11. in contrib/devtools/update-translations.py:166 in 13e8896838 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +
     165 | +                    # Check if btc address is included in translation
     166 | +                    if re.search(ADDRESS_RE,translation):
    


    promag commented at 4:05 PM on June 6, 2018:

    nit, space after ,.


    promag commented at 4:05 PM on June 6, 2018:

    Maybe remove whitespaces from translation before searching for addresses?

  12. in contrib/devtools/update-translations.py:188 in 13e8896838 outdated
     184 | @@ -178,6 +185,10 @@ def postprocess_translations(reduce_diff_hacks=False):
     185 |                  if translation_node.get('type') == 'unfinished':
     186 |                      context.remove(message)
     187 |  
     188 | +                if translation_node.get('type') == 'malicious' :
    


    promag commented at 4:05 PM on June 6, 2018:

    nit, remove space before :

  13. in contrib/devtools/update-translations.py:34 in 13e8896838 outdated
      29 | @@ -30,6 +30,8 @@
      30 |  LOCALE_DIR = 'src/qt/locale'
      31 |  # Minimum number of messages for translation to be considered at all
      32 |  MIN_NUM_MESSAGES = 10
      33 | +# Regular expression for addresses
      34 | +ADDRESS_RE = re.compile('(bc1|[13])[a-zA-HJ-NP-Z0-9]{25,34}')
    


    promag commented at 4:05 PM on June 6, 2018:

    How about multisig?


    abskj commented at 4:53 PM on June 6, 2018:

    I thought the address format is same for multisig. Could you provide some info about it if I am mistaken.


    promag commented at 5:23 PM on June 6, 2018:

    Sorry it's already covered. Maybe change to (1|3|bc1)?


    abskj commented at 5:58 PM on June 6, 2018:

    done

  14. promag commented at 4:06 PM on June 6, 2018: member

    Concept ACK.

  15. giancarloGiuffra approved
  16. giancarloGiuffra commented at 9:25 PM on June 6, 2018: none

    ACK, tested with P2PKH, P2SH and Bech32 addresses.

  17. in contrib/devtools/update-translations.py:189 in ea12170b64 outdated
     184 | @@ -178,6 +185,10 @@ def postprocess_translations(reduce_diff_hacks=False):
     185 |                  if translation_node.get('type') == 'unfinished':
     186 |                      context.remove(message)
     187 |  
     188 | +                if translation_node.get('type') == 'malicious':
     189 | +                    print('WARNING...address found in translation.Message will be deleted')
    


    Empact commented at 10:36 PM on June 6, 2018:

    Would be good to output the filename here for specificity, as with line 173.

  18. in contrib/devtools/update-translations.py:170 in ea12170b64 outdated
     165 | +                    # Check if btc address is included in translation after removing whitespaces
     166 | +                    if re.search(ADDRESS_RE, translation.replace(' ', '')):
     167 | +                        translation_node.set('type', 'malicious')
     168 | +                        break
     169 |                      errors = []
     170 |                      valid = check_format_specifiers(source, translation, errors, numerus)
    


    Empact commented at 10:39 PM on June 6, 2018:

    Independent of this PR, but this bit could be simplified a fair bit:

    • Return error from check_format_specifiers, which generates at most 1 error
    • Combine loop and if not valid below
  19. in contrib/devtools/update-translations.py:168 in ea12170b64 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +
     165 | +                    # Check if btc address is included in translation after removing whitespaces
     166 | +                    if re.search(ADDRESS_RE, translation.replace(' ', '')):
     167 | +                        translation_node.set('type', 'malicious')
     168 | +                        break
    


    Empact commented at 10:39 PM on June 6, 2018:

    nit: I still like the idea of using errors for this.


    abskj commented at 11:27 AM on June 7, 2018:

    Yes..realized now. Amending.


    abskj commented at 2:48 PM on June 7, 2018:

    done

  20. in contrib/devtools/update-translations.py:166 in ea12170b64 outdated
     160 | @@ -159,6 +161,11 @@ def postprocess_translations(reduce_diff_hacks=False):
     161 |                  for translation in translations:
     162 |                      if translation is None:
     163 |                          continue
     164 | +
     165 | +                    # Check if btc address is included in translation after removing whitespaces
     166 | +                    if re.search(ADDRESS_RE, translation.replace(' ', '')):
    


    promag commented at 10:48 PM on June 6, 2018:

    To remove all whitespaces:

    translation.translate({ ord(c): None for c in ' \n\t\r' })
    

    or

    ''.join(translation.split())
    

    abskj commented at 2:46 PM on June 7, 2018:

    done..the implementation is changed in accordance with @Empact 's suggestion

  21. promag commented at 10:48 PM on June 6, 2018: member

    Please squash.

  22. promag commented at 2:00 AM on June 7, 2018: member

    Just realised that this is duplicate of #13374. @kaplanmaxe maybe incorporate above suggestions in yours?

  23. utils and library : check for addresses in translation
    address check in translations Minor changes
    
    extra whitespace removed
    
    minor errors
    
    minor fixes
    
    checking after removing whitespaces
    
    regular expression refined
    
    using errors
    89e16f1bee
  24. abskj force-pushed on Jun 7, 2018
  25. in contrib/devtools/update-translations.py:87 in 89e16f1bee
      81 | @@ -80,6 +82,10 @@ def sanitize_string(s):
      82 |      return s.replace('\n',' ')
      83 |  
      84 |  def check_format_specifiers(source, translation, errors, numerus):
      85 | +    # Check for btc address in translation after removing whitespaces
      86 | +    if re.search(ADDRESS_RE, ''.join(translation.split())):
      87 | +        errors.append("WARNING...Address found in translation for '%s': '%s'\nMessage will be deleted" % (sanitize_string(source), sanitize_string(translation)))
    


    laanwj commented at 1:22 PM on June 8, 2018:

    There's not need to add the "\nMessage will be deleted" here, it's the case for all errors so doesn't need mentioning.

  26. laanwj commented at 1:23 PM on June 8, 2018: member

    I've tested this and found it found tons of false positives! it detected addresses in almost every message, deleting more than half the translation messages. (for testing, it's useful to comment out fetch_all_translations() so it just processes the translations inthe tree.)

  27. promag commented at 1:26 PM on June 8, 2018: member

    Makes sense because of the whitespace removal.

  28. laanwj commented at 2:54 PM on June 8, 2018: member

    Closing, as #13374 was merged.

  29. laanwj closed this on Jun 8, 2018

  30. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 18:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me