Translations were updated for 28.x in #30715.
It looks like that update pulled in strings that are not translations.
28.x update pulled in random strings? #30897
This is a poor or malicious translation:
https://app.transifex.com/bitcoin/bitcoin/translate/#gl_ES/qt-translation-028x/508593963
Aren’t LLMs capable of translation? With all the hype around them I wonder if a script can be written to check that each translation pair is a valid translation. With 4o-mini the cost should also be trivial.
(Edit: To clarify, I don’t mean that the LLM should do the translation, only that it could provide a yes/no validity check as an additional safeguard.)
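A rough sketch of what such a check script could look like, assuming the translations are available as Qt `.ts` XML (`<message>` elements with `<source>` and `<translation>` children). The names `iter_translation_pairs` and `find_rejected` are made up for illustration, and `judge` is a hypothetical callable standing in for whatever model call would answer yes/no; no real LLM API is used here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, List, Tuple

Pair = Tuple[str, str]


def iter_translation_pairs(ts_xml: str) -> Iterator[Pair]:
    """Yield (source, translation) pairs from the text of a Qt .ts file."""
    root = ET.fromstring(ts_xml)
    for message in root.iter('message'):
        source = message.findtext('source') or ''
        translation = message.findtext('translation') or ''
        if translation:  # skip unfinished (empty) translations
            yield source, translation


def find_rejected(ts_xml: str, judge: Callable[[str, str], bool]) -> List[Pair]:
    """Return every pair for which the (hypothetical) judge answers 'no'."""
    return [(s, t) for s, t in iter_translation_pairs(ts_xml) if not judge(s, t)]
```

Anything returned by `find_rejected` would still need a human look; the judge is only meant as an extra filter, not as the final word on a translation.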
In the meantime, would it be too ugly to add a regex to update-translations.py (or a function like the existing ones that already parse the strings and validate their format)?
import re

# Regex patterns for malicious content and symbols (just an example)
MALICIOUS_PATTERN = re.compile(
    r'[\x00-\x1F\x7F-\x9F<>&\'";`\\\uFFFD]|'            # control chars, markup/quote chars, U+FFFD
    r'(\.\./|\.\.\\|%2e%2e/|%2e%2e\\|'                  # path traversal, plain and URL-encoded
    r'\$HOME/|%USERPROFILE%|%APPDATA%|\$USER|\$PATH|'   # environment variable references
    r';|&&|\|\||\||&|\\|>|--)',                         # shell metacharacters
    re.UNICODE | re.IGNORECASE)
It won’t detect random nonsense translations, but at least it’s a step forward while we look into the 4o-mini alternative.
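One possible way to wire a check like this in, sketched under assumptions: `SUSPICIOUS_PATTERN` and `is_suspicious` are hypothetical names, not existing code in update-translations.py, and the pattern is deliberately narrower than the example above. The idea is to flag a translation only when it contains a suspicious match that the English source string does not.

```python
import re

# Hypothetical, deliberately narrow pattern (an assumption, not project code).
SUSPICIOUS_PATTERN = re.compile(
    r'https?://\S+|www\.\S+'   # links
    r'|\.\./|\.\.\\'           # path traversal
    r'|\$\(|`',                # shell command substitution
    re.IGNORECASE)


def is_suspicious(source: str, translation: str) -> bool:
    """True if the translation has a suspicious match absent from the source."""
    in_source = set(SUSPICIOUS_PATTERN.findall(source))
    in_translation = set(SUSPICIOUS_PATTERN.findall(translation))
    return bool(in_translation - in_source)
```

Comparing against the source string is a design choice to cut false positives: a pattern that rejects every `&` or apostrophe would likely flag many legitimate strings, since Qt uses `&` to mark keyboard accelerators and apostrophes are common in languages such as French.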
fanquake
hebasto
maflcko
pablomartin4btc
Milestone: 28.0