BIP39: Adds Russian word list #432
farazdagi wants to merge 2 commits into bitcoin:master from farazdagi:master, changing 2 files (+2068 −0)
farazdagi commented at 3:20 pm on August 12, 2016: none
I’ve tried to follow the guidelines defined for the other languages.
BIP39: Adds Russian word list 6332230d63
in bip-0039/russian.txt: in 6332230d63 outdated
192+буфет 193+бухта 194+бушлат 195+бывать 196+быль 197+быть
UdjinM6 commented at 4:02 pm on August 12, 2016:
“бывать” and “быть” are too similar imo and I would exclude both of them tbh
in bip-0039/russian.txt: in 6332230d63 outdated
237+весло 238+весна 239+весть 240+ветвь 241+ветер 242+ветка
UdjinM6 commented at 4:05 pm on August 12, 2016:
“ветвь” and “ветка” are very close synonyms, should probably pick only one?
in bip-0039/russian.txt: in 6332230d63 outdated
464+досада 465+доска 466+доход 467+доцент 468+дочь 469+дошлый
UdjinM6 commented at 4:11 pm on August 12, 2016:
“дошлый” is rare/little-used word, probably not a good candidate
in bip-0039/russian.txt: in 6332230d63 outdated
524+жизнь 525+жилой 526+жилье 527+житель 528+жить 529+жрать
UdjinM6 commented at 4:18 pm on August 12, 2016:
“жрать” is kind of vulgar form of “eat”, I would remove it
in bip-0039/russian.txt: in 6332230d63 outdated
702+клуб 703+клык 704+ключ 705+клятва 706+книга 707+книжка
UdjinM6 commented at 4:25 pm on August 12, 2016:
“книга” and “книжка” are too close imo, I would remove “книжка”
in bip-0039/russian.txt: in 6332230d63 outdated
846+лодка 847+ложь 848+лозунг 849+локоть 850+ломать 851+лондон
UdjinM6 commented at 4:33 pm on August 12, 2016:
“лондон” is the name of the city (London), should be removed imo
in bip-0039/russian.txt: in 6332230d63 outdated
946+мораль 947+морда 948+море 949+мороз 950+моряк 951+москвич
UdjinM6 commented at 4:36 pm on August 12, 2016:
“москвич” is either the name of Russian automobile brand or it means “the one who is living in Moscow”. In either way it’s not a good candidate imo.
in bip-0039/russian.txt: in 6332230d63 outdated
1182+пальто 1183+память 1184+панель 1185+паника 1186+парень 1187+париж
UdjinM6 commented at 4:51 pm on August 12, 2016:
“париж” is the name of the city “Paris”, should be removed imo
cryply commented at 6:56 pm on May 12, 2020:
it is important to have distinct easy to type and probably remember words - париж is good in that sense. it is not vocabulary of russian words - but list of mnemonic words in Russian
in bip-0039/russian.txt: in 6332230d63 outdated
1411+реформа 1412+рецепт 1413+речь 1414+решать 1415+решение 1416+решить
UdjinM6 commented at 4:57 pm on August 12, 2016:
“решать” and “решить” are too close imo
in bip-0039/russian.txt: in 6332230d63 outdated
1420+ритм 1421+рифма 1422+робкий 1423+родитель 1424+родной 1425+рожа
UdjinM6 commented at 4:58 pm on August 12, 2016:
“рожа” is a vulgar form of “face”, could be removed probably
in bip-0039/russian.txt: in 6332230d63 outdated
1428+роль 1429+роман 1430+ронять 1431+роса 1432+рослый 1433+россия
UdjinM6 commented at 4:59 pm on August 12, 2016:
“россия” is the name of the country “Russia”, not sure if it’s a good candidate here
in bip-0039/russian.txt: in 6332230d63 outdated
1507+сестра 1508+сеть 1509+сечение 1510+сжечь 1511+сзади 1512+сибирь
UdjinM6 commented at 5:02 pm on August 12, 2016:
“сибирь” is the name of the region in Russia - “Siberia”, probably not a good candidate
in bip-0039/russian.txt: in 6332230d63 outdated
1781+узор 1782+уйма 1783+указ 1784+уклон 1785+укол 1786+украина
UdjinM6 commented at 5:10 pm on August 12, 2016:
“украина” is the name of the country - “Ukraine”, probably not a good candidate
in bip-0039/russian.txt: in 6332230d63 outdated
1785+укол 1786+украина 1787+уксус 1788+улица 1789+улыбка 1790+ум
UdjinM6 commented at 5:10 pm on August 12, 2016:
“ум” is too short
cryply commented at 6:57 pm on May 12, 2020:
why length is a problem? most important will one make mistake while entering word list or not.
in bip-0039/russian.txt: in 6332230d63 outdated
1849+фонд 1850+фонтан 1851+форма 1852+фото 1853+фраза 1854+франция
UdjinM6 commented at 5:12 pm on August 12, 2016:
“франция” is the name of the country - “France”, probably not a good candidate
in bip-0039/russian.txt: in 6332230d63 outdated
1889+царство 1890+царь 1891+цветок 1892+целиком 1893+целое 1894+целый
UdjinM6 commented at 5:14 pm on August 12, 2016:
“целое” and “целый” are probably too close
in bip-0039/russian.txt: in 6332230d63 outdated
2004+энергия 2005+эпизод 2006+эпоха 2007+эскиз 2008+эссе 2009+эстония
UdjinM6 commented at 5:18 pm on August 12, 2016:
“эстония” is the name of the country - “Estonia”, probably not a good candidate
in bip-0039/russian.txt: in 6332230d63 outdated
2033+язык 2034+яйцо 2035+якобы 2036+якорь 2037+январь 2038+япония
UdjinM6 commented at 5:19 pm on August 12, 2016:
“япония” is the name of the country - Japan, probably not a good candidate
UdjinM6 commented at 5:24 pm on August 12, 2016: contributor
Also
итак когда кроме кстати куда либо ловко между наверх назад налево нигде никак нынче однажды около откуда отнюдь отсюда оттого оттуда плохо полтора помимо поперек почему против путем пятеро пяток пять ранее сбоку сверху сегодня сейчас сзади слегка смело снизу снова совсем сорок сразу также твой теперь тогда тоже точно триста туго туда уйма целиком четыре явно якобы ярко ясно
All of these above do not fit noun/verb/adj criteria - should be removed or mentioned in criteria imo. There also are some “numeric-like” words like “первый”, “тысяча” etc which I’m not sure about too but probably they are ok.
luke-jr added the label Proposed BIP modification on Aug 12, 2016
Manual cleanup a59cc3e1ac
farazdagi commented at 4:08 am on August 14, 2016: none
I’ve spent a considerable amount of time manually going through the word list and:
- applying all suggestions made above (thanks again @UdjinM6)
- making sure that only nouns/verbs/adjectives are used (mostly nouns)
- making sure that words are distinct enough from each other (improved Levenshtein distance)
Please review and let me know if there are any issues left.
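The distinctness check mentioned in the list above could be automated roughly as follows. This is a minimal sketch, not the procedure actually used for this PR: the function names `levenshtein` and `close_pairs` and the threshold `max_distance=1` are assumptions made for illustration.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def close_pairs(words, max_distance=1):
    """Return all word pairs whose edit distance is at most max_distance."""
    return [(a, b) for a, b in combinations(words, 2)
            if levenshtein(a, b) <= max_distance]
```

For example, `close_pairs(["банк", "танк", "арка"])` reports only the pair банк/танк, which differ in a single letter. For a 2048-word list this compares about two million pairs, which is slow but workable as a one-off check.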
UdjinM6 commented at 11:41 am on August 14, 2016: contributor
Very nice! IMO the list looks much better now 👍
PS. And btw, thanks for submitting this PR!
farazdagi cross-referenced this on Aug 19, 2016 from issue Extended Keys + Remind Details + Login + Complete Transaction w/o unlocking by farazdagi
Bohdat commented at 11:21 am on September 5, 2016: none
Here are some very similar words I have found:
арка арфа, банк танк, бард барс, батон бутон, бинт бунт, бочка точка, брак брат, букет буфет, вахта шахта, весть честь, взвод вывод, взор узор, влияние слияние, волк воля, волк толк, вход уход, глава слава, гном гром, губа гуща, губа шуба, дата хата, день тень, диск риск, дума душа, душа суша, жара фара, задор затор, замок зарок, игла игра, имение умение, кабель кафель, кабель табель, капля цапля, катер шатер, козел котел, койка кошка, конверт концерт, корнет корсет, кубок кусок, куча туча, лента рента, лечение течение, магия мафия, метр мэтр, модель модуль, мост рост, народ наряд, нация рация, нейлон нейрон, нива ниша, нить шить, нога нота, норма форма, нота рота, олень осень, оплата уплата, ответ отчет, паек парк, пакт факт, пальто сальто, певец перец, пена цена, петь путь, петь сеть, пила пища, пила сила, план плац, плита элита, повар товар, пруд труд, пугать ругать, путь суть, река рука, сбруя струя, сеть суть, слон стон, смена стена, сосед сосуд, удав удар, хобот хохот, цинк цирк, чадо чудо, челнок чеснок, штаб штат
voisine commented at 0:51 am on September 13, 2016: contributor
This needs to be NFKD normalized, which you can do with the following Perl script:
    #!/usr/bin/perl

    use Unicode::Normalize;
    use strict;
    use warnings;
    use open qw(:std :utf8);    # treat all input and output as UTF-8

    # Print each input line in NFKD form.
    while (<>) {
        print NFKD($_);
    }
greenaddress commented at 7:54 pm on September 14, 2016: contributor
reviewed the words - looked OK. The list of words is also sorted so that’s great.
dabura667 commented at 11:37 pm on September 14, 2016: none
NFKD normalization needed.
Be sure to re-sort after normalization.
The Japanese list forgot to do so :-( (oops!)
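Both steps can be sketched in Python, assuming the list is UTF-8 text with one word per line. The helper name `normalize_wordlist` is hypothetical, and ordering by UTF-8 bytes is an assumption chosen to mimic a C-locale sort.

```python
import unicodedata


def normalize_wordlist(text):
    """Return the words of `text` in NFKD form, re-sorted byte-wise."""
    words = [unicodedata.normalize("NFKD", w) for w in text.split()]
    # Sort on the UTF-8 encoding so the result does not depend on the locale.
    return sorted(words, key=lambda w: w.encode("utf-8"))
```

Re-sorting matters because decomposition can change the order. Precomposed `й` (U+0439) sorts after `и` (U+0438), but NFKD turns it into `и` followed by the combining breve U+0306, which sorts before any Cyrillic letter. So `normalize_wordlist("игра\nйод")` places the normalized `йод` before `игра`, the reverse of the order before normalization.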
jonathancross commented at 2:58 pm on March 30, 2017: contributor
Ping @farazdagi – Seems this still needs to be normalized?
Sjors commented at 10:40 am on June 30, 2017: member
A general observation about adding more languages to BIP 39 is that English now has broad wallet support. If a new language is only supported by a small number of wallets, this could lead to (unintended) vendor lock-in.
If someone writes down their mnemonic and puts in a vault, they should be able to take it out 50 years later and have a reasonable chance of finding software that can still import it.
Perhaps getting BIP 39 (or something similar) recognized as an ISO standard would be a good step towards durability, before adding more languages.
in bip-0039/russian.txt:17 in a59cc3e1ac
12+аврал 13+автор 14+агат 15+агент 16+агрегат 17+адажио
ValleZ commented at 8:02 pm on January 7, 2018:
адажио is quite a rare word, is it okay to use it here?
nym-zone referenced this in commit 8aaa6f37e8 on Jan 7, 2018
nym-zone referenced this in commit 234c66cd5d on Jan 7, 2018
dabura667 commented at 6:59 am on January 8, 2018: none
@Sjors BIP39 states
The conversion of the mnemonic sentence to a binary seed is completely independent from generating the sentence. This results in rather simple code; there are no constraints on sentence structure and clients are free to implement their own wordlists
And
software must compute a checksum for the mnemonic sentence using a wordlist and issue a warning if it is invalid.
Which means “If you can’t verify the checksum (or don’t know the wordlist), show a warning, but ALLOW THE SEED TO BE GENERATED”
But almost every single wallet followed “developer common sense”, which says “if a checksum exists, always check it, and always fail loudly and stop everything”… which makes sense.
The fault lies with BIP39, which was written in a way that contradicts developer common sense.
But to be honest, Electrum supports all BIP39 wordlists because it actually follows the BIP: if it doesn’t recognize the wordlist, it shows a warning but generates the wallet anyway. I have recovered many wallets using Electrum.
Ironically, Electrum’s developer pointed out this contradiction, the authors ignored it, Thomas asked to have his name removed because of this and other problems, and now Electrum is the only wallet that implements BIP39 correctly in this aspect.
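The behaviour described above (warn on a bad or unverifiable checksum, but still derive the seed) can be sketched in Python. This is an illustration, not Electrum's code: `nfkd`, `checksum_ok` and `to_seed` are hypothetical names, and `wordlist` is assumed to be a list of 2048 NFKD-normalized words. The seed derivation itself (PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" plus passphrase) follows BIP39 and does not depend on any wordlist.

```python
import hashlib
import unicodedata
import warnings


def nfkd(s):
    return unicodedata.normalize("NFKD", s)


def checksum_ok(mnemonic, wordlist):
    """True/False if the checksum can be verified, None if it cannot."""
    index = {w: i for i, w in enumerate(wordlist)}
    words = nfkd(mnemonic).split()
    if len(words) not in (12, 15, 18, 21, 24) or any(w not in index for w in words):
        return None  # unknown wordlist or unexpected sentence length
    bits = 0
    for w in words:
        bits = (bits << 11) | index[w]  # each word encodes 11 bits
    total = 11 * len(words)
    cs_len = total // 33  # one checksum bit per 32 bits of entropy
    entropy = (bits >> cs_len).to_bytes((total - cs_len) // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_len)
    return (bits & ((1 << cs_len) - 1)) == expected


def to_seed(mnemonic, passphrase="", wordlist=None):
    """Derive the 64-byte seed; only warn if the checksum is not verified."""
    status = checksum_ok(mnemonic, wordlist) if wordlist else None
    if status is not True:
        warnings.warn("mnemonic checksum is invalid or could not be verified")
    return hashlib.pbkdf2_hmac(
        "sha512",
        nfkd(mnemonic).encode("utf-8"),
        ("mnemonic" + nfkd(passphrase)).encode("utf-8"),
        2048,
    )
```

The design point is that `to_seed` never raises on a checksum problem: a sentence in an unrecognized language still produces a seed, which is what makes recovery possible without the original wordlist.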
nym-zone commented at 8:37 am on January 8, 2018: contributor
At nym-zone/easyseed@234c66c, I have created a Unicode NFKD-normalized and binary-sorted russian.txt from farazdagi/bips@a59cc3e as modified by approximately the following command:
    uconv -f utf-8 -t utf-8 -x '::nfkd;' < russian.txt | \
        sort -s > normalized/russian.txt
(I originally forgot to force the "C" locale for sort(1); but I later checked, and found it did not make a difference for this list in my environment. It did make a difference for the proposed Ukrainian and Czech lists.)
The result has been confirmed to not have a leading BOM, and to have a final line terminated with ‘\n’ (#622). I did not yet examine the source for these issues.
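Checks of this kind can be scripted. The following Python sketch works under stated assumptions: `check_wordlist` is a hypothetical helper, it takes the raw bytes of a wordlist file, and it is not the tool used to produce the result above. The expected count of 2048 words comes from BIP39.

```python
import unicodedata


def check_wordlist(data, expected_count=2048):
    """Return a list of problems found in the raw bytes of a wordlist file."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("leading BOM")
    if not data.endswith(b"\n"):
        problems.append("final line is not terminated with a newline")
    words = data.decode("utf-8").split("\n")
    if words and words[-1] == "":
        words.pop()  # drop the empty piece after the final newline
    if len(words) != expected_count:
        problems.append("expected %d words, found %d" % (expected_count, len(words)))
    if len(set(words)) != len(words):
        problems.append("duplicate words")
    if any(unicodedata.normalize("NFKD", w) != w for w in words):
        problems.append("not NFKD-normalized")
    if words != sorted(words, key=lambda w: w.encode("utf-8")):
        problems.append("not sorted byte-wise")
    return problems
```

An empty result means none of these particular problems was found; it says nothing about whether the words themselves are well chosen.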
SHA-256 hash for the resulting russian.txt: a8d7b9d8bdd3816eddd2aeb98718ad586d8e7dd8c364a944c072cdf3cd6bcb05
nym-zone commented at 9:00 am on January 8, 2018: contributor
A general observation about adding more languages to BIP 39 is that English now has broad wallet support. If a new language is only supported by a small number of wallets, this could lead to (unintended) vendor lock-in.
If someone writes down their mnemonic and puts in a vault, they should be able to take it out 50 years later and have a reasonable chance of finding software that can still import it.
The answer to vendor lock-in is independent implementations. BIP 39’s simplicity facilitates that. In ten days of occasional side-work, I have written a BIP 39 implementation with extensive self-tests which generates mnemonics in any language for which a wordlist is available in the BIP repository. It can output a BIP 32 xprv extended master private key for wallet restoration (although this feature is not yet documented in the manpage). Restoration to xprv from a user-input mnemonic in any language will be added in the near future. This is written in standard C/mostly standard POSIX. Anybody with technical competence who urgently needed to restore a wallet could whip up a barebones/no-tests/no-checksum-check/no-manpage mnemonic-to-xprv tool as a little afternoon project.
I have C code on my disk with copyright dates from almost 40 years ago; actually, if memory serves, the oldest date I have seen in my platform’s source tree is exactly 1978. Likewise, I expect that my freely available C11 code will compile with minimal changes for decades to come.
When such tools are available and easy to produce ab initio, where is the vendor lock-in? Wallets don’t need multi-language support to restore from an xprv.
I am glad to see new languages being proposed and added. The important part is to get the wordlist right before it’s carved into the standard, baked into implementations, and used for wallets containing actual people’s actual money. That is important.
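To give a sense of how small such a barebones mnemonic-to-xprv tool can be, here is a sketch in Python rather than C. It is not the implementation referred to above: `base58check` and `mnemonic_to_xprv` are hypothetical names, there is no checksum check, and the rare case where BIP 32 declares the derived master key invalid is deliberately not handled.

```python
import hashlib
import hmac
import unicodedata

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(payload):
    """Base58 encoding of payload plus a 4-byte double-SHA256 checksum."""
    data = payload + hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, r = divmod(n, 58)
        out = B58[r] + out
    pad = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes become "1"
    return "1" * pad + out


def mnemonic_to_xprv(mnemonic, passphrase=""):
    """BIP 39 seed, then BIP 32 master key, serialized as a mainnet xprv."""
    def nfkd(s):
        return unicodedata.normalize("NFKD", s)

    seed = hashlib.pbkdf2_hmac(
        "sha512",
        nfkd(mnemonic).encode("utf-8"),
        ("mnemonic" + nfkd(passphrase)).encode("utf-8"),
        2048,
    )
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    key, chain_code = digest[:32], digest[32:]
    # NOTE: BIP 32 treats a key of zero or >= the curve order as invalid;
    # this sketch skips that check.
    payload = (
        bytes.fromhex("0488ade4")  # version bytes for a mainnet private key
        + b"\x00"                  # depth 0: master key
        + b"\x00" * 4              # parent fingerprint
        + b"\x00" * 4              # child number
        + chain_code
        + b"\x00" + key            # private key with a leading zero byte
    )
    return base58check(payload)
```

Because nothing here consults a wordlist, the same function works for a sentence in any language, which is the point being made about lock-in.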
nym-zone referenced this in commit c7d698a35f on Jan 11, 2018
ZilvinasKucinskas commented at 10:23 pm on April 19, 2018: none
So is it OK to implement this Russian wordlist in a wallet?
What are the community’s rules for accepting a language into BIP39?
dabura667 commented at 10:55 pm on April 19, 2018: none
You can implement any wordlist you want, and Electrum will properly recover it. (Though it will not detect checksum errors)
Other wallets are poorly implemented.
DonaldTsang cross-referenced this on Dec 24, 2018 from issue Binary Lists by DonaldTsang
DonaldTsang cross-referenced this on Aug 22, 2019 from issue BIP39: Russian wordlist added by 3sGgpQ8H
DonaldTsang commented at 1:40 am on August 22, 2019: none
luke-jr commented at 9:28 pm on July 2, 2021: member
luke-jr closed this on Jul 2, 2021
This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.