Add Portuguese wordlist to BIP39 #998

sabotag3x wants to merge 1 commit into bitcoin:master from sabotag3x:master changing 2 files +2065 −0
  1. sabotag3x commented at 10:08 pm on September 18, 2020: contributor

    The Portuguese wordlist was carefully checked manually by Portuguese and Brazilians in order to achieve a high level of quality. All the words are commonly used in both countries.

    In addition, the Portuguese wordlist was checked with Python scripts against three rules: Levenshtein distance between words, overlap with words already used in other mnemonic sets, and uniqueness of the first 4 characters.

    More details on the word selection process can be found in Bitcointalk’s Portuguese section.

    Portuguese wordlist rules:

    1. Words can be uniquely determined by typing the first 4 characters.
    2. No accents or special characters.
    3. No complex verb forms.
    4. No plural words, unless there’s no singular form.
    5. No words with double spelling.
    6. No words with the exact sound of another word with different spelling.
    7. No offensive words.
    8. No words already used in other language mnemonic sets.
    9. No words that are spelled differently in Brazil and in Portugal.
    10. No words that evoke negative, sad, or bad things.
    11. No very similar words (pairs that differ by only 1 letter).
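
    As a rough illustration of how rules 1 and 2 above could be checked mechanically, here is a minimal Python sketch. It is not the script used for this PR; the function names and the sample words are assumptions made for the example.

```python
import string
from collections import defaultdict

# Rule 2: only plain lowercase a-z, no accents or special characters.
ALLOWED = set(string.ascii_lowercase)


def non_ascii_words(words):
    """Return the words that contain anything other than a-z."""
    return [w for w in words if not set(w) <= ALLOWED]


def prefix_collisions(words, length=4):
    """Rule 1: group words that share their first `length` characters."""
    groups = defaultdict(list)
    for w in words:
        groups[w[:length]].append(w)
    # Keep only prefixes claimed by more than one word.
    return {p: ws for p, ws in groups.items() if len(ws) > 1}


# Illustrative sample only, not taken from the submitted list.
sample = ["abacate", "abaixo", "ação", "esquadro", "esquerda"]
print(non_ascii_words(sample))    # ['ação']
print(prefix_collisions(sample))  # {'esqu': ['esquadro', 'esquerda']}
```

    Against a real list, `words` would be the lines of the wordlist file, and both results would be expected to come back empty.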
  2. bitmover-studio commented at 11:09 pm on September 19, 2020: contributor

    The idea to create this wordlist began on the bitcointalk.org forum. This is the thread where all the details were discussed during its creation: https://bitcointalk.org/index.php?topic=5272106.

    We used Python scripts to help us check these rules:

    1. Words can be uniquely determined by typing the first 4 characters.
    2. No words already used in other language mnemonic sets.
    3. No very similar words with 1 letter of difference. (Levenshtein distance > 1)
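
    The second and third checks can be sketched as follows. This is only an assumed reconstruction, not the scripts from the Bitcointalk thread: the helper names are hypothetical, the Levenshtein distance is a plain dynamic-programming implementation, and `other_list` stands in for another language's wordlist.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def too_similar(words):
    """Return pairs of distinct words whose distance is not greater than 1."""
    ws = sorted(set(words))
    return [(a, b) for i, a in enumerate(ws) for b in ws[i + 1:]
            if levenshtein(a, b) <= 1]


def shared_with(words, other_words):
    """Return words that also appear in another language's list."""
    return sorted(set(words) & set(other_words))


# Illustrative data only.
candidate = ["afta", "anta", "bonsai", "cabelo", "camelo"]
other_list = ["bonsai", "casa"]
print(too_similar(candidate))              # [('afta', 'anta'), ('cabelo', 'camelo')]
print(shared_with(candidate, other_list))  # ['bonsai']
```

    The pairwise comparison is quadratic in the number of words, which is an acceptable trade-off for a one-off check of a list of this size.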
  3. promag commented at 11:47 pm on September 19, 2020: member
    See #720.
  4. bitmover-studio commented at 11:19 am on September 20, 2020: contributor

    See #720.

    That list has many problems:

    1 - It has been inactive for 2 years.

    2 - Duplicated words: 58 - ampola, 60 - ampola

    3 - A word repeated from the Spanish list: bonsai

    4 - Words that cannot be uniquely determined by typing the first 4 characters. [53, 569, 570, 630, 721, 765, 1060, 1120, 1690, 1894] esquadro - esquerda - esqui; ferrolho - ferrugem; garrafa - garrote; gracejo - gracioso; magnata - magno; sentado - sentido; trilho - trilogia

    5 - Word pairs with a Levenshtein distance of 1 (only one letter of difference): [‘acidez - avidez’, ‘adiante - diante’, ‘aflito - afoito’, ‘afoito - aflito’, ‘afta - anta’, ‘agito - apito’, ‘agulha - fagulha’, ‘alho - olho’, ‘alvo - alho’, ‘anexo - nexo’, ‘anta - afta’, ‘apito - apto’, ‘apto - apito’, ‘areia - aveia’, ‘argila - argola’, ‘argola - argila’, ‘assado - passado’, ‘ator - fator’, ‘aveia - veia’, ‘avidez - acidez’, ‘bafo - safo’, ‘bagulho - barulho’, ‘bainha - rainha’, ‘balada - salada’, ‘balsa - valsa’, ‘banho - ganho’, ‘barata - batata’, ‘barulho - bagulho’, ‘basta - besta’, ‘batata - barata’, ‘beato - boato’, ‘beco - bico’, ‘beira - feira’, ‘belo - selo’, ‘bento - tento’, ‘besta - festa’, ‘bico - beco’, ‘bloco - floco’, ‘boato - beato’, ‘boldo - bolso’, ‘bolha - rolha’, ‘bolso - boldo’, ‘bossa - fossa’, ‘botina - rotina’, ‘brado - irado’, ‘brando - brado’, ‘briga - brita’, ‘brilho - trilho’, ‘brita - briga’, ‘bromo - broto’, ‘broto - bromo’, ‘bula - lula’, ‘bule - bula’, ‘busto - custo’, ‘butano - tutano’, ‘cabelo - camelo’, ‘cabo - nabo’, ‘cacho - tacho’, ‘caixa - faixa’, ‘camelo - cabelo’, ‘caro - raro’, ‘casca - lasca’, ‘ceia - veia’, ‘cera - hera’, ‘cereja - cerveja’, ‘cerrado - errado’, ‘cerveja - cereja’, ‘cidade - idade’, ‘cisco - risco’, ‘coceira - coleira’, ‘coelho - joelho’, ‘coice - foice’, ‘colar - molar’, ‘coleira - moleira’, ‘copeiro - coveiro’, ‘corja - coruja’, ‘corno - morno’, ‘coruja - corja’, ‘corvo - corno’, ‘couro - touro’, ‘coveiro - copeiro’, ‘cuia - guia’, ‘cunhado - punhado’, ‘custo - busto’, ‘demente - semente’, ‘dente - rente’, ‘diante - adiante’, ‘dica - doca’, ‘diodo - iodo’, ‘doador - voador’, ‘dobrado - dourado’, ‘doca - dica’, ‘doceiro - roceiro’, ‘dois - pois’, ‘domador - doador’, ‘domo - gomo’, ‘dotado - lotado’, ‘dourado - dobrado’, ‘dublado - nublado’, ‘dueto - gueto’, ’efeito - refeito’, ’efusivo - elusivo’, ’eira - feira’, ’eixo - seixo’, ’elusivo - efusivo’, ’embolado - empolado’, ’empolado - embolado’, ’enxame - exame’, ’errado - cerrado’, ’escola - 
esmola’, ’esmola - escola’, ’exame - vexame’, ‘facada - sacada’, ‘fagulha - agulha’, ‘faixa - caixa’, ‘falta - malta’, ‘fasor - fator’, ‘fator - fasor’, ‘favela - fivela’, ‘febre - lebre’, ‘feio - seio’, ‘feira - fera’, ‘feixe - peixe’, ‘feno - seno’, ‘fera - hera’, ‘festa - fresta’, ‘feto - reto’, ‘figa - viga’, ‘fita - figa’, ‘fivela - favela’, ‘fixo - lixo’, ‘floco - foco’, ‘fluxo - luxo’, ‘focinho - mocinho’, ‘foco - toco’, ‘fogo - logo’, ‘foice - coice’, ‘folia - polia’, ‘fonte - monte’, ‘forno - morno’, ‘forte - morte’, ‘fosco - foco’, ‘fossa - bossa’, ‘freio - frevo’, ‘frente - rente’, ‘fresta - festa’, ‘frevo - trevo’, ‘friagem - triagem’, ‘fronte - frente’, ‘frota - rota’, ‘funil - fuzil’, ‘fuzil - funil’, ‘galho - ganho’, ‘ganho - galho’, ‘garoto - maroto’, ‘gaveta - gazeta’, ‘gazeta - gaveta’, ‘geada - gemada’, ‘gelo - selo’, ‘gemada - geada’, ‘gemido - temido’, ‘genro - tenro’, ‘giga - viga’, ‘goela - moela’, ‘goleiro - poleiro’, ‘gomo - domo’, ‘gongo - longo’, ‘gorro - jorro’, ‘gosto - rosto’, ‘gralha - tralha’, ‘grato - prato’, ‘gruta - truta’, ‘gueto - dueto’, ‘guia - gula’, ‘gula - lula’, ‘hera - fera’, ‘hiena - viena’, ‘horto - torto’, ‘idade - cidade’, ‘ilustre - lustre’, ‘impune - imune’, ‘imune - impune’, ‘inapto - inepto’, ‘incolor - indolor’, ‘inculto - insulto’, ‘indolor - incolor’, ‘inepto - inapto’, ‘inferno - inverno’, ‘insulto - inculto’, ‘inverno - inferno’, ‘iodo - diodo’, ‘irado - virado’, ‘isolado - solado’, ‘janela - panela’, ‘jarro - jorro’, ‘jato - tato’, ‘jeito - peito’, ‘joelho - coelho’, ‘jogo - logo’, ‘joio - jogo’, ‘jorro - jarro’, ‘jota - rota’, ‘juba - tuba’, ‘julho - junho’, ‘junho - julho’, ‘juro - ouro’, ’ladeira - madeira’, ’lama - lhama’, ’lareira - ladeira’, ’lasca - casca’, ’lastro - mastro’, ’latente - patente’, ’lavado - levado’, ’lavrado - lavado’, ’lebre - febre’, ’legado - negado’, ’leigo - meigo’, ’leito - peito’, ’lenda - tenda’, ’lenha - lenda’, ’lesado - pesado’, ’lesma - resma’, ’levado - nevado’, ’lhama - 
lama’, ’ligado - legado’, ’ligeiro - lixeiro’, ’limbo - lombo’, ’limpo - olimpo’, ’lividez - vividez’, ’lixa - rixa’, ’lixeiro - ligeiro’, ’lixo - luxo’, ’locador - tocador’, ’logo - longo’, ’loja - soja’, ’lombo - tombo’, ’lona - tona’, ’longo - logo’, ’lotado - dotado’, ’lula - gula’, ’lustre - ilustre’, ’luxo - lixo’, ‘machado - rachado’, ‘macio - macro’, ‘macro - micro’, ‘madeira - ladeira’, ‘magno - mogno’, ‘malhado - malvado’, ‘malta - falta’, ‘malvado - malhado’, ‘mangue - sangue’, ‘maroto - garoto’, ‘mastro - lastro’, ‘mato - tato’, ‘meia - veia’, ‘meigo - leigo’, ‘melado - velado’, ‘mesa - meia’, ‘miado - mimado’, ‘micro - macro’, ‘mimado - rimado’, ‘mocinho - moinho’, ‘moedor - roedor’, ‘moela - goela’, ‘mogno - morno’, ‘moinho - mocinho’, ‘molar - colar’, ‘moleira - coleira’, ‘molho - olho’, ‘monge - monte’, ‘monte - morte’, ‘morno - mogno’, ‘morse - morte’, ‘morte - morse’, ‘moto - mato’, ‘mudez - nudez’, ‘mugido - rugido’, ‘munido - zunido’, ‘murro - urro’, ’nabo - nato’, ’nato - tato’, ’navio - pavio’, ’negado - nevado’, ’nevado - negado’, ’nexo - anexo’, ’nobreza - pobreza’, ’noivo - novo’, ’nojo - novo’, ’nono - sono’, ’nora - tora’, ’nosso - vosso’, ’novo - nono’, ’nublado - dublado’, ’nudez - mudez’, ‘oceano - octano’, ‘ocioso - odioso’, ‘octano - oceano’, ‘odioso - ocioso’, ‘olho - molho’, ‘olimpo - limpo’, ‘orelha - ovelha’, ‘osso - vosso’, ‘ouro - touro’, ‘ousado - usado’, ‘ovelha - orelha’, ‘pagem - vagem’, ‘pampa - tampa’, ‘panela - janela’, ‘parado - tarado’, ‘parto - perto’, ‘passado - assado’, ‘patente - potente’, ‘pavio - navio’, ‘peito - perto’, ‘peixe - feixe’, ‘peludo - veludo’, ‘penhor - senhor’, ‘pensado - pesado’, ‘pente - rente’, ‘pequisa - pesquisa’, ‘perito - perto’, ‘perto - perito’, ‘pesado - pescado’, ‘pescado - pesado’, ‘pesquisa - pequisa’, ‘peste - pente’, ‘picado - pirado’, ‘pirado - virado’, ‘pobreza - nobreza’, ‘poeira - zoeira’, ‘poente - potente’, ‘pois - dois’, ‘poleiro - goleiro’, ‘polia - polpa’, ‘polpa - polia’, 
‘pombo - tombo’, ‘pontal - postal’, ‘porco - pouco’, ‘porque - torque’, ‘posse - tosse’, ‘postal - pontal’, ‘potente - poente’, ‘pouco - rouco’, ‘pouso - pouco’, ‘praga - praia’, ‘praia - praga’, ‘pranto - prato’, ‘prato - preto’, ‘prazo - prato’, ‘pregado - prezado’, ‘preto - reto’, ‘prezado - pregado’, ‘profeta - proveta’, ‘proveta - profeta’, ‘prumo - rumo’, ‘punhado - cunhado’, ‘punido - zunido’, ‘rabada - rajada’, ‘rachado - machado’, ‘rainha - bainha’, ‘raio - raso’, ‘raiz - raio’, ‘rajada - rabada’, ‘ralo - talo’, ‘raro - raso’, ‘raso - raro’, ‘reator - reitor’, ‘recente - repente’, ‘redator - redutor’, ‘redutor - sedutor’, ‘refeito - efeito’, ‘regente - repente’, ‘reitor - reator’, ‘rente - pente’, ‘repente - regente’, ‘resma - lesma’, ‘reto - preto’, ‘rifado - rimado’, ‘rimado - rifado’, ‘ripa - rixa’, ‘risada - visada’, ‘risco - cisco’, ‘rixa - ripa’, ‘roceiro - roteiro’, ‘rodado - rogado’, ‘roedor - moedor’, ‘rogado - rodado’, ‘rolante - volante’, ‘rolha - bolha’, ‘rolo - tolo’, ‘rombo - tombo’, ‘rosto - gosto’, ‘rota - jota’, ‘roteiro - roceiro’, ‘rotina - botina’, ‘roubo - rouco’, ‘rouco - roubo’, ‘roxo - rolo’, ‘rugido - mugido’, ‘ruivo - uivo’, ‘rumo - prumo’, ‘sacada - salada’, ‘sadio - vadio’, ‘safira - safra’, ‘safo - bafo’, ‘safra - safira’, ‘salada - sacada’, ‘sangue - mangue’, ‘sarda - sarna’, ‘sarna - sarda’, ‘sebo - seno’, ‘secto - septo’, ‘seda - seja’, ‘sedutor - redutor’, ‘seio - seno’, ‘seita - seiva’, ‘seiva - seita’, ‘seixo - seio’, ‘seja - soja’, ‘selado - velado’, ‘selo - silo’, ‘semente - somente’, ‘senhor - penhor’, ‘seno - sono’, ‘sentado - sentido’, ‘sentido - sentado’, ‘septo - secto’, ‘setor - vetor’, ‘silo - siso’, ‘silvo - silo’, ‘siso - silo’, ‘sitiado - situado’, ‘situado - sitiado’, ‘socado - sovado’, ‘sogro - soro’, ‘soja - soma’, ‘solado - sovado’, ‘soma - soja’, ‘somente - semente’, ‘sono - soro’, ‘sonso - sono’, ‘soro - sono’, ‘sovado - solado’, ‘suado - sugado’, ‘suco - sulco’, ‘sueco - sulco’, ‘sugado - suado’, ‘sujo 
- suco’, ‘sulco - sueco’, ’tacho - cacho’, ’taipa - tampa’, ’tala - vala’, ’talo - tolo’, ’tampa - taipa’, ’tanto - tento’, ’tapado - tarado’, ’tarado - tarjado’, ’tarjado - tarado’, ’tato - tatu’, ’tatu - tato’, ’tecido - temido’, ’teia - veia’, ’temido - tecido’, ’tenda - lenda’, ’tenor - tensor’, ’tenro - tento’, ’tensor - tenor’, ’tento - tenro’, ’testado - tostado’, ’tigela - tijela’, ’tijela - tigela’, ’tintura - tontura’, ’toalha - tralha’, ’tocador - locador’, ’toco - troco’, ’tolo - topo’, ’tomada - topada’, ’tombo - rombo’, ’tona - tosa’, ’tontura - tintura’, ’topada - tomada’, ’topo - tolo’, ’tora - tosa’, ’torque - porque’, ’torto - horto’, ’tosa - tora’, ’tosse - posse’, ’tostado - testado’, ’touro - ouro’, ’tralha - toalha’, ’trama - trava’, ’trava - trova’, ’treco - troco’, ’treta - truta’, ’trevo - treco’, ’triagem - friagem’, ’trilho - brilho’, ’troco - treco’, ’trova - trava’, ’trufo - trunfo’, ’trunfo - trufo’, ’truta - treta’, ’tuba - juba’, ’tucano - tutano’, ’turbo - turvo’, ’turco - turvo’, ’turvo - turco’, ’tutano - tucano’, ‘uivo - ruivo’, ‘umidade - unidade’, ‘unidade - umidade’, ‘urina - usina’, ‘urro - urso’, ‘urso - urro’, ‘usado - ousado’, ‘usina - urina’, ‘vadio - vazio’, ‘vaga - zaga’, ‘vagem - viagem’, ‘vaia - veia’, ‘vaidade - validade’, ‘vala - valsa’, ‘validade - vaidade’, ‘valsa - vala’, ‘vasto - visto’, ‘vazio - vadio’, ‘veado - velado’, ‘vedado - velado’, ‘veia - vaia’, ‘velado - veludo’, ‘veludo - velado’, ‘vetor - setor’, ‘vexame - exame’, ‘viagem - virgem’, ‘vibrado - virado’, ‘videira - viseira’, ‘viela - vitela’, ‘viena - viela’, ‘viga - vigia’, ‘vigia - viga’, ‘virado - vibrado’, ‘virgem - viagem’, ‘visada - risada’, ‘viseira - videira’, ‘visto - xisto’, ‘vitela - viela’, ‘vividez - lividez’, ‘voador - doador’, ‘volante - votante’, ‘vosso - osso’, ‘votante - volante’, ‘xisto - visto’, ‘zaga - vaga’, ‘zoeira - poeira’, ‘zunido - punido’]

    Additionally, among the remaining words there are many that are negative or offensive, such as defunto.
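
    For the duplicate report in item 2, a short Python sketch along these lines would list each repeated word with its 1-based positions. The function name and the sample list are made up for illustration and are not taken from #720.

```python
from collections import defaultdict


def duplicate_positions(words):
    """Map each repeated word to the 1-based positions where it occurs."""
    positions = defaultdict(list)
    for lineno, w in enumerate(words, start=1):
        positions[w].append(lineno)
    # Keep only words that occur more than once.
    return {w: p for w, p in positions.items() if len(p) > 1}


# Illustrative sample only.
sample = ["amora", "ampola", "amplo", "ampola"]
print(duplicate_positions(sample))  # {'ampola': [2, 4]}
```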

  5. ninjastic force-pushed on Sep 27, 2020
  6. ninjastic commented at 0:48 am on September 27, 2020: contributor
    Just squashed all the 151 commits into a single one. Also added @brenorb as a co-author.
  7. sabotag3x commented at 0:54 am on September 27, 2020: contributor
  8. Create portuguese.txt
    Co-authored-by: Breno Rodrigues Brito <brenorb@gmail.com>
    Co-authored-by: ninjastic <ninjasticdev@protonmail.com>
    Co-authored-by: sabotag3x <sabotage.sta@gmail.com>
    Co-authored-by: bitmover <67111541+bitmover-studio@users.noreply.github.com>
    Co-authored-by: alegotardo <40860228+alegotardo@users.noreply.github.com>
    Co-authored-by: kuthullu <kuthullu@gmail.com>
    Co-authored-by: Trimegistus <trimegisto@rocketmail.com>
    d353c54154
  9. ninjastic force-pushed on Sep 28, 2020
  10. luke-jr added the label Proposed BIP modification on Oct 5, 2020
  11. sabotag3x commented at 11:08 am on October 18, 2020: contributor

    @slush0 @prusnak @voisine @ebfull

    So, I know that you may only care about the English list and that’s why no new wordlists have been accepted in recent years.

    However, BIP-0039 was created to help users restore their wallets, as it’s easier to write down 12 words than 64 random characters. (Well, you know that better than I do, since you are the authors.)

    I’ll use your own words: “a group of easy to remember words”

    English words aren’t easy to remember for non-English speakers, just as Portuguese words may not be easy for you. In addition, a foreign language is more likely to cause typos and, at worst, make people lose their BTC.

    More than 250 million people speak Portuguese. It’s one of the most widely spoken languages in the world and the native language in Brazil, Portugal, Angola, Mozambique and other smaller countries. Moreover, few of them speak English.

    My point is that the BIP-0039 method should be easier for non-English speakers as well.

    PS: Let me know if you need more Portuguese speakers to review the wordlist before accepting it.

  12. p2w34 commented at 6:32 am on October 23, 2020: none

    My comment may sound harsh to both the list creators and the maintainers of BIP0039, but I am going to make it nevertheless. By the list creators, I mean not only those of the Portuguese list but of all recently created lists. This applies to me too, as I made this mistake as well.

    Before beginning your work, did you ask any of the BIP0039 maintainers whether there is a chance that your work will be merged? Especially given the many unmerged word list proposals for other languages? By looking at the closed PRs one can see exactly when the last PR with a word list was merged, and it should be discouraging. However, the word list creators seem to ignore this fact and then try to somehow push their lists through.

    A massive amount of time is wasted on all those PRs that hang forever. I would be really glad to see a clear direction set here. Anything would be better than the current situation.

    If new word lists are never going to be accepted, then I would expect the maintainers to state that clearly. The ones already merged would be the official BIP0039 word lists, or this could be limited to just the English list. Does that solve the problem? No, because the word lists are certainly needed. I do not know how to proceed from this point (most likely a new BIP, discussion on the mailing list, etc.), but at least people would not waste time!

  13. fortesp commented at 5:46 pm on November 4, 2020: none
    Why is this not merged yet?
  14. bitmover-studio cross-referenced this on Nov 18, 2020 from issue Adding Polish wordlist to BIP39 by KarolTrzeszczkowski
  15. mateusnds commented at 11:41 am on December 2, 2020: none
    Hey, will this list be merged?
  16. fortesp commented at 8:19 am on December 17, 2020: none
    Sorry to say, but I am not sure at this point who or what exactly the Bitcoin “community” is when we have pull requests such as this one waiting to be merged for months. Some pull requests have even been waiting for years. This is a case where one has to ask: who is the “boss” of this project? Because a community does not seem to exist here.
  17. sipa commented at 8:03 pm on December 17, 2020: member
    @fortesp BIPs are a mechanism for publishing ideas/proposals. Accepting changes to those proposals is the BIP authors’ responsibility. If they don’t like a particular change, you’re always welcome to publish your own competing proposal.
  18. brenorb commented at 8:52 pm on December 17, 2020: contributor
    I’m not sure we have a large community of Portuguese-speaking people who also speak English and use GitHub. However, isn’t that one more strong argument in favor of having a Portuguese BIP39 wordlist? I’m not really sure what’s missing for this BIP to be merged.
  19. sipa commented at 8:53 pm on December 17, 2020: member
    @brenorb Agreement from the BIP authors is the only thing that matters.
  20. fortesp commented at 9:02 am on December 18, 2020: none
    @sipa Agreed. But there is no feedback from any of the authors that I can see. I’m not sure what is missing or non-compliant; I did not check it myself, to be honest.
  21. brenorb commented at 3:21 pm on December 20, 2020: contributor
    @sipa Ok, I’m one of the BIP authors and I’m pretty sure we all agree on it. What is the next step we need to take in practice? Is there a specific button to click? Can you show us the step-by-step process?
  22. prusnak commented at 3:34 pm on December 20, 2020: contributor

    Ok, I’m one of the BIP authors and I’m pretty sure we all agree on it.

    You are not a BIP39 author. I am (one of them).

    Let’s get this merged in.

    Edit: ACK

  23. brenorb commented at 3:42 pm on December 20, 2020: contributor
    @prusnak Sorry for the mistake. I’m one of the authors of this proposal.
  24. luke-jr commented at 7:01 pm on December 20, 2020: member

    Let’s get this merged in.

    @prusnak Interpreting that as an ACK; let me know if I should revert

  25. luke-jr merged this on Dec 20, 2020
  26. luke-jr closed this on Dec 20, 2020

  27. brenorb cross-referenced this on Jun 11, 2021 from issue Add Portuguese wordlist by brenorb
  28. johnnyasantoss cross-referenced this on Jun 11, 2021 from issue Wordlist Portuguese Brazil BIP39 by alissonsolitto
  29. johnnyasantoss cross-referenced this on Jun 15, 2021 from issue Fix the portuguese word list to use the correct and accepted version by johnnyasantoss
  30. crypto-punk referenced this in commit 7553058670 on Sep 20, 2022
  31. scottwad approved

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 09:10 UTC
