junderw commented at 2:26 AM on August 10, 2017: contributor

Alternative to #544

Made on behalf of @Annyonghaseyo, a Korean native who lives in Japan and regularly attends the Tokyo Bitcoin Meetup. She has also translated many Korean bitcoin apps.

I don't have much to add beyond my limited knowledge of Korean, which is based on its similarities to Japanese. She shared the concern, also raised by others and by me, about using -hada verbs in the list.

This list is NFKD-normalized and sorted by UTF-8 byte order to allow binary search.
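
As an illustration of why the ordering matters, here is a minimal sketch of a byte-order sort and binary search. The function names `build_sorted_keys` and `word_index` are hypothetical, and the four sample words merely stand in for the full 2048-word list.

```python
import unicodedata
from bisect import bisect_left


def build_sorted_keys(words):
    """NFKD-normalize each word and sort by its UTF-8 byte sequence."""
    keys = [unicodedata.normalize("NFKD", w).encode("utf-8") for w in words]
    return sorted(keys)  # bytes objects compare byte by byte


def word_index(sorted_keys, word):
    """Binary-search for a word; return its index, or -1 if it is absent."""
    key = unicodedata.normalize("NFKD", word).encode("utf-8")
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return -1


# Tiny stand-in list (assumption: a real wordlist has 2048 entries).
keys = build_sorted_keys(["행복", "고양이", "가족", "가격"])
print(word_index(keys, "가족"))
```

Because the lookup normalizes its argument first, it works whether the caller passes composed or decomposed Hangul.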

Add Korean wordlist 2880981dc3

gnujoow commented at 10:05 AM on August 10, 2017: none

nice work @junderw @Annyonghaseyo

None of the words on the list are difficult. They are easy to remember and follow both the rules mentioned by @Annyonghaseyo and BIP39.

this is much better than #544

voisine commented at 4:19 AM on August 15, 2017: contributor

This list looks good. Non-Korean speaking ACK.

junderw commented at 4:57 AM on August 15, 2017: contributor

@luke-jr Looks like we have an ACK from BIP author @voisine, and Korean natives @Annyonghaseyo and @gnujoow both see this wordlist as sufficient. Please merge if everything seems in order.

Thanks everyone.

junderw commented at 5:40 AM on August 15, 2017: contributor

Just generated 10 phrases using bitcore-mnemonic.

Can't understand it, but looks good.

역시 삼십 하드웨어 보통 세월 고궁 시골 제한 한마디 장미 청춘 관념
암컷 영국 같이 토마토 법률 탤런트 고전 마음 갈색 공업 점점 공동
예선 고구려 목숨 흥분 모퉁이 단골 시아버지 바탕 울음 사슴 의학 원인
신비 명예 내일 모양 교직 연출 전문 팝송 부문 부탁 사나이 창고
정비 왕자 태권도 만화 번개 전체 중학교 막걸리 입사 서명 승용차 전시
탤런트 책가방 형부 주택 퇴근 강아지 인공 사냥 어려움 극히 엔진 기원
무릎 테니스 전혀 알루미늄 간섭 즉석 칠십 밥솥 예전 엔진 줄거리 임무
행복 도저히 주말 벌레 공부 백화점 혹시 추천 교과서 자연 별도 법률
촛불 의논 단추 지진 고양이 습기 가뭄 시아버지 전통 정성 매장 가족
진짜 홍수 상자 강도 유적 지리산 막걸리 닭고기 햄버거 사슴 클래식 진료

gnujoow commented at 8:04 AM on August 15, 2017: none

@junderw It looks good to a Korean reader as well :)

luke-jr merged this on Aug 15, 2017

luke-jr closed this on Aug 15, 2017

ghost commented at 6:01 AM on January 9, 2018: none

ㅎㅏㄴㄱㅡㄹ, I mean, 한글.

How should one treat un-normalized Hangeul input (e.g. 가격, U+AC00 U+ACA9), which is more commonly used than the normalized form?

Should one normalize it into NFKD (Normalization Form Compatibility Decomposition; e.g. 가격 becomes U+1100 U+1161 U+1100 U+1167 U+11A8) before processing, or refuse to accept it?
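
The two code point sequences in the question can be reproduced with Python's standard `unicodedata` module. This is only a small sketch; `code_points` is a hypothetical helper name.

```python
import unicodedata


def code_points(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


composed = "\uAC00\uACA9"  # the two precomposed syllables of 가격
decomposed = unicodedata.normalize("NFKD", composed)

print(code_points(composed))    # U+AC00 U+ACA9
print(code_points(decomposed))  # U+1100 U+1161 U+1100 U+1167 U+11A8
```

Normalizing an already decomposed string returns it unchanged, so an app can apply NFKD to every input without first checking its form.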

junderw commented at 6:45 AM on January 9, 2018: contributor

@wlzla000

1. The user enters a string.
2. Your app NFKD-normalizes the string.
3. Your app splits the string on the ASCII space ' ' (0x20) into an array.
4. Your app checks whether every word is contained in a single wordlist that you support.
5. If so, your app converts the index of each word into binary to reassemble the data and verifies the checksum. If the checksum fails, throw a failure. If the checksum passes, run the NFKD-normalized phrase from step 2 through the PBKDF2 rounds.
6. If not, ask your user: "We could not verify the validity of this phrase. Would you like to use it anyway?" If they say no, throw a failure. If they say yes, run the NFKD-normalized string from step 2 through the PBKDF2 rounds.
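
The steps above can be sketched roughly as follows. This is an illustration rather than a reference implementation: `phrase_to_seed`, `checksum_ok`, and the `use_unverified` flag (standing in for the user's answer to the prompt) are hypothetical names, and each wordlist is assumed to be a list of 2048 NFKD-normalized words in list order. The checksum rule and PBKDF2 parameters (HMAC-SHA512, salt "mnemonic" plus passphrase, 2048 rounds, 64-byte output) follow BIP39.

```python
import hashlib
import unicodedata

VALID_WORD_COUNTS = (12, 15, 18, 21, 24)


def nfkd(text):
    return unicodedata.normalize("NFKD", text)


def checksum_ok(indices):
    """Check the BIP39 checksum for a list of 11-bit word indices."""
    if len(indices) not in VALID_WORD_COUNTS:
        return False
    bits = "".join(format(i, "011b") for i in indices)
    checksum_len = len(bits) // 33  # one checksum bit per 32 entropy bits
    entropy_bits, checksum_bits = bits[:-checksum_len], bits[-checksum_len:]
    entropy = int(entropy_bits, 2).to_bytes(len(entropy_bits) // 8, "big")
    digest_bits = format(hashlib.sha256(entropy).digest()[0], "08b")
    return checksum_bits == digest_bits[:checksum_len]


def phrase_to_seed(user_input, wordlists, passphrase="", use_unverified=False):
    phrase = nfkd(user_input)   # step 2: normalize on the backend
    words = phrase.split(" ")   # step 3: split on ASCII space
    for wordlist in wordlists:  # step 4: all words in one supported list?
        index = {word: i for i, word in enumerate(wordlist)}
        if all(word in index for word in words):
            # step 5: words are known, so the checksum must pass
            if not checksum_ok([index[word] for word in words]):
                raise ValueError("checksum failed")
            break
    else:
        # step 6: unknown words, proceed only if the user agreed
        if not use_unverified:
            raise ValueError("phrase could not be verified")
    salt = ("mnemonic" + nfkd(passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac(
        "sha512", phrase.encode("utf-8"), salt, 2048, dklen=64
    )
```

In a real app the `use_unverified` flag would be set from the user's response to the confirmation dialog rather than passed in up front.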

Since you will NFKD normalize the user's input, 한글 will become ㅎㅏㄴㄱㅡㄹ automatically.

That's the whole point of NFKD normalization. You do it on the backend. The user only interacts with the frontend.

iancoleman commented at 12:26 AM on March 12, 2018: contributor

Should Korean mnemonics be displayed with ideographic spaces (as per Japanese) or ASCII spaces (as per Chinese)?

There is no mention of this in the bip39 wordlists document.

Any other Korean language considerations that should be added to that document?

junderw commented at 1:02 AM on March 12, 2018: contributor

Should Korean mnemonics be displayed with ideographic spaces (as per Japanese) or ASCII spaces (as per Chinese)?

If nothing is specified, ASCII.

If you google any Korean website at all, they use ASCII spaces in their writing.

Japanese and Chinese don't use any spaces at all normally.
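
A small sketch of how an app might handle both separators. The names `display_phrase` and `split_input` and the language keys are hypothetical. The sketch relies on the fact that NFKD maps the ideographic space (U+3000) to the ASCII space (U+0020), so input typed with either separator splits the same way after normalization.

```python
import unicodedata

IDEOGRAPHIC_SPACE = "\u3000"


def display_phrase(words, language):
    """Join words for display: ideographic space for Japanese, ASCII otherwise."""
    separator = IDEOGRAPHIC_SPACE if language == "japanese" else " "
    return separator.join(words)


def split_input(user_input):
    """NFKD-normalize first, then split on ASCII space."""
    return unicodedata.normalize("NFKD", user_input).split(" ")


print(display_phrase(["역시", "삼십"], "korean"))
print(len(split_input("역시\u3000삼십")))
```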

iancoleman referenced this in commit 01be853e4b on May 7, 2018

Add Korean Wordlist #570
