This change avoids filling the b58 buffer with zeros on every pass. The improvement is about 30% to 45%.
For example, calling listunspent on my wallet spends 313ms in EncodeBase58, whereas before it was 578ms.
74 | zeroes++;
75 | }
76 | // Allocate enough space in big-endian base58 representation.
77 | - std::vector<unsigned char> b58((pend - pbegin) * 138 / 100 + 1); // log(256) / log(58), rounded up.
78 | + int size = (pend - pbegin) * 138 / 100 + 1;
79 | + std::vector<unsigned char> b58(size); // log(256) / log(58), rounded up.
Nit: Shouldn't this comment go up one line?
:+1:
> For example, calling listunspent on my wallet spends 313ms in EncodeBase58, whereas before it was 578ms.
Interesting result. I was not aware anything was bottlenecked by base58 encoding. As you've demonstrated a concrete performance improvement: concept ACK
Indeed, thanks for the concrete result. Concept ack.
Untested ACK.
This should result in an asymptotic 2x speedup. I didn't expect that it would matter anywhere, but as it seems it does, great.
88 | *it = carry % 58;
89 | carry /= 58;
90 | }
91 | +
92 | assert(carry == 0);
93 | + length = i;
Do you really need both i and length?
It seems you could simply do ++length; here. I see your i < length condition, but I don't see how it could ever be true.
It isn't simply ++length; it can be length += 0, 1, or 2. But it is possible to remove i:
length = it - b58.begin()
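For context, here is a self-contained sketch of how the optimized loop fits together (a hypothetical standalone function, not the PR's exact code; `EncodeBase58Sketch` is an invented name):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical standalone sketch (not the PR's exact code): track how many
// base58 digits are live in "length" so each pass only touches those,
// instead of sweeping the whole zero-initialized buffer every time.
std::string EncodeBase58Sketch(const std::vector<unsigned char>& input)
{
    static const char* pszBase58 =
        "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
    // Leading zero bytes map directly to '1' characters.
    size_t zeroes = 0;
    while (zeroes < input.size() && input[zeroes] == 0)
        zeroes++;
    // log(256) / log(58), rounded up.
    std::vector<unsigned char> b58((input.size() - zeroes) * 138 / 100 + 1);
    int length = 0;
    for (size_t n = zeroes; n < input.size(); n++) {
        int carry = input[n];
        int i = 0;
        // Apply "b58 = b58 * 256 + ch" over the live digits only.
        for (auto it = b58.rbegin();
             (carry != 0 || i < length) && it != b58.rend(); ++it, ++i) {
            carry += 256 * (*it);
            *it = carry % 58;
            carry /= 58;
        }
        assert(carry == 0);
        length = i; // can grow by 0, 1, or 2 per input byte
    }
    // Skip unused high digits, then translate to the base58 alphabet.
    std::string str(zeroes, '1');
    for (auto it = b58.begin() + (b58.size() - length); it != b58.end(); ++it)
        str += pszBase58[*it];
    return str;
}
```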
mhmm, I need to read this more deeply for an utACK, I think. I retire my utACK but maintain the concept ACK.
Very nice, utACK besides nit.
utACK 3252208
81 | while (pbegin != pend) {
82 | int carry = *pbegin;
83 | + int i = 0;
84 | // Apply "b58 = b58 * 256 + ch".
85 | - for (std::vector<unsigned char>::reverse_iterator it = b58.rbegin(); it != b58.rend(); it++) {
86 | + for (std::vector<unsigned char>::reverse_iterator it = b58.rbegin(); (carry != 0 || i < length) && (it != b58.rend()); it++, i++) {
Wouldn't ++it be faster than it++?
Presumably, yes: http://stackoverflow.com/a/35085/2084795
for iterators that may well be the case (prefix increment saves a copy)
This depends on the compiler and target machine anyway, but it is my understanding that on x86 there are separate instructions with different performance (how big the difference is, I have no idea). I also suspect that many compilers are smart enough to do this for you. So since it may do nothing, but may do something good, why not? Of course, this is not to say we should change it everywhere. But in new code, why not? It may be useful on some platforms. The cons from Stack Overflow seem very weak, but I'm curious whether anybody has more solid concerns, or benchmarks showing this is not really something to think much about (or data showing that, yes, compilers are already smart about this too). This is a recurring discussion that I would like to stop thinking about one way or the other.
> This depends on the compiler and target machine anyway, but it is my understanding that on x86 there are separate instructions with different performance
For integers, the compiler is certainly smart enough that there is no difference between prefix and postfix ++ if bare (the result is not actually used).
But this doesn't have much to do with the instructions, just with language design. Iterators are objects. When overloading them, the definitions of prefix and postfix ++ are, respectively:
Point& operator++(); // Prefix increment operator.
Point operator++(int); // Postfix increment operator.
So: postfix operation returns a copy (the old value), whereas prefix increments in place and returns a reference (to self). This means prefix can, in principle, be implemented more efficiently.
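A quick way to make the copy observable (an illustrative toy, not benchmark-grade; `CountingIter` and `CountIncrementCopies` are invented names):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy iterator wrapper that counts copy-constructions, to make the
// prefix/postfix ++ difference observable.
struct CountingIter {
    std::vector<int>::iterator it;
    int* copies;
    CountingIter(std::vector<int>::iterator i, int* c) : it(i), copies(c) {}
    CountingIter(const CountingIter& o) : it(o.it), copies(o.copies) { ++*copies; }
    CountingIter& operator++() { ++it; return *this; }  // prefix: in place, no copy
    CountingIter operator++(int)                        // postfix: must copy old value
    {
        CountingIter tmp(*this);
        ++it;
        return tmp;
    }
};

// Returns the copy counts observed after one prefix and one postfix increment.
std::pair<int, int> CountIncrementCopies()
{
    std::vector<int> v{1, 2, 3};
    int copies = 0;
    CountingIter p(v.begin(), &copies);
    ++p;
    int after_prefix = copies;  // expected: 0
    p++;
    int after_postfix = copies; // expected: at least 1 (the returned old value)
    return {after_prefix, after_postfix};
}
```

Note the postfix count is "at least 1" rather than exactly 1, since whether the return itself adds a second copy depends on the compiler's return-value optimization.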
Maybe change it++ to ++it while we're optimising... (it++ creates an unnecessary copy of the iterator)
https://github.com/cryptocoinjs/base-x/blob/d33156e62ea435073e4b73640f433756124f89d8/src/basex.cc#L51
Another base58 encoding implementation (it was an attempt to speed up base58 for Node.js).
@promag if you apply .reserve to digits I think it can be even faster than it is now.
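The suggestion might look something like this (a hypothetical C++ sketch along the lines of the linked basex.cc, with invented names; the linked file itself may differ):

```cpp
#include <cstddef>
#include <vector>

// Sketch (not the linked code): a grow-on-demand digits vector. Reserving
// the worst-case size up front means push_back never reallocates.
std::vector<unsigned char> ToBase58Digits(const std::vector<unsigned char>& input)
{
    std::vector<unsigned char> digits;
    digits.reserve(input.size() * 138 / 100 + 1); // log(256) / log(58), rounded up
    for (unsigned char ch : input) {
        int carry = ch;
        // Apply "digits = digits * 256 + ch" over the existing digits.
        for (size_t j = 0; j < digits.size(); ++j) {
            carry += 256 * digits[j];
            digits[j] = carry % 58;
            carry /= 58;
        }
        // Append any new high digits instead of pre-zeroing a full buffer.
        while (carry > 0) {
            digits.push_back(carry % 58);
            carry /= 58;
        }
    }
    return digits; // little-endian: digits[0] is the least significant digit
}
```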
87 | carry += 256 * (*it);
88 | *it = carry % 58;
89 | carry /= 58;
90 | }
91 | +
92 | assert(carry == 0);
This could be transformed into an assert(it != b58.rend()); within the loop.
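A sketch of what that transformation might look like (a hypothetical helper for a single "multiply by 256 and add ch" pass; `MulAddStep` is an invented name):

```cpp
#include <cassert>
#include <vector>

// Hypothetical reworking of the inner loop with the suggested in-loop
// assert: an undersized buffer is caught before any out-of-range
// dereference, rather than by assert(carry == 0) after the loop.
int MulAddStep(std::vector<unsigned char>& b58, int length, int ch)
{
    int carry = ch;
    int i = 0;
    for (auto it = b58.rbegin(); carry != 0 || i < length; ++it, ++i) {
        assert(it != b58.rend()); // replaces the post-loop assert(carry == 0)
        carry += 256 * (*it);
        *it = carry % 58;
        carry /= 58;
    }
    return i; // the new length
}
```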
Concept ACK and (in-depth) utACK 3252208cb10be645bae415c90fb2ed8217838490