refactor: add benchmarks to bech32::Encode/Decode #13586

kallewoof commented at 7:06 AM on July 2, 2018: member

This PR adds benchmarks to Encode()/Decode().

The benchmark commit is duplicated in #13632.

fanquake added the label Tests on Jul 2, 2018

kallewoof force-pushed on Jul 2, 2018

in src/bech32.cpp:153 in d4e8281d01 outdated

 147 | @@ -147,11 +148,15 @@ namespace bech32
 148 |  std::string Encode(const std::string& hrp, const data& values) {
 149 |      data checksum = CreateChecksum(hrp, values);
 150 |      data combined = Cat(values, checksum);
 151 | -    std::string ret = hrp + '1';
 152 | -    ret.reserve(ret.size() + combined.size());
 153 | +    size_t hrp_size = hrp.size();
 154 | +    char ret[hrp_size + combined.size() + 2];
 155 | +    memcpy(ret, hrp.c_str(), hrp_size);

Empact commented at 7:32 AM on July 2, 2018:

Is there slow-down associated with inlining hrp_size? Seems the compiler would optimize that.

kallewoof commented at 7:42 AM on July 2, 2018:

I'll check. I didn't like repeatedly calling hrp.size() but it may be optimized away, as you say.

kallewoof commented at 7:54 AM on July 2, 2018:

I don't think the difference warrants the extra line, so I removed it. Thanks for the nudge. :)

in src/bech32.cpp:157 in d4e8281d01 outdated

 155 | +    memcpy(ret, hrp.c_str(), hrp_size);
 156 | +    char* ptr = ret + hrp_size;
 157 | +    *(ptr++) = '1';
 158 |      for (auto c : combined) {
 159 | -        ret += CHARSET[c];
 160 | +        *(ptr++) = CHARSET[c];

Empact commented at 7:34 AM on July 2, 2018:

How about switching to ++ptr?

kallewoof commented at 7:42 AM on July 2, 2018:

Why? (And that would change the behavior to setting the next value rather than the current and iterating.)

Empact commented at 8:04 AM on July 2, 2018:

++i is preferred over i++. https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md

You can make the behavior fit by shifting the increment to the later operation in each case.

kallewoof commented at 8:35 AM on July 2, 2018:

In this case, I think $(ptr++) is a better alternative, as it is more intuitive. Update the value and then move to the next position. Doing ++i here semantically means the pointer points to the last byte that was set, rather than the next available byte.

Empact commented at 4:58 PM on July 2, 2018:

The nice thing about ++ptr, and the reason it is preferred AFAIK, is that it does not require a copy of the value in order to operate - you always operate using the value directly because it is holding the current value rather than the next one.

sipa commented at 5:34 PM on July 2, 2018:

In theory, ++x is faster than x++ for the reason you mention. However, in practice, for simple types the optimizer will rearrange things to avoid the copy. Even for more complex types this is often the case.

Furthermore, just because a construct is preferred doesn't mean you need to go out of your way to use over an alternative. If there are good reasons for x++, like here, you can use it.

kallewoof force-pushed on Jul 2, 2018

kallewoof commented at 11:26 AM on July 2, 2018: member

@fanquake: This isn't really tests, it's an optimization of bech32::Encode(). It does include benchmarks to show that the optimization is useful though.

fanquake added the label Refactoring on Jul 2, 2018

fanquake removed the label Tests on Jul 2, 2018

Add simple bech32 benchmarks 189cf35f3e

kallewoof force-pushed on Jul 3, 2018

in src/bech32.cpp:155 in d06b0f2a53 outdated

 157 |      for (auto c : combined) {
 158 | -        ret += CHARSET[c];
 159 | +        *(ptr++) = CHARSET[c];
 160 |      }
 161 | +    *ptr = 0;
 162 |      return ret;

sipa commented at 6:42 PM on July 3, 2018:

Even faster: return std::string(ret, ptr); (no need for the string constructor to find the string length again, you already know).

kallewoof commented at 3:23 AM on July 4, 2018:

Nice!

kallewoof commented at 3:31 AM on July 4, 2018:

Damn.

Bech32Encode, 5, 800000, 3.79493, 9.32891e-07, 9.75863e-07, 9.47069e-07

kallewoof commented at 5:54 AM on July 4, 2018:

Actually, the vanity generator did not see an increase in speed so this may have been a fluke. It didn't make it worse though so keeping.

laanwj commented at 2:00 PM on July 10, 2018:

Too bad it didn't help, but I prefer the avoidance of zero-terminated strings in any case.

kallewoof force-pushed on Jul 4, 2018

DrahtBot commented at 7:26 AM on July 6, 2018: contributor

Note to reviewers: This pull request conflicts with the following ones:

#13249 (Make objects in range declarations immutable by default. Avoid unnecessary copying of objects in range declarations. by practicalswift)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

in src/bech32.cpp:152 in 99c07d97bf outdated

 147 | @@ -147,12 +148,14 @@ namespace bech32
 148 |  std::string Encode(const std::string& hrp, const data& values) {
 149 |      data checksum = CreateChecksum(hrp, values);
 150 |      data combined = Cat(values, checksum);
 151 | -    std::string ret = hrp + '1';
 152 | -    ret.reserve(ret.size() + combined.size());
 153 | +    char ret[91];
 154 | +    memcpy(ret, hrp.c_str(), hrp.size());

laanwj commented at 2:03 PM on July 10, 2018:

please use .data() here (it goes more naturally with .size())

laanwj commented at 2:06 PM on July 10, 2018:

This really needs a length check. Fixing the length at 91 but still copying the entire input (and writing after it) is a buffer overflow waiting to happen, otherwise.

gmaxwell commented at 3:51 PM on July 10, 2018:

Came here to say the same regarding a length check. It's too brittle otherwise.

kallewoof commented at 1:56 AM on July 11, 2018:

I knew you'd say that. Added length check and updated benchmark results in OP. Still improvement.

laanwj commented at 2:13 PM on July 10, 2018: member

I'm a bit scared of this due to the potential security issue introduced. Is it really worth it?

kallewoof force-pushed on Jul 11, 2018

kallewoof commented at 1:56 AM on July 11, 2018: member

@laanwj If you mean the unchecked length, I added an assertion in e35eed5.

laanwj commented at 6:58 AM on July 11, 2018: member

Sure, yes. Though an assertion is not a replacement for proper error handing, IMO. This still means the whole process crashes if a string that is too long is passed in, and makes the security dependent on assertions being enabled (this might be true in other place too though :( ).

kallewoof commented at 7:08 AM on July 11, 2018: member

I am not convinced this is necessary myself anymore, so I am closing. If someone feels like it's a worthy endeavour go nuts.

kallewoof closed this on Jul 11, 2018

kallewoof deleted the branch on Jul 11, 2018

Empact cross-referenced this on Jul 11, 2018 from issue refactor: Optimize bech32::Encode by Empact

laanwj commented at 9:19 AM on July 11, 2018: member

It still makes sense to add the benchmark, I think!

kallewoof restored the branch on Jul 11, 2018

kallewoof renamed this:
~~refactor: add benchmarks and speed up bech32 encoder~~
refactor: add benchmarks to bech32::Encode/Decode
on Jul 11, 2018

kallewoof reopened this on Jul 11, 2018

kallewoof force-pushed on Jul 11, 2018

kallewoof commented at 9:27 AM on July 11, 2018: member

Dropped the last commit and keeping the bench part only.

laanwj commented at 9:35 AM on July 11, 2018: member

utACK 189cf35f3e6d2cc9ed08eb23dd0ea36be28b6c11

Empact commented at 9:36 AM on July 11, 2018: member

tACK 189cf35

laanwj merged this on Jul 11, 2018

laanwj closed this on Jul 11, 2018

laanwj referenced this in commit acc68bc631 on Jul 11, 2018

kallewoof deleted the branch on Jul 12, 2018

Bushstar cross-referenced this on Jul 13, 2018 from issue commits from bitcoin/master by Bushstar

PastaPastaPasta referenced this in commit 357164dcb7 on Jun 27, 2021

PastaPastaPasta referenced this in commit 5937d23ae6 on Jun 28, 2021

PastaPastaPasta referenced this in commit c70ef4cbd4 on Jun 28, 2021

PastaPastaPasta referenced this in commit 3c0bda8e15 on Jun 28, 2021

PastaPastaPasta referenced this in commit c7c810772d on Jun 28, 2021

PastaPastaPasta referenced this in commit 097bde6377 on Jun 29, 2021

bitcoin locked this on Sep 8, 2021