Bignum2 #1825

roques commented at 5:56 PM on September 15, 2012: contributor

`CBigNum::SetCompact()` and `CBigNum::GetCompact()`

Because my previous pull-request #1823 which changed the semantics of SetCompact() and GetCompact() in an effectively backwards-compatible way was met with understandable skepticism I've prepared this branch, which provides a bit for bit compatible re-implementation and unit tests.

Please, pull at least the first commit of either of the branches. They only contain unit-tests making sure the semantic of GetCompact() and SetCompact() is not changed unintentionally.

If you agree that the semantics should not be changed, you should still pull the second commit of this branch if you agree that my re-implementation is easier to understand, more efficient, or both.

tests for SetCompact and GetCompact of CBigNum 6f0cecfc47

BitcoinPullTester commented at 10:22 PM on September 15, 2012: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/34b6e1c052024a0c80069ef1a7ce63764a496007 for binaries and test log.

xanatos commented at 8:22 AM on September 17, 2012: none

I haven't benchmarked it, but just to be sure, would it be possible to make a one-shot program to compare all the 4 billion possible compact value SetCompactOld() == SetCompactNew() and from the SetCompact() result GetCompactOld() == GetCompactNew()? (with possible I mean that I don't know how much fast are GetCompact and SetCompact. Being one shot I would find acceptable to take 24h, so it would need to do 50k compares/second.

xanatos commented at 9:44 AM on September 17, 2012: none

And if you want to document the compact format (something very useful I think), please write in the "big comment" the endianness of the mantissa and if ABCD are the 4 bytes of the compact number, which one is the exponent (A?) and which one is the mantissa (BCD in ?endian). As it is I think the format seems still "black magic".

I'll add that you haven't really removed the internals of bignumber from the implementation: BN_num_bytes is based on the internal representation of BN. To make an example, BN_num_bytes of the 0 number is 0, while the .NET BigInteger serializes the 0 as a single { 0 } byte, and BN_num_bytes gives the length of the "internal" buffer of the BN, with the sign that is saved on another field, so BN_num_bytes(0xFF) == BN_num_bytes(-0xFF) == BN_num_bytes(0x7F) == BN_num_bytes(0x-7F) == 1.

reimplement CBigNum's compact encoding of difficulty targets

Use shifts instead of going through the MPI representation of BIGNUMs.
Be careful to keep the meaning of 0x00800000 as sign bit.

48a10a3780

roques commented at 2:30 PM on September 17, 2012: contributor

@xanatos: You're raising good questions and I've thus pushed an updated "big comment".

On testing the correctness of the implementation: I've tested the equivalence of the old and new implementations for a reasonably big subset of all possible compact numbers. While doing so I've benchmarked the new against the old implementation: For sizes relevant to bitcoin the new implementation is ~80% faster. For bigger sizes the speedup approaches ~900% due to the big intermediate value used by the old implementation. — Compact numbers are such a small part of bitcoin, their speed is almost irrelevant; I'm hoping the new implementation is easier to understand, the speedup is just a nice side effect.

About endianness: The "compact" representation is not a serialization into bytes, but a mapping into and out of an unsigned 32 bit number. The endianness (= order of bytes when mapping a number to a sequence of bytes) is thus irrelevant. However, bitcoin serializes CBlock::nBits like any other unsigned long.

The semantics of BN_num_bytes(n) is well documented and part of the interface of OpenSSL. It is not the number of bytes used in a serialization of n, but the position of the most significant non-zero base256 digit of n. It actually is implemented as (BN_num_bits(n)+7)/8. (BN_num_bits(10) = 4 because decimal 10 = binary 1010)

BitcoinPullTester commented at 7:05 PM on September 17, 2012: none

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/48a10a37804a01b2533aea920567867deedc6b9e for binaries and test log.

laanwj commented at 3:35 PM on September 21, 2012: member

I really like that this defines the number formats without relying on OpenSSL internals.

I've checked all 32 bit numbers for SetCompact, comparing old and new implementations, with the following mismatches:

0180xxxx returns -0 (old) versus 0 (new)
80800000 returns -0 (old) versus 0 (new)
028000xx returns -0 (old) versus 0 (new)
xx800000 returns -0 (old) versus 0 (new)

I'm not sure whether the distinction between +0/-0 is interesting, but there were no other differences.

If you want to verify for yourself, the test code is here: https://gist.github.com/3762207 . It takes about 20 minutes to run here.

I've also checked GetCompactNew(SetCompactOld(x)) == GetCompactOld(SetCompactOld(x)) using the same code. There I only find issues in handling signed/unsigned zero as well, for example:

SetCompactOld(01800000) produces GetCompactOld 0x00000000 and GetCompactNew 0x00800000

roques commented at 2:30 PM on September 25, 2012: contributor

OpenSSL's BN_set_negative() can not be used to create a -0: void BN_set_negative(BIGNUM *a, int b) { if (b && !BN_is_zero(a)) a->neg = 1; else a->neg = 0; } In fact, all of the OpenSSL basic arithmetic routines are careful not to create a -0. BN_mpi2bn() however can be used to create a -0.

I've looked at the uses of SetCompact() in bitcoind: They either are immediately followed by a getuint256() which ignores the sign, or a 0 instead of -0 does not make a difference.

If we want GetCompact(-0) to be 0 instead of 0x00800000 I could add a special case for that, but I don't think that's necessary.

xanatos commented at 8:34 PM on October 20, 2012: none

As another interesting sidenote, contrary to what it's written in the documentation, bn_get_word ignores the sign. So -255 returns 255 (the documentation says "BN_get_word() returns the value a, and 0xffffffffL if a cannot be represented as an unsigned long." http://linux.die.net/man/3/bn_get_word , but clearly -255 can't be represented as an unsigned long)

sipa commented at 9:10 AM on November 14, 2012: member

ACK

gavinandresen referenced this in commit d339da70e5 on Dec 12, 2012

gavinandresen merged this on Dec 12, 2012

gavinandresen closed this on Dec 12, 2012

laudney referenced this in commit 464a2695d5 on Mar 19, 2014

KolbyML referenced this in commit a06c0fd993 on Dec 5, 2020

DrahtBot locked this on Sep 8, 2021

Bignum2 #1825

CBigNum::SetCompact() and CBigNum::GetCompact()

`CBigNum::SetCompact()` and `CBigNum::GetCompact()`