Remove unused statements in serialization

paveljanik commented at 8:09 pm on September 2, 2016: contributor

As the line

0nVersion = this->nVersion;

seems to have no meaning in READ and also in WRITE serialization op, let’s remove it and see what our tests/travis will tell us. See #8468 for previous discussion.

dcousens commented at 1:11 am on September 3, 2016: contributor

I’d suspect the binaries to be the same if any optimization flags are enabled.

paveljanik commented at 6:26 am on September 3, 2016: contributor

Tests/travis is OK. Can anyone test binaries, please?

jonasschnelli added the label Refactoring on Sep 5, 2016

[WIP] Remove unused statement in serialization 64d9507ea5

paveljanik force-pushed on Sep 9, 2016

paveljanik commented at 12:00 pm on September 9, 2016: contributor

Rebased to match net.h-> addrdb.h commit from upstream.

sipa commented at 12:46 pm on September 9, 2016: member

The only place where the implicitly-passed-around nVersion variable is actually used is in CAddress::SerializationOp, and there it is just comparing with a constant directly.

paveljanik commented at 12:58 pm on September 9, 2016: contributor

@sipa Exactly. And there is no such line like in the proposed change.

sipa commented at 1:06 pm on September 9, 2016: member

utACK 64d9507ea5724634783cdaa290943292132086a9

After this I think we can actually go further and completely remove the nType and nVersion arguments from all SerializationOp methods, and replace them with calls to s.GetType() and s.GetVersion().

MarcoFalke renamed this:
~~WIP/DO NOT MERGE: Remove unused statements in serialization~~
Remove unused statements in serialization
on Sep 9, 2016

MarcoFalke commented at 8:18 pm on September 9, 2016: member

Tests/travis is OK. Can anyone test binaries, please? @paveljanik No need to test them. Apparently I get the same binaries, which means this is indeed dead code. (Instead of deleting the lines, you can replace them with empty lines to get rid of the offsets in the objdump). Can you check this?

MarcoFalke commented at 8:27 pm on September 9, 2016: member

bitcoind-matches-ACK 64d9507ea5724634783cdaa290943292132086a9 (qt binaries do not match, though)

luke-jr commented at 9:54 am on September 10, 2016: member

Eh, isn’t this intended to allow the serialised version to override the parameter?

sipa commented at 10:01 am on September 10, 2016: member

@luke-jr Yes, that was the intention. But I don’t think it’s very usable. First of all, it requires that everything can be encoded inside a single version number, something that is increasingly hard in a decentralized environment. It also has never ever actually been used. It also is sort of a layer violation, as you need a single namespace for versioning across all of wallet code, network code, consensus-critical hash computation, database storage, …

I think there are more flexible mechanisms possible using wrapper objects that introduce new serialization formats.

luke-jr commented at 10:32 am on September 10, 2016: member

These are all implementation-specific objects, so decentralisation is less of an issue. But I guess it’s fine so long as it’s never been used before (but I’m not certain of that).

laanwj commented at 8:43 am on September 14, 2016: member

Concept ACK.

Eh, isn’t this intended to allow the serialised version to override the parameter?

It can always be added back for a certain serialization op in the oft case that that needs to be used. No need to have dead code around ‘just in case’.

paveljanik commented at 5:21 pm on September 19, 2016: contributor

I think this is ready for more ACKs. I volunteer for the next steps pointed out by @sipa above.

laanwj commented at 2:21 pm on September 27, 2016: member

bitcoind-matches-ACK 64d9507 (qt binaries do not match, though)

Cannot reproduce this, I detect differences in the following functions between 64d9507 and 6898213:

0void CBanEntry::SerializationOp<CDataStream, CSerActionSerialize>(CDataStream&, CSerActionSerialize, int, int)
1void CBanEntry::SerializationOp<CDataStream, CSerActionUnserialize>(CDataStream&, CSerActionUnserialize, int, int)
2void CHDChain::SerializationOp<CDataStream, CSerActionSerialize>(CDataStream&, CSerActionSerialize, int, int)
3void CHDChain::SerializationOp<CDataStream, CSerActionUnserialize>(CDataStream&, CSerActionUnserialize, int, int)
4void CKeyMetadata::SerializationOp<CDataStream, CSerActionSerialize>(CDataStream&, CSerActionSerialize, int, int)
5void CKeyMetadata::SerializationOp<CDataStream, CSerActionUnserialize>(CDataStream&, CSerActionUnserialize, int, int)
6void CMerkleTx::SerializationOp<CDataStream, CSerActionSerialize>(CDataStream&, CSerActionSerialize, int, int)
7void CMerkleTx::SerializationOp<CDataStream, CSerActionUnserialize>(CDataStream&, CSerActionUnserialize, int, int)

MarcoFalke commented at 3:57 pm on September 27, 2016: member

35: mov [-$0x34a,%r8d-]{+$0x34b,%r8d+}

Make sure to replace the deleted lines by empty lines to get rid of the offset?

paveljanik commented at 4:11 pm on September 27, 2016: contributor

BTW - this is not -Wshadow PR…

I think that we should compare binaries in some higher -O…

sipa commented at 0:44 am on September 28, 2016: member

I think it’s unlikely that this can result in identical binaries. It requires some cross-function reasoning across multiple modules to see this has no effect. We are changing the actual values of arguments passed down - they’re just not used.

laanwj commented at 6:54 am on September 28, 2016: member

Make sure to replace the deleted lines by empty lines to get rid of the offset?

So these are line numbers? Ok, that’d make sense, will try again and w/ disabled LINE.

I think it’s unlikely that this can result in identical binaries.

Yes the changes in the constructors are more involved.

I think that we should compare binaries in some higher -O…

Increasing -O generally makes it worse, not better. A lot of optimization settings make the output extremely sensitive to small change in the input, as well as cause a change in one function to move another (little related) function, making per-function comparison useless. This is why I went out of my way to find specific flags that work for this in my build/compare script: https://github.com/laanwj/bitcoin-maintainer-tools/blob/master/build-for-compare.py#L33 . It’d be possible to add specific flags if they’re known to be safe.

paveljanik commented at 7:06 am on September 28, 2016: contributor

It is very unlikely to produce the same binary, but to be safe, we should fully understand the difference.

Can template inline play some role here?

As a side note: in the “gcc set” -Wshadow I automatically changed nVersion to _nVersion in class CBlock : public CBlockHeader. It was not used at all in the function, so I thought without testing it is OK. But p2p-segwit.py started reproducibly failing. I used that binary to test -rescan on testnet and it always “failed” - see #8808#pullrequestreview-1610754 for details. Thus I had to revert it. Simple rename! Can you shed some light on it? Maybe it is relevant with this one.

laanwj commented at 7:32 am on September 28, 2016: member

So these are line numbers? Ok, that’d make sense, will try again and w/ disabled LINE.

They had to do with __LINE__ usage in LOCK/TRY_LOCK, which is interesting because without lock debugging the line numbers aren’t used at all. So our sync.h macros/wrappers do incur a slight overhead even in that case.

In any case I’ve updated the above list, they no longer show up after removing line number sensitivity there.

laanwj commented at 7:47 am on September 28, 2016: member

We are changing the actual values of arguments passed down - they’re just not used.

Right: READWRITE expands to

0#define READWRITE(obj)      (::SerReadWrite(s, (obj), nType, nVersion, ser_action))

Thus silently passes nVersion to ::SerReadWrite. This function may or may not use the argument, but it is used.

This does change my opinion on this change from “harmless” to “hard to verify for correctness”.

As a side note: in the “gcc set” -Wshadow I automatically changed nVersion to _nVersion in class CBlock : public CBlockHeader. It was not used at all in the function, so I thought without testing it is OK.

Indeed, if you change the argument name, the name binding will change and functions called will suddenly receive this->nVersion instead of the function argument. this->nVersion obviously gets updated by the READWRITE(this->nVersion). This will change the executable, but should not change the behavior, which is essentially the same as nVersion = this->nVersion. So by renaming you’d keep the behavior that there is right now, instead of changing it as in this PR.

At least that’s what I would infer. I don’t see why it would lead to a crashing segwit test though… This makes it kind of scary.

laanwj commented at 7:54 am on September 28, 2016: member

in class CBlock : public CBlockHeader.

OHH I get it, maybe. Unlike the serialization functions changed in this PR, CBlockHeader::SerializationOp does not have the nVersion = this->nVersion line. This means that the behavior does change if you rename the argument nVersion to _nVersion. After all,

0READWRITE(this->nVersion);
1READWRITE(hashPrevBlock);

expands to

0(::SerReadWrite(s, (this->nVersion), nType, nVersion, ser_action))
1(::SerReadWrite(s, (hashPrevBlock), nType, nVersion, ser_action))

So the first line will changes this->nVersion, the second line passes nVersion. Which, after your argument renaming, refers to this->nVersion too. SO you effectively added nVersion = this->nVersion, which is a behavior change.

Do you really need that change for shadowing? I’d prefer not to do that, it looks to me that we’re taking a huge risk just to avoid some compiler warnings. CBlock is as deep in consensus code as you can get.

dcousens commented at 9:01 am on September 28, 2016: contributor

@laanwj IMHO, this reveals that READWRITE is hiding intricacies which should instead be in plain sight.

laanwj commented at 9:54 am on September 28, 2016: member

@laanwj IMHO, this reveals that READWRITE is hiding intricacies which should instead be in plain sight.

IMO it’d have been better to just write those macros out., they make the code shorter apparently but much more obfuscated. (but that’s an issue for another time)

sipa commented at 10:01 am on September 28, 2016: member

My suggestion was that after this PR we would proceed to get rid of the nVersion and nType implicit parameters, and just replace them with getters on the stream implementation.

That would make things much easier to reason about.

MarcoFalke commented at 6:42 pm on September 28, 2016: member

Guess I’ve been doing it wrong then.

 0git checkout bitcoin/master && \
 1git reset --hard HEAD && \
 2curl https://raw.githubusercontent.com/laanwj/bitcoin-maintainer-tools/6e4425587736144b067f67ad792d9ee904e74fd7/patches/stripbuildinfo.patch | patch -p 1 && \
 3make -j 2 && \
 4objdump -d -r -C --no-show-raw-insn src/bitcoind > /tmp/d_old && \
 5curl https://github.com/bitcoin/bitcoin/commit/64d9507ea5724634783cdaa290943292132086a9.diff | patch -p 1 && \
 6make -j 2 && \
 7objdump -d -r -C --no-show-raw-insn src/bitcoind > /tmp/d_new && \
 8diff /tmp/d_old /tmp/d_new | wc
 9
10      0       0       0

laanwj commented at 6:24 am on September 29, 2016: member

@MarcoFalke the differences there would be:

I check the commits themselves, you merge the patch on top of master
I use a specific set of optimization flags, you use default optimization flags (-O2)

I don’t think the first would make executables suddenly match. I’ll retry with “-O2” and see if I can get a match.

laanwj commented at 6:46 am on September 29, 2016: member

Good news: using -O2 gives a complete match on bitcoind:

0317c917ed86f7c32113598876ac3f48f84ea73281d01f69df1de78f429a019ed  /tmp/compare/bitcoind.64d9507.stripped
1317c917ed86f7c32113598876ac3f48f84ea73281d01f69df1de78f429a019ed  /tmp/compare/bitcoind.6898213.stripped

as well as on bitcoin-qt

08675a67f3c2e5d4fc76f0083caeaa50585ba612e9b881ca22deea35de7f62c8b  /tmp/compare/bitcoin-qt.64d9507.stripped
18675a67f3c2e5d4fc76f0083caeaa50585ba612e9b881ca22deea35de7f62c8b  /tmp/compare/bitcoin-qt.6898213.stripped

ACK 64d9507

dcousens approved

paveljanik commented at 7:46 am on September 29, 2016: contributor

@laanwj Thanks!

laanwj commented at 12:59 pm on September 29, 2016: member

This still has a [WIP] tag on the commit. However I’m going to merge nevertheless, as rebasing to change the commit message would have us all re-check executables again…

laanwj merged this on Sep 29, 2016

laanwj closed this on Sep 29, 2016

laanwj referenced this in commit 9b94cca41f on Sep 29, 2016

codablock referenced this in commit 824f541b06 on Sep 19, 2017

codablock referenced this in commit 575f625f4d on Jan 12, 2018

zkbot referenced this in commit 3b68ab255f on Apr 17, 2018

zkbot referenced this in commit d408e23ab7 on Apr 18, 2018

zkbot referenced this in commit 0753a0e8a9 on Apr 19, 2018

andvgal referenced this in commit 626a001e29 on Jan 6, 2019

furszy referenced this in commit 8cd9c592a9 on May 29, 2020

MarcoFalke locked this on Sep 8, 2021

Remove unused statements in serialization #8658