Compress Blocks before sending #6973

pull ptschip wants to merge 1 commit into bitcoin:master from ptschip:compress changing 9 files +468 −17
  1. ptschip commented at 1:33 PM on November 9, 2015: contributor

    Compress Blocks before sending achieves an average of 21% block compression.

    When blocks are almost full and at the highest zlib setting, level 9, there is an average 22% block compression, which takes 0.19 seconds. At level 6 (which has been set as the default) there is 21% block compression, which takes only 0.09 seconds. Decompression is very fast in all cases, averaging only 0.008 seconds.

    NOTE: all block data used to gather these numbers was from mainnet and compression was done using a 4 year old laptop with a Celeron processor. (Current i7 processors are 8 times more powerful)

  2. laanwj added the label P2P on Nov 9, 2015
  3. ptschip force-pushed on Nov 9, 2015
  4. paveljanik commented at 1:47 PM on November 9, 2015: contributor

    Do we want this as a default protocol feature? Isn't it better to make it optional and use it only if available/announced?

  5. ptschip commented at 1:54 PM on November 9, 2015: contributor

    You can turn it off by setting -compressionlevel=0, and it is backward compatible with nodes that do not support compression.

    On 09/11/2015 5:48 AM, paveljanik wrote:

    Do we want this as a default protocol feature? Isn't it better to make it optional and use it only if available/announced?

    — Reply to this email directly or view it on GitHub #6973 (comment).

  6. in src/main.cpp:None in 7be5ee57bf outdated
     210 | @@ -208,6 +211,7 @@ namespace {
     211 |      set<int> setDirtyFileInfo;
     212 |  } // anon namespace
     213 |  
     214 | +
    


    paveljanik commented at 2:00 PM on November 9, 2015:

    This hunk can be removed.

  7. in src/main.cpp:None in 7be5ee57bf outdated
    4935 | @@ -4868,6 +4936,7 @@ bool ProcessMessages(CNode* pfrom)
    4936 |              continue;
    4937 |          }
    4938 |  
    4939 | +
    


    paveljanik commented at 2:00 PM on November 9, 2015:

    This hunk can be removed.

  8. in src/main.h:None in 7be5ee57bf outdated
      82 | @@ -83,6 +83,8 @@ static const unsigned int DATABASE_WRITE_INTERVAL = 60 * 60;
      83 |  static const unsigned int DATABASE_FLUSH_INTERVAL = 24 * 60 * 60;
      84 |  /** Maximum length of reject messages. */
      85 |  static const unsigned int MAX_REJECT_MESSAGE_LENGTH = 111;
      86 | +/** 0 = no compression , 9 = maximum compression. */
    


    paveljanik commented at 2:00 PM on November 9, 2015:

    Please remove space before comma.

  9. jonasschnelli commented at 2:09 PM on November 9, 2015: contributor

    Compression/decompression before/after transmitting data through the internet is, in general, a good idea. Very likely the CPU cost is tiny, but it would be nice to see a benchmark of a node serving compressed blocks for IBD to another node.

    Two things which move this PR towards NACK territory for me:

    • adding transmission compression should happen in a different, deeper layer and should be independent from the used p2p command
    • compression should be optional (version bits?)

    Maybe apache's mod_deflate could be a point of inspiration.

  10. ptschip force-pushed on Nov 9, 2015
  11. jmcorgan commented at 2:46 PM on November 9, 2015: contributor

    Agree this capability should be advertised so it only comes into effect between mutually supporting nodes; also, even as-is it should default to compression level 0 to allow the node operator to turn on if desired. Concept ACK, implementation NACK.

  12. ptschip force-pushed on Nov 9, 2015
  13. ptschip commented at 3:42 PM on November 9, 2015: contributor

    On 09/11/2015 6:10 AM, Jonas Schnelli wrote:

    Compression/decompression before/after transmitting data through the internet in general is a good idea. Very likely the CPU cost is tiny, but it would be nice to see a benchmark of a node serving compressed blocks for IBD to another node.

    I'll work on this. Currently my only benchmark is with a 600MB blockchain: it takes 3:47 to sync without compression and 1:25 with compression. However, those numbers are quite optimistic since the blocks in my blockchain, although full, compress down very small.

    Two things which move this PR towards NACK territory for me:

    • adding transmission compression should happen in a different, deeper layer and should be independent from the used p2p command

    Perhaps a good idea... I think the only place for it would be CDataStream... then we could do ss.compress() and ss.decompress(). I'll work on that.

    • compression should be optional (version bits?)

    Compression is optional: setting -compressionlevel=0 bypasses compression entirely and uses the current code for block sending.

    version bits? I'm not sure how that would make anything better. I was getting the remote nodes' protocol version during the handshake and using 70011 or higher to determine if nodes are accepting block compression or not.

    Maybe apache's mod_deflate could be a point of inspiration.

  14. ptschip force-pushed on Nov 9, 2015
  15. sipa commented at 4:20 PM on November 9, 2015: member

    I don't think a 90ms delay before a block can be transmitted is acceptable. Is there a way to cache the compressed result and relay that, if the peer accepts compressed blocks? There may be other compression algorithms which compress much faster (miniLZO?) that are more appropriate.

    Compression may not be useful for all messages, if done on a per-message basis, as most compression algorithms don't offer significant compression for small data amounts.

    We need a way for peers to advertise support for compressed messages or a compressed network link.

    This will need a BIP, but it looks interesting and reasonably simple to do.

  16. ptschip commented at 6:55 PM on November 9, 2015: contributor

    On 09/11/2015 8:21 AM, Pieter Wuille wrote:

    I don't think a 90ms delay before a block can be transmitted is acceptable. Is there a way to cache the compressed result and relay that, if the peer accepts compressed blocks? There may be other compression algorithms which compress much faster (miniLZO?) that are more appropriate.

    The 90ms delay is just for compression of large blocks... small blocks can take 10 or 20 milliseconds, and these numbers come from a very slow laptop. Also, the transmission should be much faster and make up for the 90ms compression, particularly when network latency is high. For example, a block that normally takes 1 second to transmit should save (200-90=110) ms, but I'll have some better numbers for that forthcoming.

    Compression may not be useful for all messages, if done on a per-message basis, as most compression algorithms don't offer significant compression for small data amounts.

    I was surprised to see that even on small blocks of 181 bytes the compression is still 20% using zlib, so we could theoretically apply this to transactions as well. But let's first get it all working for blocks.

    We need a way for peers to advertise support for compressed messages or a compressed network link.

    The current build is backward compatible and uses the protocol version to determine whether nodes can handle compression/decompression, but it does assume in most cases that compression would be ON. To get around that we could send blocks with a different process message string, such as pfrom->PushMessage("cmp_block", block); as opposed to pfrom->PushMessage("block", block);. That way the receiving node will know whether this is a compressed or an uncompressed block and act accordingly. IMO, other than using the protocol version, I'm not sure we need to advertise that we can accept compressed blocks.

    This will need a BIP, but it looks interesting and reasonably simple to do.

    ok, will do

  17. gmaxwell commented at 8:57 PM on November 9, 2015: contributor

    Interesting to see someone trying this.

    Matt's relay network protocol can achieve much better compaction (e.g. sending a 1MB block in a couple of kilobytes); and considering zlib's security non-track-record... this doesn't excite me greatly. Then one must consider the potential adverse effect on the system from punishing people for failing to reuse addresses, which is another impact Matt's protocol does not have.

    Given the approach, these numbers seemed suspect to me, but I can verify them*. When we tested gzipping single blocks years ago we got much worse results... I wonder how much of this is due to spam attack transactions (where pubkeys/txouts are repeated unusually often)?

    I'd like to better understand where the compression is coming from, and I think it's critical that we do so: the only significant gains this should really be getting are from reused pubkeys, which is quite concerning. Protocol-aware compression could likely do better (e.g. blocks can't spend outputs that don't exist) without that problem.

    Better than compressing blocks may be compressing the stream of transactions, as doing that is not exclusive with the relay network protocol style compression; though it would continue the problem of punishing people who do not behave in a fungibility-destroying manner.

    For mining, block relay performance is a consideration; adding 90ms is a considerable fraction of the total processing time. For non-latency-critical relay (e.g. nodes not involved anywhere near mining), a process that used mempool set reconciliation on just the txids, with more round trips to send the unknowns, would be bandwidth-minimizing; but (again) it wouldn't be exclusive with compressing the transactions which are sent further.

    This would perhaps be more interesting if we also considered a flag to ask peers to never inv/send loose transactions... a blocks-only node would not gain from set reconciliation.

    *Here are some recent results:

    $ for i in {382658..382558} ; do ../bitcoin-cli getblock `../bitcoin-cli getblockhash $i` false | xxd -r -p >   $i.block ;echo $i ; done
    $ for i in *.block ; do gzip -c $i > $i.gz ; done
    $ for i in *.block ; do lzop -c $i > $i.lzop ; done
    $ for i in *.block ; do xz -9 -c $i > $i.xz ; done
    $ (for i in *.block ; do echo $i `stat -c %s $i` `stat -c %s $i.gz` `stat -c %s $i.lzop` `stat -c %s $i.xz` ; done) | awk '{print $1 " " $2 " " $3 " " $4 " " $3/$2 " " $4/$2 " " $5/$2 }'
    #block size gzip_size lzop_size gzip_ratio lzop_ratio xz_ratio
    382558.block 122824 90407 96736 0.736069 0.787599 0.721032
    382559.block 988572 799051 858113 0.808288 0.868033 0.779019
    382560.block 266 251 296 0.943609 1.11278 1.08271
    382561.block 333353 272856 293846 0.81852 0.881486 0.801061
    382562.block 541623 435686 468303 0.804408 0.864629 0.779236
    382563.block 227 216 268 0.951542 1.18062 1.11013
    382564.block 341258 283515 302818 0.830794 0.887358 0.805572
    382565.block 893559 717071 768396 0.802489 0.859928 0.772515
    382566.block 749057 611027 655882 0.815728 0.87561 0.782061
    382567.block 942454 712471 769150 0.755974 0.816114 0.724004
    382568.block 313 284 342 0.907348 1.09265 1.02236
    382569.block 811435 604537 651208 0.745022 0.802539 0.714995
    382570.block 400081 311668 335881 0.779012 0.839532 0.757767
    382571.block 649033 510302 550495 0.78625 0.848177 0.757663
    382572.block 367810 281224 305532 0.76459 0.830679 0.739936
    382573.block 996160 758662 816582 0.761586 0.81973 0.731915
    382574.block 155247 116737 125048 0.751944 0.805478 0.737972
    382575.block 509657 403780 430034 0.792258 0.843771 0.766186
    382576.block 52252 40904 43560 0.782822 0.833652 0.774018
    382577.block 655587 506850 543875 0.773124 0.8296 0.744682
    382578.block 705484 512772 555897 0.726837 0.787965 0.702088
    382579.block 238714 187717 200740 0.786368 0.840923 0.762519
    382580.block 447591 314203 342661 0.701987 0.765567 0.67937
    382581.block 208 201 251 0.966346 1.20673 1.15385
    382582.block 301355 233817 249768 0.775886 0.828817 0.75195
    382583.block 206800 160699 171977 0.777074 0.83161 0.757369
    382584.block 999899 775152 830125 0.77523 0.830209 0.745215
    382585.block 998060 750095 801905 0.751553 0.803464 0.723207
    382586.block 984373 763464 816055 0.775584 0.82901 0.746465
    382587.block 232076 189910 202464 0.81831 0.872404 0.796032
    382588.block 949186 750718 803205 0.790907 0.846204 0.765233
    382589.block 999998 804064 860562 0.804066 0.860564 0.77145
    382590.block 931152 742625 793933 0.797534 0.852635 0.773549
    382591.block 544959 303828 337169 0.557525 0.618705 0.533398
    382592.block 929600 736238 787706 0.791994 0.84736 0.764514
    382593.block 247668 205529 219612 0.829857 0.886719 0.80553
    382594.block 883901 686950 734745 0.77718 0.831253 0.751989
    382595.block 749166 593357 635020 0.792023 0.847636 0.762891
    382596.block 847163 615690 661539 0.726767 0.780888 0.702406
    382597.block 999959 751467 804191 0.751498 0.804224 0.72805
    382598.block 933760 769875 820098 0.824489 0.878275 0.794414
    382599.block 908684 699290 749225 0.769563 0.824517 0.744113
    382600.block 266 251 296 0.943609 1.11278 1.08271
    382601.block 431600 340621 364402 0.789205 0.844305 0.764727
    382602.block 935892 685965 737263 0.732953 0.787765 0.70459
    382603.block 292455 234383 249536 0.801433 0.853246 0.781768
    382604.block 559251 443879 473626 0.793703 0.846893 0.77003
    382605.block 213638 163193 174536 0.763876 0.816971 0.747096
    382606.block 74820 61262 64615 0.818792 0.863606 0.807378
    382607.block 420724 333483 356493 0.792641 0.847332 0.767791
    382608.block 283325 228900 244426 0.807906 0.862705 0.786884
    382609.block 435743 356102 379651 0.817229 0.871273 0.790034
    382610.block 980774 750634 803926 0.765349 0.819685 0.742646
    382611.block 765796 557447 598262 0.727931 0.781229 0.713459
    382612.block 235941 193486 211086 0.820061 0.894656 0.800912
    382613.block 929665 748107 798310 0.804706 0.858707 0.775454
    382614.block 590108 477945 507778 0.809928 0.860483 0.777939
    382615.block 157373 122041 130375 0.775489 0.828446 0.762545
    382616.block 236 228 278 0.966102 1.17797 1.11864
    382617.block 158077 127455 136356 0.806284 0.862592 0.789868
    382618.block 918867 742465 792599 0.808022 0.862583 0.778311
    382619.block 382858 307429 328248 0.802984 0.857362 0.764597
    382620.block 749171 615280 657365 0.821281 0.877457 0.790751
    382621.block 994183 794648 848814 0.799298 0.85378 0.769388
    382622.block 999884 772020 830017 0.77211 0.830113 0.746303
    382623.block 999989 796868 861281 0.796877 0.86129 0.761964
    382624.block 718452 537754 598137 0.74849 0.832536 0.705472
    382625.block 749109 565427 630255 0.754799 0.84134 0.719287
    382626.block 989025 783333 843602 0.792025 0.852963 0.726938
    382627.block 749198 609404 649044 0.813408 0.866318 0.787018
    382628.block 657078 507435 542607 0.77226 0.825788 0.748155
    382629.block 931820 747230 801537 0.801904 0.860184 0.77433
    382630.block 315315 251163 268249 0.796546 0.850733 0.778713
    382631.block 999889 769033 827318 0.769118 0.82741 0.738514
    382632.block 932469 735094 786156 0.788331 0.843091 0.76205
    382633.block 998186 735943 800815 0.73728 0.80227 0.712028
    382634.block 430757 339819 362681 0.788888 0.841962 0.766845
    382635.block 884195 714246 764549 0.807792 0.864684 0.781488
    382636.block 930310 676394 717216 0.727063 0.770943 0.675917
    382637.block 749052 553888 596624 0.739452 0.796505 0.720911
    382638.block 519469 457459 480686 0.880628 0.925341 0.86235
    382639.block 676344 521378 554828 0.770877 0.820334 0.745514
    382640.block 103406 85648 91046 0.828269 0.880471 0.811365
    382641.block 72391 60329 63932 0.833377 0.883148 0.815184
    382642.block 232520 181101 193590 0.778862 0.832574 0.761431
    382643.block 873971 679367 724255 0.777334 0.828695 0.750332
    382644.block 331142 264139 281885 0.797661 0.851251 0.778433
    382645.block 116272 95675 101685 0.822855 0.874544 0.811717
    382646.block 542584 434716 464523 0.801196 0.856131 0.776934
    382647.block 198145 163891 175818 0.827127 0.88732 0.808781
    382648.block 998014 773141 826789 0.77468 0.828434 0.750466
    382649.block 175104 142006 157459 0.810981 0.899231 0.794088
    382650.block 148659 122148 129465 0.821666 0.870886 0.80649
    382651.block 52570 43943 46521 0.835895 0.884934 0.824805
    382652.block 949128 752697 805762 0.793041 0.84895 0.766464
    382653.block 814260 624485 671208 0.766936 0.824317 0.740004
    382654.block 368017 298131 318899 0.810101 0.866533 0.787279
    382655.block 798832 626008 668033 0.783654 0.836262 0.7568
    382656.block 166027 134814 143374 0.812 0.863558 0.793943
    382657.block 171837 136226 145551 0.792763 0.847029 0.777807
    382658.block 238265 188116 201485 0.789524 0.845634 0.770655
    
  18. sipa commented at 10:27 PM on November 9, 2015: member

    @gmaxwell My vague recollection is that gzip gave 20% years ago, actually.

  19. gmaxwell commented at 12:31 AM on November 10, 2015: contributor

    @sipa I thought it gave gains when compressing many blocks, and almost nothing on single blocks? (It gained from reuse, but reuse within a block was rare.) I could be misremembering. (Incidentally, catting those 101 blocks above together and xzing them as a single file gets 31% compression.)

  20. ptschip renamed this:
    Zlib Block Compression for block relay
    Compress Blocks before sending
    on Nov 14, 2015
  21. ptschip commented at 5:03 PM on November 14, 2015: contributor

    @gmaxwell After running several tests, the data shows there is about a 20% compression benefit and, surprisingly, a slight improvement in performance, particularly when latency is present. It seems that in the past this was looked into but dropped, and I'm wondering whether gzip was used as a comparison rather than the zlib library directly. Gzip, while based on zlib, can add additional metadata to a file, and I don't think you have the option of selecting compression levels with gzip; maybe that's where the difference lies.

    As far as moving forward, I'd like to keep working on this and see whether there is some benefit to enhancing the code further and trying out different compression libraries... but I'm a little unclear on where this should go from here. I'm wondering whether this really needs a BIP, given that the compression feature can be turned fully on/off through config settings and is backward compatible with previous versions (advertising as a service). It doesn't seem that this is a major change to Bitcoin but rather just another configurable feature. I'd like to get your guidance on this.

  22. jgarzik commented at 5:06 PM on November 14, 2015: contributor

    Protocol changes for behavior that appears on the network, even optional ones, need a BIP.

    You're absolutely right, it is something that is easily made configurable, and there will probably be additional thought and discussion given to default setting(s) related to such a protocol feature, and an implementation's use of it.

  23. ptschip commented at 8:52 PM on November 14, 2015: contributor

    @jgarzik Thanks for the clarification. Just have one more question. Do I write up a formal BIP proposal first and then get a BIP number assigned or get a BIP number and then write the formal proposal?

  24. gmaxwell commented at 9:08 PM on November 14, 2015: contributor

    First figure out what you want to propose in detail while discussing with others, then write the document, post it as a draft (so people can comment), then make a pull req to the bips repository adding it and request a number and one will be assigned there.

    I suggest trying to drop the standard compressor and try using a simple custom one: for example, one that keeps a little 256-entry LFU cache for txout scriptPubKeys and a 256-entry cache for scriptSigs (the first push if there are two, and the first two if there are more than two). I think such a construction could get equal (or perhaps better) compression than zlib and be much faster. (Other small improvements would be similarly coding nSequences, txver, nLockTime, and vin, but I think these are necessarily small improvements that will only save a few bytes per transaction.)

  25. jgarzik commented at 9:23 PM on November 14, 2015: contributor

    "meat before number assignment" - you want demo implementation and BIP-draft-ptschip-compression written before requesting number, ideally.

  26. ptschip commented at 9:26 PM on November 14, 2015: contributor

    sounds good...thanks.

    On 14/11/2015 1:23 PM, Jeff Garzik wrote:

    "meat before number assignment" - you want demo implementation and |BIP-draft-ptschip-compression| written before requesting number, ideally.

    — Reply to this email directly or view it on GitHub #6973 (comment).

  27. ptschip force-pushed on Nov 18, 2015
  28. ptschip force-pushed on Nov 19, 2015
  29. ptschip force-pushed on Nov 23, 2015
  30. ptschip force-pushed on Nov 28, 2015
  31. ptschip force-pushed on Nov 30, 2015
  32. ptschip force-pushed on Nov 30, 2015
  33. ptschip force-pushed on Nov 30, 2015
  34. ptschip force-pushed on Nov 30, 2015
  35. ptschip force-pushed on Nov 30, 2015
  36. ptschip force-pushed on Nov 30, 2015
  37. ptschip force-pushed on Nov 30, 2015
  38. ptschip force-pushed on Nov 30, 2015
  39. Datastream compression for Blocks and Tx's
    Compress blocks and transactions using the
    LZO1x Library before sending. Blocks
    and transactions are concatenated where
    possible.
    8b4a62e308
  40. ptschip force-pushed on Nov 30, 2015
  41. paveljanik commented at 8:17 PM on January 20, 2016: contributor

    Needs rebase.

    Any progress on this?

  42. bgorlick commented at 9:23 PM on January 26, 2016: none

    I would like to suggest Brotli as a compression technique. I've done some brief analysis; it could possibly provide a 20% improvement in compression and a speed bump.

  43. rebroad commented at 4:43 PM on February 1, 2016: contributor

    Can compression get significantly better if many blocks are compressed together?

  44. laanwj commented at 3:17 PM on February 2, 2016: member

    @rebroad This was always the case. E.g. compressing a blkXXXX.dat file gets a better compression ratio than compressing a separate block. This was mainly due to repeating outputs, for example due to gambling games. I don't have statistics on by how much, though.

  45. laanwj added the label Brainstorming on Feb 4, 2016
  46. laanwj commented at 11:28 AM on March 24, 2016: member

    Closing this for now: progress seems to be stuck, both at the BIP discussion and implementation side.

  47. laanwj closed this on Mar 24, 2016

  48. ptschip deleted the branch on Apr 5, 2016
  49. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 21:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me