Rewrite addrdb with less duplication using CHashVerifier #10248
pull sipa wants to merge 1 commits into bitcoin:master from sipa:chashverifier changing 2 files +72 −144-
sipa commented at 12:00 pm on April 21, 2017: memberThis rewrites addrdb.cpp to use CHashVerifier (from #10195), and combines the code for addrdb and bandb to the extent possible.
-
laanwj added the label Utils and libraries on Apr 21, 2017
-
in src/validation.cpp:1252 in d76b7ac079 outdated
1454@@ -1455,19 +1455,18 @@ bool UndoReadFromDisk(CBlockUndo& blockundo, const CDiskBlockPos& pos, const uin 1455 1456 // Read block 1457 uint256 hashChecksum; 1458+ CHashVerifier<CAutoFile> verifier(&filein); // We need a CHashVerifier as reserializing may lose data
TheBlueMatt commented at 9:04 pm on April 21, 2017:I believe this comment is wrong if this is pulled out of #10195.
sipa commented at 3:21 am on May 3, 2017:Indeed; fixed.TheBlueMatt commented at 9:34 pm on April 21, 2017: memberutACK d76b7ac079e140948d589f5c53a4af15ddcad6f5sipa force-pushed on May 3, 2017in src/hash.h:184 in d016b2c79c outdated
179+ void ignore(size_t nSize) 180+ { 181+ char data[1024]; 182+ while (nSize > 0) { 183+ size_t now = std::min<size_t>(nSize, 1024); 184+ source->read(data, now);
ryanofsky commented at 9:36 pm on May 3, 2017:In commit “Introduce CHashVerifier to hash read data”
Could replace these two lines with just
read(data, now);
sipa commented at 9:15 pm on May 17, 2017:Done.in src/addrdb.cpp:140 in 0e8d567209 outdated
216- // de-serialize address data into one CAddrMan object 217- ssPeers >> addr; 218- } 219- catch (const std::exception& e) { 220- // de-serialization has failed, ensure addrman is left in a clean state 221+ bool ret = DeserializeDB(ssPeers, addr, false);
ryanofsky commented at 9:51 pm on May 3, 2017:In commit “Deduplicate addrdb.cpp and use CHashWriter/Verifier”
Not your change, but it isn’t clear to me why this why this function is a part of the CAddrDB class and is non-static. It doesn’t seem to access any state or be directly related to the class.
sipa commented at 9:13 pm on May 17, 2017:I’ve made it static.in src/addrdb.cpp:168 in 0e8d567209 outdated
170- CAutoFile filein(file, SER_DISK, CLIENT_VERSION); 171- if (filein.IsNull()) 172- return error("%s: Failed to open file %s", __func__, pathAddr.string()); 173- 174- // use file size to size memory buffer 175- uint64_t fileSize = fs::file_size(pathAddr);
ryanofsky commented at 10:02 pm on May 3, 2017:In commit “Deduplicate addrdb.cpp and use CHashWriter/Verifier”
It seems this logic to read the file into a memory buffer before deserializing it has gone away. Any idea why it was written that way in the first place?
sipa commented at 9:05 pm on May 17, 2017:I believe it’s to protect against the case where the data is unusually expensive to deserialize. I’m not particularly worried about (non-adverserial) corruption that causes expensive deserialization without causing an error somewhere along the way.
laanwj commented at 10:24 am on June 6, 2017:IIRC the point was to be able to check the checksum before starting any deserialization work
sipa commented at 7:53 pm on June 7, 2017:Indeed. I don’t think that’s particularly interesting, but if that remains a requirement, I’ll drop this PR.ryanofsky commented at 10:03 pm on May 3, 2017: memberutACK 0e8d567209ba198f64c7657acb0e05018bf0c5bein src/hash.h:174 in d016b2c79c outdated
169+ 170+public: 171+ CHashVerifier(Source* source_) : CHashWriter(source_->GetType(), source_->GetVersion()), source(source_) {} 172+ 173+ void read(char* pch, size_t nSize) 174+ {
sipa commented at 8:31 pm on May 17, 2017:It seems that the invoked read and write method already need to deal with nSize==0 anyway?in src/hash.h:171 in d016b2c79c outdated
166+{ 167+private: 168+ Source* source; 169+ 170+public: 171+ CHashVerifier(Source* source_) : CHashWriter(source_->GetType(), source_->GetVersion()), source(source_) {}
theuni commented at 8:12 pm on May 8, 2017:Pass a reference instead, so that it can’t be null?
sipa commented at 8:30 pm on May 17, 2017:I don’t like using references for objects that persist. This risks calling CHashVerifier(function_that_returns_a_Source()).theuni commented at 8:21 pm on May 8, 2017: memberutACK, other than the nitssipa force-pushed on May 17, 2017sipa commented at 9:13 pm on May 17, 2017: memberI think I’ve addressed or responded to all comments.Deduplicate addrdb.cpp and use CHashWriter/Verifier cf68a488a4sipa force-pushed on Jun 1, 2017sipa renamed this:
Introduce CHashVerifier to hash while deserializing
Rewrite addrdb with less duplication using CHashVerifier
on Jun 1, 2017in src/addrdb.cpp:25 in cf68a488a4
22+bool SerializeDB(Stream& stream, const Data& data) 23 { 24- pathBanlist = GetDataDir() / "banlist.dat"; 25+ // Write and commit header, data 26+ try { 27+ CHashWriter hasher(SER_DISK, CLIENT_VERSION);
TheBlueMatt commented at 5:37 pm on June 7, 2017:Seems it may be nicer to add a more analygous hasher to CHashVerifyer here - why not hash while writing the data instead of serializing twice?TheBlueMatt commented at 5:50 pm on June 7, 2017: memberFeel free to leave comment for future PR.
utACK cf68a488a4cc78d711ca6e5e3236c6d89d689079
laanwj commented at 5:54 pm on June 22, 2017: memberutACK cf68a48laanwj merged this on Jun 22, 2017laanwj closed this on Jun 22, 2017
laanwj referenced this in commit 01c4b143a8 on Jun 22, 2017PastaPastaPasta referenced this in commit 7261b741f6 on Jul 6, 2019PastaPastaPasta referenced this in commit 97515afe31 on Jul 8, 2019PastaPastaPasta referenced this in commit 620e15f873 on Jul 9, 2019PastaPastaPasta referenced this in commit 305fd06796 on Jul 11, 2019barrystyle referenced this in commit 296e58d0d9 on Jan 22, 2020random-zebra referenced this in commit 116bb50765 on Apr 20, 2021DrahtBot locked this on Sep 8, 2021
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 21:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me