Fixes subscript 0 (&var[0]) where should use (var.data()) instead. #9804

pull JeremyRubin wants to merge 12 commits into bitcoin:master from JeremyRubin:fix-subscript-0 changing 13 files +51 −57
  1. JeremyRubin commented at 7:45 PM on February 19, 2017: contributor

    This PR fixes up a couple places where the codebase has semi-risky uses of 0 subscript. I don't think any of these are actively bugs, but some of them are unsafe (or unexpected-errors) for unsanitized inputs (such as in addrdb or key.cpp).

    This also allows us to eliminate some bounds checks, so I guess this is a net-negative patch. I don't think any of those bounds checks are performance sensitive, but worth double checking in review.

    p.s. I also fix some confusing code in murmurhash (with non obvious offset-math) as it took me a little bit to see that it was correct, could separate that out if desired.

  2. fanquake added the label Refactoring on Feb 20, 2017
  3. in src/utilstrencodings.cpp:None in 2937635978 outdated
     228 | @@ -229,7 +229,7 @@ vector<unsigned char> DecodeBase64(const char* p, bool* pfInvalid)
     229 |  string DecodeBase64(const string& str)
     230 |  {
     231 |      vector<unsigned char> vchRet = DecodeBase64(str.c_str());
     232 | -    return (vchRet.size() == 0) ? string() : string((const char*)vchRet.data(), vchRet.size());
     233 | +    return string((const char*)vchRet.data(), vchRet.size());
    


    dcousens commented at 3:39 AM on February 20, 2017:

    std::string maybe?

  4. dcousens approved
  5. JeremyRubin force-pushed on Feb 20, 2017
  6. JeremyRubin commented at 5:23 AM on February 20, 2017: contributor

    Sorry for the bit of line noise, there was some UB when I removed some of the branches (memset/memcpy/read/write 0 bytes on nullptr/invalid ptr) that was causing one of the builds (with more aggressive runtime checks) to fail. I added them back in and all seems well now.

  7. laanwj commented at 3:49 PM on February 20, 2017: member

    Whoa, good catch.

    I'm shocked that &vch[0] use was still so rampant in the code. We'd introduced begin_ptr with the intention to replace those, and phased even that out again with C++11 .data().

    Anyhow, concept ACK.

  8. in src/serialize.h:None in efa6deae0f outdated
     570 | @@ -571,7 +571,7 @@ void Serialize_impl(Stream& os, const prevector<N, T>& v, const unsigned char&)
     571 |  {
     572 |      WriteCompactSize(os, v.size());
     573 |      if (!v.empty())
     574 | -        os.write((char*)&v[0], v.size() * sizeof(T));
     575 | +        os.write((char*) v.data(), v.size() * sizeof(T));
    


    laanwj commented at 3:53 PM on February 20, 2017:

    I wonder why we're casting to (char*) here iso (const char*), even though these are write operations and the type passed in is const.

    There's a couple more of these. Though seems out of scope to fix in this pull.


    JeremyRubin commented at 4:53 PM on February 20, 2017:

    Yeah that seems obviously incorrect, but I think I understand why it was done (or at least have a theory). Because the methods are all generic w.r.t. the stream type, they probably are made to work with some arbitrary stream that can't accept a const char *. Hopefully not the case anymore...

    Anyways, probably best to fix those up separately as you suggest.


    laanwj commented at 1:25 PM on February 21, 2017:

    Sure, but it would be wrong for a stream write to take a char* i.s.o. const char*, so fixing those just means the const-correctness spreads.

  9. JeremyRubin commented at 5:14 PM on February 20, 2017: contributor

    There are still a few more places that were trickier/not possible to correct (where the type passed in is templated, such that it would be a pointer or a vector).

    I also didn't fix anything in subdirectories, because I felt this PR was getting a bit large already, and there are a few you can still grep for pretty easily there:

    ❯ grep '\&v[A-z0-9_]*\[0\]' */*.cpp | nl
        1	qt/signverifymessagedialog.cpp:    ui->signatureOut_SM->setText(QString::fromStdString(EncodeBase64(&vchSig[0], vchSig.size())));
        2	rpc/misc.cpp:    return EncodeBase64(&vchSig[0], vchSig.size());
        3	script/ismine.cpp:        CRIPEMD160().Write(&vSolutions[0][0], vSolutions[0].size()).Finalize(hash.begin());
        4	script/sigcache.cpp:        CSHA256().Write(nonce.begin(), 32).Write(hash.begin(), 32).Write(&pubkey[0], pubkey.size()).Write(&vchSig[0], vchSig.size()).Finalize(entry.begin());
        5	script/sign.cpp:        CRIPEMD160().Write(&vSolutions[0][0], vSolutions[0].size()).Finalize(h160.begin());
        6	script/standard.cpp:            CHash160().Write(&vSolutions[0][0], vSolutions[0].size()).Finalize(h160);
        7	test/getarg_tests.cpp:    ParseParameters(vecChar.size(), &vecChar[0]);
        8	test/skiplist_tests.cpp:        BOOST_CHECK(vIndex[from].GetAncestor(0) == &vIndex[0]);
        9	wallet/crypter.cpp:    size_t nLen = enc.Encrypt(&vchPlaintext[0], vchPlaintext.size(), &vchCiphertext[0]);
       10	wallet/crypter.cpp:    nLen = dec.Decrypt(&vchCiphertext[0], vchCiphertext.size(), &vchPlaintext[0]);
       11	wallet/rpcwallet.cpp:    return EncodeBase64(&vchSig[0], vchSig.size());
       12	wallet/wallet.cpp:    GetStrongRandBytes(&vMasterKey[0], WALLET_CRYPTO_KEY_SIZE);
    

    (there are a lot more worth checking if you don't use the hungarian notation v prefix as a filter)

  10. in src/serialize.h:None in db3d75b417 outdated
     548 | @@ -549,7 +549,7 @@ void Serialize(Stream& os, const std::basic_string<C>& str)
     549 |  {
     550 |      WriteCompactSize(os, str.size());
     551 |      if (!str.empty())
     552 | -        os.write((char*)&str[0], str.size() * sizeof(str[0]));
     553 | +        os.write((char*) str.data(), str.size() * sizeof(typename std::basic_string<C>::value_type));
    


    TheBlueMatt commented at 11:16 PM on February 20, 2017:

    AFAICT, this is equivalent to sizeof(C) (std::basic_string<C>::value_type is std::basic_string<C, std::char_traits<C>>::value_type which is std::char_traits<C>::char_type, which is C).


    laanwj commented at 1:28 PM on February 21, 2017:

    @thebluematt is right


    JeremyRubin commented at 12:17 AM on February 22, 2017:

    Yep, I was aware :) I felt that the version I chose was more similar to the existing code (the size of whatever str[0] returns), but if everyone is comfortable with sizeof(C) I can change it.


    JeremyRubin commented at 11:09 AM on February 22, 2017:

    👍 fixed

  11. in src/utilstrencodings.cpp:None in efa6deae0f outdated
     228 | @@ -229,7 +229,7 @@ vector<unsigned char> DecodeBase64(const char* p, bool* pfInvalid)
     229 |  string DecodeBase64(const string& str)
     230 |  {
     231 |      vector<unsigned char> vchRet = DecodeBase64(str.c_str());
     232 | -    return (vchRet.size() == 0) ? string() : string((const char*)&vchRet[0], vchRet.size());
     233 | +    return std::string((const char*)vchRet.data(), vchRet.size());
    


    TheBlueMatt commented at 12:20 AM on February 21, 2017:

    Hmm, based on cppreference I cant tell if this is well-defined with vchRet.size() == 0 or not.


    dcousens commented at 12:48 AM on February 21, 2017:

    From my understanding (it doesn't seem ambiguous, but, it certainly isn't definitive), a len of 0 is well-defined, and simply an empty string. Aka, equivalent to std::string(). I don't have a definitive statement from the reference.


    laanwj commented at 1:26 PM on February 21, 2017:

    Yes, a string with size 0 is well-defined. Calling .data() on a string of size 0 is also defined. This should be ok.


    sipa commented at 6:00 PM on February 21, 2017:

    The question is whether calling std::string(garbage_pointer, 0) is well-defined. I can't tell from what cppreference.com says (The behavior is undefined if s does not point at an array of at least count elements of CharT, including the case when s is a null pointer.).

    As an alternative, what about std::string(vchRet.begin(), vchRet.end())?


    laanwj commented at 6:22 PM on February 21, 2017:

    I read that as: If the count is 0, it always points at least 0 elements, thus "point at an array of at least count elements of CharT" is always satisfied.

    Alternatively: if count is 0 it would be wrong to dereference the pointer's address at all.


    sipa commented at 6:35 PM on February 21, 2017:

    I'm not convinced that "a pointer to an array of 0 elements of type T" can be any pointer. I think it's extremely unlikely to matter, and that every C++ std library effectively is implemented in a way that makes the constructor, when invoked with (pointer, 0) effectively ignore pointer, however.


    TheBlueMatt commented at 7:23 PM on February 21, 2017:

    I mean when cppreference is ambiguous thats probably a good sign that trying to parse a wiki is going to get your an incorrect result...best to go to the standard if in doubt in that case. :/


    JeremyRubin commented at 12:59 AM on February 22, 2017:

    I don't think it is can be a garbage pointer. vector.data() is guaranteed to return a valid range (even when empty), which I think would require that it have proper alignment.


    laanwj commented at 8:25 AM on February 22, 2017:

    Just to be complete: this is not the same as std::string(vchRet.begin(), vchRet.end()) in the general case. When creating an iterator range, if the type of vch is different from the type of string, a per-element cast will happen. So string.size() will be vch.size(). Usual issues with casting between signed and unsigned apply.

    std::string((const char*)vchRet.data(), vchRet.size()) however just return the memory area with all the elements of the vector as a string. So string.size() will be vch.size()*sizeof(vch::value_type). It is likely also more efficient.

  12. gmaxwell commented at 12:57 AM on February 21, 2017: contributor

    Concept ACK.

  13. practicalswift commented at 3:53 PM on February 28, 2017: contributor

    Concept ACK! 👍

  14. JeremyRubin force-pushed on Mar 27, 2017
  15. JeremyRubin force-pushed on Mar 27, 2017
  16. JeremyRubin commented at 5:22 PM on March 27, 2017: contributor

    Rebased.

  17. JeremyRubin force-pushed on Mar 27, 2017
  18. in src/hash.cpp:43 in 66377fa0bf outdated
      58 | -        //----------
      59 | -        // tail
      60 | -        const uint8_t* tail = (const uint8_t*)(&vDataToHash[0] + nblocks * 4);
      61 | +    //----------
      62 | +    // tail
      63 | +    const uint8_t* tail = (const uint8_t*)(vDataToHash.data() + nblocks * 4);
    


    ryanofsky commented at 10:26 PM on March 27, 2017:

    In commit: "Cleanup (safe, it was checked) 0-subscript in MurmurHash3"

    Could you drop this (const uint8_t*) cast here? It shouldn't be needed since we already call vDataToHash.data() with no cast above on line 27.

  19. in src/utilstrencodings.cpp:230 in 33ce56d867 outdated
     226 | @@ -227,8 +227,8 @@ std::vector<unsigned char> DecodeBase64(const char* p, bool* pfInvalid)
     227 |  
     228 |  std::string DecodeBase64(const std::string& str)
     229 |  {
     230 | -    std::vector<unsigned char> vchRet = DecodeBase64(str.c_str());
     231 | -    return (vchRet.size() == 0) ? std::string() : std::string((const char*)&vchRet[0], vchRet.size());
     232 | +    vector<unsigned char> vchRet = DecodeBase64(str.c_str());
    


    ryanofsky commented at 10:38 PM on March 27, 2017:

    In commit "Fix 0-subscript in utilstrencodings.cpp"

    Should put back std:: prefixes.

  20. in src/utilstrencodings.cpp:417 in 33ce56d867 outdated
     413 | @@ -414,8 +414,13 @@ std::vector<unsigned char> DecodeBase32(const char* p, bool* pfInvalid)
     414 |  
     415 |  std::string DecodeBase32(const std::string& str)
     416 |  {
     417 | +<<<<<<< 1f0401c11315b5090cfe2b37f6338205e2984372
    


    ryanofsky commented at 10:38 PM on March 27, 2017:

    In commit "Fix 0-subscript in utilstrencodings.cpp"

    Bad merge here.

  21. in src/streams.h:388 in f3cc981bad outdated
     384 | @@ -385,7 +385,7 @@ class CDataStream
     385 |      {
     386 |          // Special case: stream << stream concatenates like stream += stream
     387 |          if (!vch.empty())
     388 | -            s.write((char*)&vch[0], vch.size() * sizeof(vch[0]));
     389 | +            s.write((char*) vch.data(), vch.size() * sizeof(typename decltype(vch)::value_type));
    


    ryanofsky commented at 10:50 PM on March 27, 2017:

    In commit "Fix subscript-0 in streams.h"

    It would seem more direct (and shorter) to write sizeof(*vch.data()).

  22. in src/utilstrencodings.cpp:230 in 4afa57fcb6 outdated
     226 | @@ -227,8 +227,8 @@ std::vector<unsigned char> DecodeBase64(const char* p, bool* pfInvalid)
     227 |  
     228 |  std::string DecodeBase64(const std::string& str)
     229 |  {
     230 | -    vector<unsigned char> vchRet = DecodeBase64(str.c_str());
     231 | -    return (vchRet.size() == 0) ? string() : string((const char*)vchRet.data(), vchRet.size());
     232 | +    std::vector<unsigned char> vchRet = DecodeBase64(str.c_str());
    


    ryanofsky commented at 10:53 PM on March 27, 2017:

    In commit "Remove uneccessary branches in utilstrencodings string constructors."

    Some of the changes in this commit should be squashed into previous commit "In commit "Fix 0-subscript in utilstrencodings.cpp."

    Also missing an n in unnecessary.

  23. JeremyRubin force-pushed on Mar 28, 2017
  24. JeremyRubin force-pushed on Mar 28, 2017
  25. JeremyRubin commented at 3:45 PM on March 28, 2017: contributor

    @ryanofsky feedback addressed.

  26. ryanofsky commented at 6:11 PM on June 14, 2017: member

    utACK 275caff431cea65aa77d6a3035597b4b2b9b0052. Only changes since previous review (69fd8ab0a6543dace4b080166b21b2aafdd2c0a9) were addressing feedback and combining / rearranging some of the commits.

  27. laanwj commented at 10:52 AM on June 26, 2017: member

    Needs rebase again (sorry).

  28. Fix subscript[0] bug in net.cpp if GetGroup returns a 0-sized vector e0451e3e2a
  29. Fix 2 subscript[0] bugs in pubkey.cpp, and eliminate one extra size check 500710bd29
  30. Fix subscript[0] in compressor.cpp 96f2119e6c
  31. Cleanup (safe, it was checked) subscript[0] in MurmurHash3 (and cleanup MurmurHash3 to be more clear). 6896dbf169
  32. Fix subscript[0] in base58.cpp 361d95265a
  33. Fix subscript[0] in netaddress.cpp b6856ebedc
  34. Fix subscript[0] in torcontrol ac658e55ff
  35. Fix subscript[0] in validation.cpp 4cac0d1e04
  36. Fix subscript[0] in streams.h bc2e7fd984
  37. Fix subscript[0] in utilstrencodings.cpp e19db7b5ad
  38. Remove unnecessary branches in utilstrencodings string constructors. 4b1c0f2e2e
  39. Fix subscript[0] potential bugs in key.cpp 30ac7688e3
  40. JeremyRubin force-pushed on Jul 8, 2017
  41. JeremyRubin commented at 9:05 PM on July 8, 2017: contributor
  42. meshcollider commented at 8:24 AM on July 11, 2017: contributor

    Looks good to me, utACK

  43. ryanofsky commented at 3:39 PM on July 11, 2017: member

    utACK 30ac7688e398bbe5400a48ecf0726b679ffb845d. Changes since previous review were addrdb changes going away and some rearranged commits.

  44. sipa commented at 5:42 PM on July 11, 2017: member

    utACK 30ac7688e398bbe5400a48ecf0726b679ffb845d

  45. TheBlueMatt commented at 10:20 PM on July 11, 2017: member

    utACK 30ac7688e398bbe5400a48ecf0726b679ffb845d

  46. sipa commented at 11:23 PM on July 12, 2017: member

    utACK 30ac7688e398bbe5400a48ecf0726b679ffb845d

  47. sipa merged this on Jul 12, 2017
  48. sipa closed this on Jul 12, 2017

  49. sipa referenced this in commit 479afa0f84 on Jul 12, 2017
  50. laanwj referenced this in commit efb4383ef6 on Sep 7, 2017
  51. zkbot referenced this in commit 2c1f65d6e2 on Apr 23, 2018
  52. zkbot referenced this in commit 1c2f24793b on Apr 24, 2018
  53. zkbot referenced this in commit d5b558ba5b on Apr 25, 2018
  54. zkbot referenced this in commit 962ba99b36 on Apr 25, 2018
  55. zkbot referenced this in commit e84f02af8b on May 1, 2018
  56. zkbot referenced this in commit bcd65519ca on May 1, 2018
  57. zkbot referenced this in commit 90304c4214 on May 1, 2018
  58. jasonbcox referenced this in commit 24ad36f71b on May 15, 2019
  59. jonspock referenced this in commit e3c1e66361 on Jun 13, 2019
  60. jonspock referenced this in commit 236274a54e on Jun 13, 2019
  61. PastaPastaPasta referenced this in commit 9da7376071 on Jul 6, 2019
  62. PastaPastaPasta referenced this in commit 6563defdc2 on Jul 6, 2019
  63. PastaPastaPasta referenced this in commit 52d3acbc32 on Jul 8, 2019
  64. PastaPastaPasta referenced this in commit 07fcd907f3 on Jul 9, 2019
  65. PastaPastaPasta referenced this in commit 4cb3e543ac on Jul 11, 2019
  66. PastaPastaPasta referenced this in commit 3dc37b7419 on Jul 13, 2019
  67. PastaPastaPasta referenced this in commit 00df52cd6e on Jul 17, 2019
  68. PastaPastaPasta referenced this in commit 22fd59016f on Jul 17, 2019
  69. PastaPastaPasta referenced this in commit 526036ead8 on Jul 18, 2019
  70. schancel referenced this in commit d1cf65ab2e on Jul 18, 2019
  71. PastaPastaPasta referenced this in commit 45025cec6d on Sep 23, 2019
  72. PastaPastaPasta referenced this in commit a37a5009ee on Dec 21, 2019
  73. PastaPastaPasta referenced this in commit 2b82bd4b70 on Jan 2, 2020
  74. PastaPastaPasta referenced this in commit 82e93d37ff on Jan 4, 2020
  75. PastaPastaPasta referenced this in commit c960f49af6 on Jan 4, 2020
  76. PastaPastaPasta referenced this in commit 40c0fb579b on Jan 10, 2020
  77. PastaPastaPasta referenced this in commit 0e136836b4 on Jan 10, 2020
  78. PastaPastaPasta referenced this in commit 75ec11ee93 on Jan 10, 2020
  79. PastaPastaPasta referenced this in commit 0ac78523bb on Jan 12, 2020
  80. barrystyle referenced this in commit 0d74a46db4 on Jan 22, 2020
  81. zkbot referenced this in commit 7d94064616 on Sep 29, 2020
  82. ckti referenced this in commit f91e3cdb8f on Mar 28, 2021
  83. MarcoFalke referenced this in commit 32f1f021bf on May 5, 2021
  84. random-zebra referenced this in commit 71275c1896 on Jun 9, 2021
  85. gades referenced this in commit b3b8e24d13 on Jun 28, 2021
  86. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 18:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me