utils: Add fstream wrapper to allow to pass unicode filename on Windows #13878

pull ken2812221 wants to merge 4 commits into bitcoin:master from ken2812221:iofstream-custom changing 9 files +228 −16
  1. ken2812221 commented at 4:49 pm on August 4, 2018: contributor

    If compiled with mingw, use glibc++ extension stdio_filebuf to open the file by FILE* instead of filename.

    In other condition, we can use boost::fstream.

  2. ken2812221 force-pushed on Aug 4, 2018
  3. DrahtBot commented at 8:12 pm on August 4, 2018: member
    • #14372 (msvc: build secp256k1 and leveldb locally by ken2812221)
    • #14303 (rpc: Early call once CWallet::MarkDirty in import calls by promag)
    • #14151 (windows: Fix remaining compiler warnings (MSVC) by practicalswift)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. ken2812221 renamed this:
    utils: Add fstream wrapper to allow to pass unicode filename
    utils: Add fstream wrapper to allow to pass unicode filename on Windows
    on Aug 5, 2018
  5. laanwj added the label Windows on Aug 6, 2018
  6. laanwj added the label Utils/log/libs on Aug 6, 2018
  7. DrahtBot added the label Needs rebase on Aug 15, 2018
  8. ken2812221 force-pushed on Aug 15, 2018
  9. alexeyneu commented at 5:12 pm on August 15, 2018: none

    but there’s override:

    0#include <fstream>
    1int main(int argc, char* argv[])
    2{	
    3	std::fstream tr(L"hk");
    4	return 0;
    5}
    

    compiles fine.
    ( msys2 )

    0c++ json_c.cpp -lws2_32 -static-libgcc -static-libstdc++ -static-libgcc -static-libstdc++ -Wl,-Bstatic -lstdc++ -lpthread
    
  10. ken2812221 commented at 5:26 pm on August 15, 2018: contributor
    This is for mingw
  11. DrahtBot removed the label Needs rebase on Aug 15, 2018
  12. alexeyneu commented at 7:10 pm on August 15, 2018: none
    untitled 1 @ken2812221 unexpected statement
  13. ken2812221 commented at 2:55 am on August 16, 2018: contributor
    @alexeyneu To be clear, this is for the mingw on Ubuntu 18.04 which is what we use for gitian to build release binaries. If you can test this on Ubuntu, that would be great.
  14. alexeyneu commented at 10:32 am on August 16, 2018: none
    msys2 .h files are nothing more then msvc copy-paste. it may not work but linux edition has same headers . i’ve used it in msvc with np. this applies to constructor only. this stuff isn’t avaible for .open() and others. last time i’ve seen there smth own-written was windows edition of watcom c .
  15. alexeyneu commented at 11:39 pm on August 16, 2018: none
    Which is to say, i dont see much sense in testing it coz it will work .
    it’s not my PR anyway.
  16. ken2812221 force-pushed on Aug 24, 2018
  17. DrahtBot added the label Needs rebase on Aug 29, 2018
  18. ken2812221 force-pushed on Aug 29, 2018
  19. DrahtBot removed the label Needs rebase on Aug 29, 2018
  20. laanwj added this to the "Blockers" column in a project

  21. ken2812221 force-pushed on Aug 31, 2018
  22. ryanofsky commented at 9:27 pm on September 5, 2018: member

    I don’t think I understand how this PR works (does it depend on #13866?), or why reimplementing fstream classes is a good solution compared to alternatives. If this is a bug in boost, wouldn’t it be better to fix the bug upstream and maybe patch it locally? Or could we drop the use of boost here and instead use std::ifstream and std::ofstream?

    It would be helpful to have a clear description of the bug and possible workarounds.

  23. in src/fs.h:47 in 2e84c0b24a outdated
    41@@ -38,6 +42,41 @@ namespace fsbridge {
    42         void* hFile = (void*)-1; // INVALID_HANDLE_VALUE
    43 #endif
    44     };
    45+
    46+
    47+#if defined WIN32 && defined __GNUC__
    


    ryanofsky commented at 9:36 pm on September 5, 2018:

    Should this be checking for __GLIBCXX__ instead of __GNUC__? It looks like __gnu_cxx::stdio_filebuf might be present in libstdc++ but not libc++ (https://stackoverflow.com/questions/22624503/how-to-get-a-file-descriptor-from-a-stdbasic-ios-for-clang-on-os-x).

    Will different workarounds be needed for MSVC and clang?

  24. ken2812221 commented at 11:18 pm on September 5, 2018: contributor

    boost::filesystem::i/ofstream calls std::filebuf internally. So if std::filebuf(wchar_t*) exist, boost will call that function (MSVC). Otherwise it will call std::filebuf(char*) which can lead to encoding issue. After c++17, we can use std::i/ofstream(std::filesystem::path). But before then, we can’t.

    If building with libstdc++, we can use its extension stdio_filebuf by passing FILE *.

    Will different workarounds be needed for MSVC and clang?

    I don’t know about clang for Windows. But for MSVC, we can use boost::filesystem::i/ofstream.

    does it depend on #13866?

    Without #13866, it would work. But to solve the encoding issue, it requires #13866.

  25. ken2812221 force-pushed on Sep 5, 2018
  26. in src/fs.cpp:203 in bc6a3c6aa9 outdated
    203+    if (file != nullptr) {
    204+        filebuf.close();
    205+        fclose(file);
    206+    }
    207+    file = nullptr;
    208+}
    


    ryanofsky commented at 8:59 pm on September 6, 2018:

    I think it would be good to add a static assert here to detect if this problem happens with other libraries or versions of libraries:

     0#if defined WIN32 && defined __GLIBCXX__
     1...
     2#elif WIN32
     3static_assert(sizeof(*fs::path().BOOST_FILESYSTEM_C_STR) == sizeof(wchar_t),
     4    "Warning: This build is using boost::filesystem ofstream and ifstream "
     5    "implementations which will fail to open paths containing multibyte "
     6    "characters. You should delete this static_assert to ignore this warning, "
     7    "or switch to a different C++ standard library like the Microsoft C++ "
     8    "Standard Library (where boost uses non-standard extensions to construct "
     9    "stream objects with wide filenames), or the GNU libstdc++ library (where "
    10    "a more complicated workaround has been implemented above).");
    11#endif
    
  27. in src/fs.h:60 in bc6a3c6aa9 outdated
    41@@ -38,6 +42,41 @@ namespace fsbridge {
    42         void* hFile = (void*)-1; // INVALID_HANDLE_VALUE
    43 #endif
    44     };
    45+
    46+
    47+#if defined WIN32 && defined __GLIBCXX__
    


    ryanofsky commented at 9:59 pm on September 6, 2018:

    Can we add a comment to describe the bug and workaround? Suggestion:

     0// GNU libstdc++ specific workaround for opening UTF-8 paths on Windows.
     1//
     2// On Windows, it is only possible to reliably access multibyte file paths through
     3// `wchar_t` APIs, not `char` APIs. But because the C++ standard doesn't require
     4// ifstream/ofstream `wchar_t` constructors, and the GNU library doesn't
     5// provide them (in contrast to the Microsoft C++ library, see
     6// https://stackoverflow.com/questions/821873/how-to-open-an-stdfstream-ofstream-or-ifstream-with-a-unicode-filename/822032#822032),
     7// Boost is forced to fall back to `char` APIs which may not work properly.
     8//
     9// Work around this issue by creating stream objects with `_wfopen` in
    10// combination with `__gnu_cxx::stdio_filebuf`. This workaround can be removed
    11// with an upgrade to C++17, where streams can be constructed directly from
    12// `std::filesystem::path` objects.
    
  28. in src/fs.cpp:101 in bc6a3c6aa9 outdated
     96+#if defined WIN32 && defined __GLIBCXX__
     97+
     98+/**
     99+*  reference: https://github.com/gcc-mirror/gcc/blob/571ee70a6d4226145f77b885937a8eda678e0fb2/libstdc%2B%2B-v3/include/std/fstream#L283
    100+*
    101+*  +---------------------------------------------------------+
    


    ryanofsky commented at 10:09 pm on September 6, 2018:

    I think I would drop this table but keep the URL link, since the table just duplicates information provided by the code below, and seems actually less readable than the code.

    For reference, https://en.cppreference.com/w/cpp/io/basic_filebuf/open also has a similar table.

  29. in src/fs.cpp:167 in bc6a3c6aa9 outdated
    162+}
    163+
    164+void ifstream::open(const fs::path& p, std::ios_base::openmode mode)
    165+{
    166+    close();
    167+    file = fopen(p, openmodeToStr(mode).c_str());
    


    ryanofsky commented at 10:15 pm on September 6, 2018:

    I don’t think it’s obvious that this is intended to call the fsbridge::fopen function above rather than ::fopen. Would suggesting adding a comment here or writing the namespace.

    Could also do the same in ofstream::open below.

  30. in src/fs.cpp:171 in bc6a3c6aa9 outdated
    166+    close();
    167+    file = fopen(p, openmodeToStr(mode).c_str());
    168+    if (file == nullptr) {
    169+        return;
    170+    }
    171+    filebuf = std::move(__gnu_cxx::stdio_filebuf<char>(file, mode));
    


    ryanofsky commented at 10:16 pm on September 6, 2018:

    std::move can be dropped here. Argument is already an rvalue so it has no effect.

    Same applies in ofstream::open below.

  31. in src/fs.h:73 in bc6a3c6aa9 outdated
    68+        void open(const fs::path& p, std::ios_base::openmode mode = std::ios_base::out);
    69+        bool is_open() { return filebuf.is_open(); }
    70+        void close();
    71+
    72+    private:
    73+        __gnu_cxx::stdio_filebuf<char> filebuf;
    


    ryanofsky commented at 10:19 pm on September 6, 2018:
    Could you prefix these members with m_ to match recommended style?
  32. in src/fs.cpp:169 in bc6a3c6aa9 outdated
    169+        return;
    170+    }
    171+    filebuf = std::move(__gnu_cxx::stdio_filebuf<char>(file, mode));
    172+    rdbuf(&filebuf);
    173+    if (mode & std::ios_base::ate) {
    174+        seekg(0, std::ios_base::end);
    


    ryanofsky commented at 10:21 pm on September 6, 2018:
    Would be nice to have a unit test to make sure this logic works. I believe this code will actually run on travis through wine.
  33. ryanofsky approved
  34. ryanofsky commented at 10:31 pm on September 6, 2018: member
    utACK f2b00f16cc5ca12fa0cabc2bedfbf7e0ba094dfa, but only if #13866 is merged first, since this change by itself doesn’t fix anything or really make sense without #13866.
  35. ken2812221 force-pushed on Sep 7, 2018
  36. ken2812221 force-pushed on Sep 7, 2018
  37. ken2812221 force-pushed on Sep 7, 2018
  38. ken2812221 force-pushed on Sep 7, 2018
  39. ken2812221 force-pushed on Sep 7, 2018
  40. ken2812221 force-pushed on Sep 7, 2018
  41. ken2812221 force-pushed on Sep 7, 2018
  42. ken2812221 force-pushed on Sep 7, 2018
  43. ken2812221 force-pushed on Sep 7, 2018
  44. ken2812221 force-pushed on Sep 7, 2018
  45. ken2812221 commented at 10:58 pm on September 7, 2018: contributor
    @ryanofsky requested change included and unit tests added.
  46. ken2812221 force-pushed on Sep 9, 2018
  47. DrahtBot added the label Needs rebase on Sep 13, 2018
  48. ken2812221 force-pushed on Sep 13, 2018
  49. ken2812221 force-pushed on Sep 13, 2018
  50. ken2812221 force-pushed on Sep 13, 2018
  51. DrahtBot removed the label Needs rebase on Sep 13, 2018
  52. in src/fs.h:54 in 0548f48f19 outdated
    49+// On Windows, it is only possible to reliably access multibyte file paths through
    50+// `wchar_t` APIs, not `char` APIs. But because the C++ standard doesn't require
    51+// ifstream/ofstream `wchar_t` constructors, and the GNU library doesn't
    52+// provide them (in contrast to the Microsoft C++ library, see
    53+// https://stackoverflow.com/questions/821873/how-to-open-an-stdfstream-ofstream-or-ifstream-with-a-unicode-filename/822032#822032),
    54+// Boost is forced to fall back to `char` APIs which may not work properly.
    


    ryanofsky commented at 6:32 pm on September 14, 2018:

    In commit “utils: Add fsbridge fstream function wrapper” (0548f48f19d33cb57b8279d7a647e91d375a916e)

    I know this is my own comment, maybe replace “APIs” with “constructors” on this to make clearer this is referring to the same constructors previously mentioned above.

  53. in src/test/fs_tests.cpp:12 in 6b6c598ab5 outdated
     7+
     8+#include <boost/test/unit_test.hpp>
     9+
    10+BOOST_FIXTURE_TEST_SUITE(fs_tests, BasicTestingSetup)
    11+
    12+BOOST_AUTO_TEST_CASE(fsbridge_fstream)
    


    ryanofsky commented at 6:34 pm on September 14, 2018:

    In commit “tests: Add test case for std::ios_base::ate” (6b6c598ab55c60ed5050c0ba0a817dba899384c0)

    Would it be possible to add a test accessing a utf8 filesystem path now that #13866 is merged?


    ken2812221 commented at 9:02 pm on September 14, 2018:
    Sure. But this seems only work if #13877 merged, so change this PR to based on it.
  54. ryanofsky approved
  55. ryanofsky commented at 6:38 pm on September 14, 2018: member
    utACK 6b6c598ab55c60ed5050c0ba0a817dba899384c0. Changes since previous review were various suggestions like adding static_assert, m_ prefixes, comments, and new tests. Thanks especially for the new tests.
  56. ken2812221 force-pushed on Sep 14, 2018
  57. ken2812221 force-pushed on Sep 14, 2018
  58. ken2812221 force-pushed on Sep 14, 2018
  59. ken2812221 force-pushed on Sep 14, 2018
  60. in src/fs.h:49 in 48373c3b20 outdated
    41@@ -39,6 +42,54 @@ namespace fsbridge {
    42     };
    43 
    44     std::string get_filesystem_error_message(const fs::filesystem_error& e);
    45+
    46+    // GNU libstdc++ specific workaround for opening UTF-8 paths on Windows.
    47+    //
    48+    // On Windows, it is only possible to reliably access multibyte file paths through
    49+    // `wchar_t` constructors, not `char` APIs. But because the C++ standard doesn't
    


    ryanofsky commented at 6:40 pm on September 21, 2018:
    Word “constructors” on this line should be changed back to “APIs”, because this is just referring in general to how multibyte paths need to be accessed on windows, not to stream object constructors in particular.
  61. ryanofsky commented at 6:47 pm on September 21, 2018: member
    utACK b11cf20f8255bf20fc3c80117308fcf77424545a. Only changes since last review were tweaking a comment and updating the tests to use utf8 paths for better coverage. Making the test work required #13877, so that PR is now included here (and hopefully can be merged first).
  62. in src/fs.h:65 in b11cf20f82 outdated
    60+#if defined WIN32 && defined __GLIBCXX__
    61+    class ifstream : public std::istream
    62+    {
    63+    public:
    64+        ifstream() = default;
    65+        ifstream(const fs::path& p, std::ios_base::openmode mode = std::ios_base::in) { open(p, mode); }
    


    practicalswift commented at 8:10 am on September 23, 2018:
    02018-09-22 21:43:16 cpplint(pr=13878): src/fs.h:65:  Constructors callable with one argument should be marked explicit.  [runtime/explicit] [5]
    
  63. utils: Add fsbridge fstream function wrapper 86eb3b3f1a
  64. Move boost/std fstream to fsbridge a554cc901a
  65. tests: Add test case for std::ios_base::ate f86a571edb
  66. Make MSVC compiler read the source code using utf-8 43c7fbb1e7
  67. in src/fs.h:79 in b11cf20f82 outdated
    74+    };
    75+    class ofstream : public std::ostream
    76+    {
    77+    public:
    78+        ofstream() = default;
    79+        ofstream(const fs::path& p, std::ios_base::openmode mode = std::ios_base::out) { open(p, mode); }
    


    practicalswift commented at 8:10 am on September 23, 2018:
    02018-09-22 21:43:16 cpplint(pr=13878): src/fs.h:79:  Constructors callable with one argument should be marked explicit.  [runtime/explicit] [5]
    
  68. ken2812221 force-pushed on Sep 26, 2018
  69. ken2812221 commented at 0:49 am on September 26, 2018: contributor
    Rebased and fix all nits from the comment
  70. in build_msvc/common.vcxproj:20 in 43c7fbb1e7
    11@@ -12,4 +12,9 @@
    12           Outputs="$(MSBuildThisFileDirectory)..\src\config\bitcoin-config.h">
    13       <Copy SourceFiles="$(MSBuildThisFileDirectory)bitcoin_config.h" DestinationFiles="$(MSBuildThisFileDirectory)..\src\config\bitcoin-config.h" />
    14   </Target>
    15+  <ItemDefinitionGroup>
    16+    <ClCompile>
    17+      <AdditionalOptions>/utf-8 %(AdditionalOptions)</AdditionalOptions>
    18+    </ClCompile>
    19+  </ItemDefinitionGroup>
    20 </Project>
    


    practicalswift commented at 5:18 am on October 2, 2018:
    Add missing newline at end of file :-)

    ken2812221 commented at 4:31 am on October 3, 2018:
    That seems unnecessary for MSVC project file, they can work without the extra newline. But you can add it in your PR. I don’t have strong opinion.
  71. laanwj commented at 8:35 am on October 18, 2018: member
    utACK 86eb3b3f1ab594142b6baa9576717ff121f3b745
  72. laanwj merged this on Oct 18, 2018
  73. laanwj closed this on Oct 18, 2018

  74. laanwj referenced this in commit 9c5f0d542d on Oct 18, 2018
  75. laanwj removed this from the "Blockers" column in a project

  76. ken2812221 deleted the branch on Oct 18, 2018
  77. Warrows referenced this in commit 1d90b06ee9 on Oct 14, 2019
  78. Warrows referenced this in commit 9a4b384306 on Nov 23, 2019
  79. deadalnix referenced this in commit 52d310b4a2 on Mar 24, 2020
  80. kittywhiskers referenced this in commit 16fe51271a on Jun 15, 2021
  81. kittywhiskers referenced this in commit 6a7202d257 on Jun 16, 2021
  82. kittywhiskers referenced this in commit 93d7ecabf6 on Jun 24, 2021
  83. PastaPastaPasta referenced this in commit e8875967b5 on Jun 25, 2021
  84. random-zebra referenced this in commit 61a098a775 on Aug 5, 2021
  85. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me