re: #22937 (review)
I prefer to overwrite the path constructor function and .string()
to make it compatible with UTF-8 and remain the implicit conversion function.
I think allowing implicit conversion will probably be a good idea a few years in the future when we can require the UTF-8 code page on windows, but right now in October 2021 implicit conversion is unsafe. The fact that it took 5 developers 1 month and 93 comments on this PR to discover all the unsafe implicit conversions that would happen after dropping imbue
in #20744 is good evidence that implicit conversions are not safe right now. I think the history of all your past fixes in #15468, #14192, #14128, #13888, #13886, #13883, #13878, #13877, #13866, #13787, and #13734 are another lesson on why we should not silently convert unicode objects to and from strings using the windows code page.
This PR does not get in the way of enabling implicit conversions in the future. The C++ standard authors obviously think that implicit conversions can be safe, and I’d agree with them as long as implicit conversions use some variant of UTF-8 instead of the current windows code page. The boost developers also thought implicit conversions were safe, but unlike the C++ standard path
class, the boost path
class actually provided a way to control the implicit conversions using imbue
.
We could also go the opposite direction on implicit conversions the future and limit them even more, see #22937 (review). But this PR is just trying to take a middle path and fix existing bugs that would happen with imbue
dropped, while providing a good margin of safety to prevent new bugs being introduced.
Because u8string()
would return std::u8string
in C++20 and I don’t think it is a nice idea to handle another type of string. There is no simple way to convert between std::string
and std::u8string
.
If we want to avoid std::u8string
showing in the code when we bump to C++20, the C++ standard behavior must change anyway. Isn’t it be simpler to modify the path constructor and .string()
function? Also, we don’t have to change that many codes.
I don’t think these comments about the u8string()
method are related to whether implicit conversion should be allowed. This PR is just defining the u8string()
method temporarily because boost path class doesn’t provide one, and this will be dropped in #20744 because the standard path class does provides one.
This PR does add 7 calls to the u8string()
method (only in Qt and RPC code) and these calls should be able to take advantage of the new std::u8string
type whenever that update happens.