Fix for #956 (GUI freeze on Windows) #980

pull Diapolo wants to merge 26 commits into bitcoin:master from Diapolo:fix#956 changing 3 files +82 −34
  1. Diapolo commented at 12:51 PM on March 23, 2012: none

    Okay, so my former assumption boost 1.47 would fix #956 was wrong and boost 1.49 did not fix it either. It turns out, that I could recreate the error condition by running Bitcoin-Qt and simply switch of my PC. So there were the 2 orphan files from boost interprocess that cause the problem.

    I found out, that no interprocess_exception was thrown, not in qtipcserver.cpp and not in bitcoin.cpp. Next step was to look through the code and the "create_or_open" option in the message_queue constructor got my attention. I changed this to create_only and voila there was an exception I could work with. It sais "file already exists", which makes sens as there are those orphan files left.

    In the end I introduced a new ipcRecover() function, which is called after catching interprocess::already_exists_error. This function works with the path were the stale message queue file "BitcoinURL" resides and checks if the file is there and tries to remove it. If that succeeds the function returns true and ipcInit() is called again. If this fails an error is written to the debug.log.

    Test case with 1.47 boost libs:

    • kill running Bitcoin-Qt instance by switching PC of
    • check for stale files, which are there
    • starting the client again with the fix disabled in the code
    • no exception, no log entries and a frozen GUI
    • starting the client again with fix enabled results in a not frozen GUI and normal client operation

    Log: ipcInit - boost interprocess exception #9: File exists. ipcRecover - old message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...success

    I know there are many commits in here, so I would suggest to first only look at the diff and if you are interested in the history of the fix take a look at the commits itself.

    As I only run Windows the Linux / Mac devs should test this code, during our discussion.

  2. TheBlueMatt commented at 10:51 PM on March 23, 2012: member

    How does bitcoin-qt cope if you start a second instance which runs this patch (ie a testnet one and a mainnet one at the same time)?

  3. in src/qt/qtipcserver.cpp:None in 360dedbe2c outdated
     113 | +        if (ex.get_error_code() == interprocess::already_exists_error)
     114 | +        {
     115 | +            // try a recovery to fix #956 and pass our message queue name
     116 | +            if (ipcRecover("BitcoinURL"))
     117 | +                // if that worked try init once more
     118 | +                ipcInit();
    


    TheBlueMatt commented at 10:53 PM on March 23, 2012:

    Though it looks like it should probably never happen, the possibility of an infinite recursion here scares me.


    Diapolo commented at 11:00 PM on March 23, 2012:

    I had the same in my mind and came to the conclusion it should never happen ... I don't decide it, but I guess there are several places in the code where are theoretical chances that something bad happens, you could surround nearly everything with try - catch. But let's discuss it :).


    TheBlueMatt commented at 11:21 PM on March 23, 2012:

    Or just add a simple flag to say, dont try again if you fail here...


    Diapolo commented at 11:40 PM on March 23, 2012:

    Yes, that's even better.


    luke-jr commented at 3:06 AM on March 24, 2012:

    Don't shoot me, but I suggest goto...


    Diapolo commented at 9:10 AM on March 24, 2012:

    I wont ever use a goto in C-like languages ^^ ... you would have to force me to do this :D!

  4. Diapolo commented at 11:02 PM on March 23, 2012: none

    @TheBlueMatt I will have to try this ... never used testnet before (does this work in the default and same data-dir?). But I have an idea if it causes problems in the current state. Perhaps I could add an "testnet" string to the message_queue name, which is now "BitcoinURL". Is there a utility function to query where I mine?

  5. TheBlueMatt commented at 11:28 PM on March 23, 2012: member

    Adding a testnet string doesnt fix the problem, there are also valid uses for opening two bitcoin-qt.exes on mainnet at the same time (using different datadirs and ports).

  6. Diapolo commented at 11:45 PM on March 23, 2012: none

    @TheBlueMatt I will do some tests on this tomorrow. I don't even know if there are issues with your mentioned case at all, but thanks for your input :)!

  7. sipa commented at 1:17 AM on March 24, 2012: member
    1. can you rebase this into one commit? (15 is a bit excessive for a bug fix...)
    2. there seem to be some coding style errors and unneccesary code moves.
    3. if this indeed fixes #956, nice! that was the last blocking bug for 0.6.0, afaik.
  8. in src/qt/qtipcserver.cpp:None in 8c81daa2b6 outdated
      16 | -using namespace boost::posix_time;
      17 |  using namespace boost;
      18 |  using namespace std;
      19 |  
      20 | +// Global state
      21 | +bool fInitCalledAfterRecovery = false;
    


    laanwj commented at 8:26 AM on March 24, 2012:

    Please make this a parameter to ipcInit instead of a global

  9. Diapolo commented at 9:13 AM on March 24, 2012: none

    @sipa What do you mean by coding style errors, would be nice if you can point me to them, so I can fix it. I only added so many commits, to get some kind of history, will rebase.

  10. in src/qt/bitcoin.cpp:None in 8c81daa2b6 outdated
     256 | @@ -256,6 +257,8 @@ int main(int argc, char *argv[])
     257 |                              mq.try_send(strURL, strlen(strURL), 0);
     258 |                          }
     259 |                          catch (boost::interprocess::interprocess_exception &ex) {
     260 | +                            printf("boost interprocess exception #%d: %s\n", ex.get_error_code(), ex.what());
     261 | +							break;
    


    laanwj commented at 9:43 AM on March 24, 2012:

    as you asked for coding style errors: please indent this break the same depth as the printf :)


    Diapolo commented at 9:52 AM on March 24, 2012:

    Thanks, used tabs there, doh ...

  11. Diapolo commented at 9:53 AM on March 24, 2012: none

    Question on coding style, is it common to only use ex for exeption handling, no matter what type they are? If yes, I will revert that, too. And what about boost::interprocess::message_queue mq -> boost::interprocess::message_queue mqMessageQueue, good or bad? I'm asking, because in bitcoin.cpp it's still mq.

  12. Diapolo commented at 10:16 AM on March 24, 2012: none

    But back to the test case ... as I currently can't compile Bitcoin-Qt with static libs, I'm not able to run 2 instanced at the same time. But I think this definately needs some further investigation and clarification!

    So 2 instances with different ports and different data-dirs currently work on RC4? They even share the same message queue, right? Then this fix needs to take this into account and I have an idea, but need your thoughts.

    Current fix version will do the following (I assume now they share the message queue file BitcoinURL):

    • start 1st instance, BitcoinURL will be created, because of interprocess::create_only flag in ipcInit() -> OK
    • start 2nd instance, BitcoinURL can't be created, because it's already there and the files is locked by the 1st instance -> triggers the exception interprocess::already_exists_error and tries to remove the file via ipcRecover(true)
    • this will cause an system::errc::broken_pipe error, which is catched and BitcoinURL is not deleted and should be normaly used in the further client execution, because in bitcoin.cpp we use "interprocess::message_queue mq(interprocess::open_only, "BitcoinURL");", which should work. Running 2 instances would trigger some entries to debug.log, but should not cause a program failure. What are your oppinions?

    Edit: I'm off for a few hours, will be back later :).

  13. laanwj commented at 10:38 AM on March 24, 2012: member

    So to summarize: If you start multiple instances, it will work, but the most recent instance will get the URLs?

    That behavior is OK with me.

    I don't think many people start multiple instances of bitcoin, and if they do, they generally have no use case involving a browser (more for servers etc...). However, it should be possible to start multiple instances with multiple datadirs.

  14. Diapolo commented at 1:42 PM on March 24, 2012: none

    Okay, so the latest version is more friendly in terms of what is displayed in the debug.log.

    stale message queue IS there and can be removed: ipcInit - boost interprocess exception #9: File already exists. ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...success

    This tries ipcInit() again!


    message queue IS there and locked (this would be if another instance is running): ipcInit - boost interprocess exception #9: File already exists. ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...unneeded

    This doesn't try ipcInit() again!


    (stale?) message queue IS there, not locked, but cannot be removed: ipcInit - boost interprocess exception #9: File already exists. ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...failed ipcRecover - removal of old message queue failed with error (value): (message)

    This doesn't try ipcInit() again! But URL handling will most likely not work ...


    Edit: I will rebase after the discussion is finished, if this is okay!

  15. in src/qt/qtipcserver.cpp:None in e424384cb3 outdated
      27 | +bool ipcRecover(const char* pszFilename)
      28 |  {
      29 | -    message_queue* mq = (message_queue*)parg;
      30 | +    std::string strIpcDir;
      31 | +    // get path to stale ipc message queue file
      32 | +    interprocess::ipcdetail::tmp_filename(pszFilename, strIpcDir);
    


    luke-jr commented at 1:23 AM on March 25, 2012:

    There is no interprocess::ipcdetail on Linux.


    Diapolo commented at 11:30 AM on March 25, 2012:

    Investigating... got it, with boost 1.47 this is "detail" instead of "ipcdetail" for 1.49 ... will see if the fix still compiles with 1.47 (I was on 1.49 in my build-lab).

    I would vote for a switch to 1.49 in the near future!


    Diapolo commented at 12:11 PM on March 25, 2012:

    Fixed with last commit, compiles again with boost 1.47!

  16. in src/qt/bitcoin.cpp:None in 20862f6395 outdated
     134 | @@ -133,13 +135,14 @@ int main(int argc, char *argv[])
     135 |          {
     136 |              const char *strURL = argv[i];
     137 |              try {
     138 | -                boost::interprocess::message_queue mq(boost::interprocess::open_only, "BitcoinURL");
     139 | +                interprocess::message_queue mq(interprocess::open_only, "BitcoinURL");
    


    luke-jr commented at 12:38 PM on March 25, 2012:

    Don't see a reason to change this. It's better to have explicit namespaces, IMO.


    Diapolo commented at 1:04 PM on March 25, 2012:

    There IS really no consistency in the source code, some us it, some not ... what's the coding standard? I'm fine with leaving out using namespace boost, but then this should be changed globally. Or can you explain it to me?


    luke-jr commented at 9:06 PM on March 25, 2012:

    I think it's been "use absolute namespaces in new code, but don't bother changing old code". In this case, you're changing absolute to relative, and it also makes it harder to find the actual code changes. ;)

  17. in src/qt/qtipcserver.cpp:None in 20862f6395 outdated
      27 | +bool ipcRecover(const char* pszFilename)
      28 |  {
      29 | -    message_queue* mq = (message_queue*)parg;
      30 | +    std::string strIpcDir;
      31 | +    // get path to stale ipc message queue file (hint: ::detail changes to ::ipcdetail when switching to boost 1.49)
      32 | +    interprocess::detail::tmp_filename(pszFilename, strIpcDir);
    


    luke-jr commented at 12:42 PM on March 25, 2012:

    It doesn't matter which boost version's internals you use, this is still Windows-only... Isn't there a way to recover with documented methods?


    Diapolo commented at 1:06 PM on March 25, 2012:

    As I said, treat me as a newb, so ... what you mean by Windows-only?

    I read through the boost docs and found no other way to get the current path and filename to the stale file. We have a condition here, that causes problems and I'm only trying to fix it ...


    luke-jr commented at 1:13 PM on March 25, 2012:

    boost::interprocess doesn't use files, except on Windows. Other platforms don't have these internals.


    Diapolo commented at 2:23 PM on March 25, 2012:

    Thanks, didn't know that ... I'm rewriting the code to remove the need for directly calling functions from ::detail / ::ipcdetail!

  18. sipa commented at 2:25 PM on March 25, 2012: member

    Since this is a workaround for a boost-specific windows-only problem, I have no problem with using undocumented and windows-only boost internals. You do need to make sure it keeps working on other systems though, so enclose the fix in #ifdef WIN32 / #else / #endif, and try to keep the windows-specific part as small as possible. You'll probably need other #ifdefs to select specific boost versions as well, but for now that seems the only option. At least building on non-windows systems should keep working with all (recent) boost versions.

    Regarding explicit namespace or using namespace: do whatever makes the code easiest to read. I don't care.

  19. Diapolo commented at 2:47 PM on March 25, 2012: none

    For now I will add the needed #ifdefs, so that the fix is activated only on Windows. After we are sure this works and breaks nothing, I will definately look into how I can write this in a platform independent way (new pull request and no blocking change ^^).

    Oh and thanks for the "using namespace" clarification ... that was in my mind, too. Easier to read code and better understandable code.

  20. gavinandresen commented at 3:36 PM on March 25, 2012: contributor

    It feels to me like this feature (click-to-open bitcoin: URIs) needs a re-think on Windows; poking boost::interprocess internals is a bad idea that we should avoid if at all possible.

    For the 0.6 release, I'm inclined to disable click-to-open bitcoin: URIs on Windows entirely, and get a cleaner fix for 0.7.

  21. Diapolo commented at 4:02 PM on March 25, 2012: none

    Tell me the intended release date for 0.6 ... I'm trying to get a cleaner fix ready.

  22. TheBlueMatt commented at 4:42 PM on March 25, 2012: member

    Due to the massive list of changes since rc4, we really need to push an rc5 and give it a while to simmer before we can really release a 0.6 release (IMO). In that case, I have no problem merging this and poking windows-only boost internals, as long as all the changes in here are within #ifdef WIN32's

  23. Diapolo commented at 5:03 PM on March 25, 2012: none

    May I suggest the following, as this one has plenty of commits, I will rebase it into a single one with the current code here.

    It could then be used in a possible RC5, but I would really like a dev to test a compilation and execution with Linux / Mac before, as I can't do this. During the testing phase of a possible RC5 I have time to work on a cross-platform compatible version without the use of boost::interprocess internals.

  24. sipa commented at 5:06 PM on March 25, 2012: member

    Don't worry, we won't merge it if it doesn't compile cleanly.

  25. Diapolo commented at 5:50 PM on March 25, 2012: none

    Last commit does not use any boost::interprocess internals, it works and removes stale message queues (tested this after I once more crashed / switched of my Windows machine to create a stale mq).

    If an existing message queue is detected the first try is to remove it. If it's locked (i.e. by another BC instance) that won't work and I try ipcInit() again with the open_or_create mode on the message queue to use it. If the stale message queue could be removed, ipcInit() is called once more with the create_only mode for the mq and a new one is created.

    Perhaps I can now even safely remove the #ifdef WIN32, but this should be checked by the Linux / Mac devs.

    Edit: The old ipcRecover() is in just for reference during the discussion, I would like to remove that one and change ipcRecover2() into ipcRecover()!

  26. TheBlueMatt commented at 8:20 PM on March 25, 2012: member

    An alternative to this that (should) work and also fixes #981 can be found at: https://github.com/TheBlueMatt/bitcoin/commit/1b327daeb510d91c065ea3610c4b5740c743fffe though its more of a hack since I really, really dont feel like diving into the win32 apis and figuring out whats going on in that ugly mess.

  27. Diapolo commented at 8:28 PM on March 25, 2012: none

    @TheBlueMatt I compiled Bitcoin-Qt with boost 1.49 and that did NOT fix this bug. Is your commit a backport from boost > 1.47 to 1.47?

  28. TheBlueMatt commented at 8:37 PM on March 25, 2012: member

    No, that commit is something I wrote, afaict boost has not yet fixed the bug (though it was filed 12 months ago as https://svn.boost.org/trac/boost/ticket/5392 )

  29. gavinandresen commented at 3:19 PM on March 26, 2012: contributor

    I'm going to be the "bad guy" for this one and take the conservative route: lets get this fixed and thoroughly tested early in the 0.7 release, and disable bitcoin: URI handling on Windows for the 0.6 release.

  30. sipa commented at 3:21 PM on March 26, 2012: member

    Given #985, it seems URI's never worked on windows anyway...

  31. Diapolo commented at 3:38 PM on March 26, 2012: none

    The URI feature is little known, so perhaps the problem with URI handling in general was not observed, since no one ever used it on Windows (i.e. because currently there are very few links out in the internet). @gavinandresen I'm fine with 0.7 ... you would only be the bad guy, if you said that code sucks all over the place, go away ;) ... but you didn't do that.

  32. Diapolo commented at 8:13 PM on March 26, 2012: none

    Would the main devs prefer a rebase now, so that it's a single commit or after it's discussed, tested and finished?

  33. added debug.log output for boost::interprocess::interprocess_exception c656f2399a
  34. removed using namespace for boost::interprocess and boost::posix_time to make debugging easier 6caed63f89
  35. changed message_queue properties from open_or_create to create_only (only this triggers an exception after a hard Bitcoin-Qt crash) 2a1a94cb05
  36. forgot to add an boost::interprocess after removing that namespace 1d8a449602
  37. integrated an ipcRecover() function to detect and remove a stale ipc message queue file 6779cba486
  38. fixed typo: exeption -> exception 261b80da4d
  39. only call ipcRecover, if the boost interprocess exception threw already_exists_error 13cc75a0e6
  40. include boost interprocess exception error number in debug.log for
    exceptions in bitcoin.cpp, too
    989b954b9d
  41. added safety check to filesystem::remove() to avoid another exception, if access to the file is denied and made ipcRecover() return a bool 6d0ae4d9e8
  42. removed prefix boost:: as we are using namespace boost e05676473e
  43. removed 2 lines of code for writing debug.log messages 8ef8458a40
  44. polished boost exception error message strings bbfb0e8eee
  45. coding style polish 5645e4e69a
  46. if ipcRecover() succeeds, try ipcInit() again e87f2a241a
  47. coding style polish f1634c3b66
  48. added flag fInitCalledAfterRecovery to avoid an infinite recursion in the exception handling 2608e2428c
  49. made fInitCalledAfterRecovery a parameter to ipcInit() a6f09c5b8c
  50. added using namespace boost to bitcoin.cpp too and fixed an indent glitch 830fe3ece9
  51. added code to take care of a locked BitcoinURL message queue file f5904a2707
  52. renamed ::ipcdetail to ::detail, as we are still using boost 1.47 8ad21c9942
  53. made fix active only on Windows, prepared for a switch to boost > 1.47 and some comments were updated b7cbbf5d02
  54. revert naming of mqMessageQueue -> mq and exIPCException -> ex to be consistent with bitcoin.cpp again b036928394
  55. added ipcRecover2() to introduce a version which does not rely on interprocess:detail / interprocess::ipcdetail internals ff443fa791
  56. renamed ipcRecover2() -> ipcRecover() and removed old function / added qtipcserver.h to qtipcserver.cpp 5327f2c55d
  57. added BCQT_MESSAGE_QUEUE_NAME definition in qtipcserver.h to define the message queue name globally d180418544
  58. added doxygen comments to qtipcserver.cpp and qtipcserver.h / revert disabling of URI handling in Windows ee440453a0
  59. Diapolo commented at 9:49 AM on March 27, 2012: none

    Rebased with current master and fixed up a little git mess I created ...

  60. Diapolo commented at 9:26 PM on March 29, 2012: none

    Doh, this one seems completely obsolete in the light of my current work on the IPC server and URI-handling code. It's a pain in Windows to use a "file" for doing IPC communication it seems (at least when things go wrong ^^). There a severe handling problems with stale / wrong mq files and the combination of 2 Bitcoin-Qt instances ... but I made some good progress.

    I would like to close this one tomorrow and will publish my work until the weekend, so it can be discussed and reviewed!

  61. Diapolo commented at 11:33 AM on April 1, 2012: none

    Closed!

  62. Diapolo closed this on Apr 1, 2012

  63. suprnurd referenced this in commit e8ec9a6e06 on Dec 5, 2017
  64. ptschip referenced this in commit 030bf01d91 on Mar 7, 2018
  65. lateminer referenced this in commit 8b81d8f6f9 on Oct 30, 2019
  66. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 03:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me