I ran into this while testing BIP101. My mempool was pretty full, so GBT (getblocktemplate) was slow. During one GBT call that took about 11 seconds, ckpool found a block:
[2015-11-14 00:26:16.931] Possible block solve diff 541623.733740 !
[2015-11-14 00:26:26.164] HTTP socket read+write took 10.954s in json_rpc_call ( "getblock...)
[2015-11-14 00:26:29.144] BLOCK ACCEPTED!
[2015-11-14 00:26:29.157] Solved and confirmed block 604647
Note the timestamps. That's 12.2 seconds between ckpool detecting the block solve and bitcoind accepting the block. GBT itself took 10.9 seconds; the other 1.3 seconds could be block verification time, or time spent handling the other network messages that piled up while GBT was running. IIRC, this was a 5 MB block.
I saw this on BitcoinXT, not Bitcoin Core, but Core is affected too, so I thought I should notify upstream. The underlying issue is how locking is done in CreateNewBlock and elsewhere. Both submitblock (rpcmining.cpp:639) and getblocktemplate lock cs_main, so a block submission that arrives during a slow GBT call has to wait for that call to finish before it can even start.
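To make the contention concrete, here is a minimal Python sketch of the pattern, not the actual bitcoind code. The lock named `cs_main` is only a stand-in for the real one, and `get_block_template`, `submit_block`, and `simulate` are hypothetical names made up for illustration. It models a slow GBT call holding a global lock while a block submission waits on the same lock.

```python
import threading
import time

# Stand-in for bitcoind's global cs_main lock (illustrative only).
cs_main = threading.Lock()

def get_block_template(hold_seconds, started):
    """Model a slow GBT call: hold the global lock for the whole call."""
    with cs_main:
        started.set()              # signal that the lock is now held
        time.sleep(hold_seconds)   # pretend to build a large template

def submit_block():
    """Model submitblock: needs the same lock. Returns seconds spent waiting."""
    t0 = time.monotonic()
    with cs_main:
        waited = time.monotonic() - t0
    return waited

def simulate(hold_seconds=0.2):
    """Start a slow GBT call, then submit a block while it is still running."""
    started = threading.Event()
    gbt = threading.Thread(target=get_block_template,
                           args=(hold_seconds, started))
    gbt.start()
    started.wait()                 # the block solve arrives mid-GBT
    waited = submit_block()
    gbt.join()
    return waited

if __name__ == "__main__":
    print(f"submit_block waited {simulate(0.2):.3f}s for the lock")
```

In this toy model the wait in `submit_block` tracks the time the template builder holds the lock, which matches the shape of the delay in the log above.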
One workaround is to run multiple bitcoinds and have the pool server call submitblock on all of them in parallel. Kinda nasty, though. Another workaround is to speed up CNB (CreateNewBlock), a la @morcos's work. The locks should still get fixed, though.
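For the multiple-bitcoind workaround, something like the following Python sketch is what I have in mind on the pool side. This is not ckpool code. The node dictionaries (`url`, `user`, `password`) and the function names `rpc_submitblock` and `submit_to_all` are assumptions made up for the example. It only relies on the standard `submitblock` JSON-RPC method over HTTP with basic auth.

```python
import base64
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def rpc_submitblock(node, block_hex, timeout=30):
    """POST one submitblock JSON-RPC request; return the 'result' field."""
    payload = json.dumps({
        "jsonrpc": "1.0",
        "id": "pool",
        "method": "submitblock",
        "params": [block_hex],
    }).encode()
    creds = f"{node['user']}:{node['password']}".encode()
    req = urllib.request.Request(
        node["url"],
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + base64.b64encode(creds).decode(),
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["result"]

def submit_to_all(nodes, block_hex, submit=rpc_submitblock):
    """Submit the block to every node concurrently.

    Returns a dict mapping node URL to the RPC result, or to the exception
    raised for that node, so one busy or dead node cannot block the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = {pool.submit(submit, n, block_hex): n["url"] for n in nodes}
        for fut, url in futures.items():
            try:
                results[url] = fut.result()
            except Exception as exc:   # record the failure, keep going
                results[url] = exc
    return results
```

The `submit` parameter is there only so the fan-out can be exercised without a live node. The point is that a node stuck in a long GBT call delays only its own submission, while any idle node can accept and relay the block right away.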