Put simply: a "minority can earn revenue in excess of their contribution", as written in this paper: http://arxiv.org/abs/1311.0243
Surprisingly, for a big enough pool (such as the ones that already exist), the expected return from keeping a mined block secret for a few minutes outweighs the risk of someone else beating you to it.
The paper suggests a solution (should take less than 5 minutes):
"We propose a simple, backwards-compatible change to the Bitcoin protocol to address this problem and raise the threshold. Specifically, when a miner learns of competing branches of the same length, it should propagate all of them, and choose which one to mine on uniformly at random. In the case of two branches of length 1, as discussed in Section 4, this would result in half the nodes (in expectancy) mining on the pool's branch and the other half mining on the other branch. This yields γ = 1/2, which in turn yields a threshold of 1/4. Each miner implementing our change decreases the selfish pool's ability to increase γ through control of data propagation. This improvement is independent of the adoption of the change at other miners, therefore it does not require a hard fork. This change to the protocol does not introduce new vulnerabilities to the protocol: Currently, when there are two branches of equal length, the choice of each miner is arbitrary, effectively determined by the network topology and latency. Our change explicitly randomizes this arbitrary choice, and therefore does not introduce new vulnerabilities." (Eyal & Sirer 2013)
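To make the quoted proposal concrete, here is a rough Python sketch of the two ideas in it: breaking ties between equal-length branches uniformly at random, and the resulting profitability threshold. This is not Bitcoin client code. The names `choose_branch` and `selfish_threshold` are made up for illustration, and branches are modelled as plain lists of block ids. The threshold formula (1 - γ) / (3 - 2γ) is the one given in the paper, where γ is the parameter the quote sets to 1/2: the fraction of honest miners that end up mining on the pool's branch during a tie.

```python
import random
from fractions import Fraction


def choose_branch(branches, rng=random):
    """Pick the branch to mine on (hypothetical helper, not a real client API).

    The longest branch wins; if several branches share the maximum
    length, one of them is chosen uniformly at random, as the quoted
    proposal suggests.
    """
    best_length = max(len(branch) for branch in branches)
    tied = [branch for branch in branches if len(branch) == best_length]
    return rng.choice(tied)


def selfish_threshold(gamma):
    """Pool share above which selfish mining pays off, per the paper.

    gamma is the fraction of honest miners that mine on the pool's
    branch when two branches of equal length compete.
    """
    gamma = Fraction(gamma)
    return (1 - gamma) / (3 - 2 * gamma)


# With uniform random tie-breaking gamma is 1/2, which gives the
# threshold of 1/4 mentioned in the quote.
print(selfish_threshold(Fraction(1, 2)))  # → 1/4
```

For comparison, the same formula gives 1/3 when γ = 0 (nobody mines on the pool's branch during a tie) and 0 when γ = 1 (everybody does), which is why limiting the pool's influence over γ raises the bar.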