Instead of building a full copy of the CTransaction being signed and then modifying bits and pieces until it fits the form needed for computing the signature hash, use a wrapper serializer that serializes only the necessary parts on the fly.
This makes it easier to see which data is actually being hashed, reduces load on the heap, and marginally improves performance (around 3-4us/sigcheck here). The improvement is much larger for large transactions, though.
The old implementation of SignatureHash is moved to a unit test, which checks that the old and new algorithms produce the same value for randomly constructed transactions.
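
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code in this change: ToyTx, ToyInput, ToyHasher, hashByCopy, hashByStreaming and agreeOnRandomTxs are hypothetical names, FNV-1a stands in for the real hash, and only a simplified SIGHASH_ALL-like rule is modelled (every input script is blanked except the one being signed, which is replaced by the script code).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Toy stand-ins (assumption): NOT Bitcoin Core's CTransaction / CTxIn.
struct ToyInput { std::string script; uint32_t sequence; };
struct ToyTx { std::vector<ToyInput> inputs; uint32_t lockTime; };

// FNV-1a (64-bit) used as a placeholder for the real hash writer.
struct ToyHasher {
    uint64_t state = 14695981039346656037ULL;
    void write(const void* data, size_t len) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        for (size_t i = 0; i < len; ++i) { state ^= p[i]; state *= 1099511628211ULL; }
    }
};

// Write a 32-bit value in little-endian byte order.
void writeU32(ToyHasher& h, uint32_t v) {
    unsigned char b[4] = {static_cast<unsigned char>(v), static_cast<unsigned char>(v >> 8),
                          static_cast<unsigned char>(v >> 16), static_cast<unsigned char>(v >> 24)};
    h.write(b, sizeof(b));
}

// Write a length-prefixed byte string.
void writeString(ToyHasher& h, const std::string& s) {
    writeU32(h, static_cast<uint32_t>(s.size()));
    h.write(s.data(), s.size());
}

// Old style: copy the whole transaction, patch the copy, serialize the copy.
uint64_t hashByCopy(const ToyTx& tx, size_t nIn, const std::string& scriptCode) {
    assert(nIn < tx.inputs.size());
    ToyTx tmp = tx;                           // full copy on the heap
    for (auto& in : tmp.inputs) in.script.clear();
    tmp.inputs[nIn].script = scriptCode;
    ToyHasher h;
    writeU32(h, static_cast<uint32_t>(tmp.inputs.size()));
    for (const auto& in : tmp.inputs) { writeString(h, in.script); writeU32(h, in.sequence); }
    writeU32(h, tmp.lockTime);
    return h.state;
}

// New style: decide per field what to emit while serializing; no copy is made.
uint64_t hashByStreaming(const ToyTx& tx, size_t nIn, const std::string& scriptCode) {
    assert(nIn < tx.inputs.size());
    static const std::string empty;
    ToyHasher h;
    writeU32(h, static_cast<uint32_t>(tx.inputs.size()));
    for (size_t i = 0; i < tx.inputs.size(); ++i) {
        writeString(h, i == nIn ? scriptCode : empty);
        writeU32(h, tx.inputs[i].sequence);
    }
    writeU32(h, tx.lockTime);
    return h.state;
}

// Compare both approaches on randomly constructed toy transactions.
bool agreeOnRandomTxs(unsigned seed, int rounds) {
    std::mt19937 rng(seed);
    for (int r = 0; r < rounds; ++r) {
        ToyTx tx;
        const size_t n = 1 + rng() % 8;
        for (size_t i = 0; i < n; ++i) {
            const size_t len = rng() % 32;
            const char fill = static_cast<char>('a' + rng() % 26);
            const uint32_t seq = static_cast<uint32_t>(rng());
            tx.inputs.push_back({std::string(len, fill), seq});
        }
        tx.lockTime = static_cast<uint32_t>(rng());
        const size_t nIn = rng() % n;
        if (hashByCopy(tx, nIn, "code") != hashByStreaming(tx, nIn, "code")) return false;
    }
    return true;
}
```

Both functions feed the same byte stream to the hasher, so a call such as agreeOnRandomTxs(42, 1000) should return true. The streaming variant also makes the hashed fields visible in one place, which is the readability benefit described above.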